IRC log for #dataverse, 2019-01-16

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
02:54 jri joined #dataverse
04:54 jri joined #dataverse
05:54 jri joined #dataverse
08:10 jri joined #dataverse
09:33 jri_ joined #dataverse
11:20 poikilotherm joined #dataverse
11:21 poikilotherm Morning guys :-)
11:37 pdurbin morning poikilotherm
11:59 poikilotherm Hi pdurbin
12:00 poikilotherm Sorry for being quiet last week till today...
12:00 poikilotherm I was really busy preparing a talk about RSE at FZJ
12:01 poikilotherm Thus I hadn't much time left to spend on Dataverse :-)
12:02 pdurbin No worries. I should really write my talk for Open Science Days in Berlin.
12:02 poikilotherm Oh and guess what: the next two weeks I will be on "vacation". Demolition work and new roof for our new home.
12:03 pdurbin Sounds relaxing. ;)
12:03 poikilotherm Totally.
12:03 poikilotherm :-D
12:04 poikilotherm At least I have some savings... Don't need to pay for a gym... ;-)
12:05 pdurbin :)
12:08 pdurbin Still thinking about going to FOSDEM?
12:20 poikilotherm Yeah. A colleague of mine is going there and we talked about sharing a car
12:20 poikilotherm Yet there are no signs of an RSE track :-(
12:21 poikilotherm Rumors seemed to be just rumors :-(
12:23 pdurbin Bummer. I'm planning on meeting up with the Open Source Design folks in Berlin. Here's their plan for FOSDEM: https://discourse.opensourcedesign.net/t/fosdem-osd-panel/752
12:26 poikilotherm The container and java devroom sound promising, too
12:31 pdurbin Some day I'd like to go.
12:52 pdurbin poikilotherm: not sure if you saw Leonid's latest comment on your EJB timer proposal: https://github.com/IQSS/dataverse/issues/5345#issuecomment-452128171
12:59 poikilotherm Thx :-) I read it some days ago, but had no chance to answer yet.
13:08 pdurbin no rush
13:09 poikilotherm :-)
13:09 pdurbin That issue seems to represent the smallest step forward in an epic that appears to be otherwise fairly blocked.
13:26 poikilotherm Yeah.
13:26 poikilotherm I am wondering if I should add a batch job for this.
13:27 poikilotherm Leaning towards using proper management stuff for this instead of just async
13:28 poikilotherm Async methods running for days sounds a bit flaky to me.
13:33 pdurbin What do you mean by proper management stuff?
13:37 poikilotherm JSR 352 aka "JBatch"
13:38 poikilotherm Part of Java EE ;-)
13:38 poikilotherm Already used for other stuff in Dataverse
13:38 poikilotherm So already a dependency
13:39 pdurbin_m joined #dataverse
13:41 pdurbin_m poikilotherm: ok, so jobs and all that. Sounds fine. Leonid is out today but you could leave a comment with this idea. I thought maybe you wanted to introduce a queue system like RabbitMQ or whatever.
13:48 poikilotherm Oh dear, that would be overkill IMHO
13:54 pdurbin_m heh. ok
13:57 juancorr joined #dataverse
14:19 pdurbin poikilotherm: how well would JBatch work with multile glassfish servers?
14:37 pdurbin multiple*
14:57 poikilotherm Hmm haven't looked into that yet. Maybe that makes things even easier for load distribution. Good point, thx :-)
14:58 donsizemore joined #dataverse
15:22 pdurbin poikilotherm: I'm maintaining a list of multiple Glassfish server gotchas at http://guides.dataverse.org/en/4.10.1/installation/advanced.html#multiple-glassfish-servers
15:23 pdurbin Just added a new one about the draft logging feature (not released yet).
15:23 pdurbin donsizemore: if you're ears were burning I was just singing your praises to Mike as I showed him the main.yml file. Thank you!
15:23 pdurbin your*
15:54 poikilotherm pdurbin I just commented on the issue. I'm off for today now. Have a good day/night.
15:54 pdurbin bye!
15:55 pdurbin donsizemore: when you have a moment, I have more Ansible questions for you. :)
16:00 donsizemore @pdurbin knocking some stuff off my plate then including sample data for you
16:00 donsizemore @pdurbin (and making use of the queries from the wonderful metrics API!)
16:01 pdurbin Ah, I'm glad you like that new API. Is your plan to just run my "bird and trees" scripts for now?
16:02 donsizemore @pdurbin first i'm agreeing to a beer tasting friday after work. but you said nobody wants birds or trees... so... imaginary researchers and research projects?
16:03 pdurbin Sure, if it's not too much work for you.
16:03 pdurbin I would just suggest starting small.
16:05 pdurbin If you could make it configurable, that would be ideal. Different profiles of sample data, if that makes sense.
16:05 pdurbin And I would say no sample data by default.
16:19 donsizemore i'll leave it off by default and provide a sample json of each type, but further configuration may be best left to the individual =)
16:29 pameyer yeah - having test data creation mixed in with system config seems a little odd.  not a problem, but maybe counter-intuitive
16:33 pdurbin Well, there's a lot of demand to not have a "barren installation" when designers spin up Dataverse.
16:34 pdurbin It's basically the main reason why they don't see much value in using the EC2 scripts right now.
16:36 pameyer I'm not saying not to do it.  I'm saying mixing different conceptual steps in the dataverse-ec2-easy-button seems counter-intuitive to me
16:36 pameyer but I'm also not going to be using it, so that's not a problem
16:41 pdurbin I think we have to do it. And it's a good thing. :)
16:42 pdurbin We want the design team to do more vetting of pull requests.
16:42 pdurbin Keep the developers honest. Did they build what we asked them to build?
17:12 donsizemore joined #dataverse
17:22 donsizemore @pdurbin should i include reference_data.sql?
17:28 pdurbin donsizemore: I don't understand. The SWORD API, for example, doesn't work if you don't run that SQL script. It's required.
17:42 donsizemore ah, here it is in the role. i was pecking through the entries in your "post" script
17:45 pdurbin cool
18:23 bricas I'm getting a very unspecified error when I login as my dataverseAdmin account
18:23 bricas Internal Server Error - An unexpected error was encountered, no more information is available.
18:23 pdurbin a 500 error?
18:23 bricas ya
18:23 pdurbin anything in server.log?
18:24 bricas A system exception occurred during an invocation on EJB DatasetVersionServiceBean, method: public void edu.harvard.iq.dataverse.DatasetVersionServiceBean.populateDatasetSearchCard(edu.harvard.iq.dataverse.search.SolrSearchResult)
18:25 bricas hrmmmm
18:26 bricas java.lang.IllegalArgumentException: Failed to parse identifier: doi:/10.25545/T3R1S0
18:26 pdurbin Weird. What version of Dataverse are you running?
18:26 bricas 4.10.1
18:26 bricas latest and greatest
18:26 pdurbin Ok, did you upgrade recently? Today?
18:26 bricas last week
18:27 pdurbin 4.10 required a new Solr schema file. Is it in place?
18:27 bricas should be. will confirm.
18:27 pdurbin Ok, the release notes also say to reindex.
18:28 pdurbin That doi does look strange. There shouldn't be a slash after the colon. Which version of Dataverse did you upgrade from?
18:29 bricas 4.10
18:29 pdurbin Huh. That was a pretty small jump.
18:31 pdurbin Can you please email your server.log file to support@dataverse.org? Or at least the entire stack trace? And can you please mention in the email that you're running 4.10.1?
18:37 bricas schema is current btw
18:37 pdurbin good
18:37 bricas issue not limited to dataverseAdmin btw, another user is seeing it
18:38 pdurbin yuck
18:38 bricas i can login with my own account and it's fine though
18:38 pdurbin How long were you on 4.10?
18:39 bricas never tried then :/
18:39 pdurbin Ok. Were you on 4.9.4 before that? For a while with no issues?
18:44 bricas as far as i know -- typically i don't login :)
18:47 pdurbin Ok, so it could have been the change from 4.9.4. You already reindexed? The release notes say to reindex in place but you only have 8 datasets at https://dataverse.lib.unb.ca right? So you can probably go for the "clear and reindex" option at http://guides.dataverse.org/en/4.10.1/admin/solr-search-index.html which clears out more stuff.
18:48 bricas i re-indexed, but i'll try clear option.
18:49 bricas having an empty index lets the page display btw
18:49 bricas of course it says nothing exists
18:49 bricas but at least it's not an error
18:50 pdurbin :)
18:50 bricas error after re-index
18:51 pameyer which solr version are you on?
18:52 bricas 7.3.1
18:53 pdurbin no silver bullet. I'm starting to worry about your database. You could look at your dvobject table: http://phoenix.dataverse.org/schemaspy/latest/tables/dvobject.html ... especially protocol, authority, and identifier
18:54 pdurbin bricas: when you see the error, what's the URL in your browser? Something with mydata?
18:54 pdurbin (MyData is powered by Solr.)
18:54 bricas dataverse.xhtml
18:55 pdurbin Ok, so the homepage, not mydata. Hmm.
18:55 bricas myData loads fine actually
18:55 bricas dataverseuser.xhtml?selectTab=dataRelatedToMe
18:55 pdurbin yeah, data related to me is the old name for mydata
18:56 pdurbin Did you get a ticket number when you emailed the stacktrace to us?
18:56 bricas still haven't hit sent. sec.
18:56 pdurbin The line numbers in the stacktrace will help us see where exactly the IllegalArgumentException is being thrown.
18:57 pdurbin Any errors on page 2? https://dataverse.lib.unb.ca/?page=2
18:59 bricas nope
18:59 pdurbin ok, so it's one of the 10 items on page 1, but which one?
18:59 pdurbin pameyer: this is reminding me of the problem in production (Harvard Dataverse) during caga tio.
19:00 jri joined #dataverse
19:00 bricas pdurbin: #271322
19:00 pameyer pdurbin: could be
19:00 pameyer for what it's worth, I don't see any errors at https://dataverse.lib.unb.ca
19:00 pdurbin me neither
19:01 pameyer and it looks like it's got datasets and dataverses in it
19:01 bricas yeah, it's only for certain logged in users.
19:02 pameyer pdurbin: that one was something unexpected in postgres, right?
19:02 pameyer also only showed up for admin users when logged in
19:03 pdurbin pameyer: the difference is that the fix for us was to reindex the problematic dataset
19:03 pameyer pdurbin: implying that reindexing everything (for bricas) should've fixed it, if it was the same problem
19:03 pameyer did it need a db edit prior to reindex?
19:03 pdurbin pameyer: see https://iqss.slack.com/archives/G4CET790V/p1545241059376400
19:04 pdurbin no db edit needed
19:04 pdurbin might need a different fix for bricas
19:05 pameyer right
19:06 pdurbin I guess we know which dataset is having trouble. bricas can you navigate to https://dataverse.lib.unb.ca/dataset.xhtml?persistentId=doi:10.25545/T3R1S0 which I assume is unpublished?
19:07 bricas pdurbin: storage identifier seems odd in the dvobject table
19:07 bricas file:///10.25545/73EVQ0
19:07 bricas is the first entry for eg
19:07 bricas a number of ids similar to 1633ba1245a-3821dc99e33c also in there
19:09 pdurbin Here's where the exception is being thrown: https://github.com/IQSS/dataverse/blob/v4.10.1/src/main/java/edu/harvard/iq/dataverse/DatasetVersionServiceBean.java#L851
19:09 pdurbin called by https://github.com/IQSS/dataverse/blob/v4.10.1/src/main/java/edu/harvard/iq/dataverse/search/SearchIncludeFragment.java#L410
19:10 pdurbin bricas: can you navigate to that dataset.xhtml URL above?
19:10 bricas sec
19:11 bricas not found
19:11 pdurbin interesting
19:12 bricas file:///10.25545/T3R1S0
19:12 pdurbin and when you ran a "clear" solr was really empty? You can compare before and after with this script: https://github.com/IQSS/dataverse/blob/v4.10.1/scripts/search/query
19:14 bricas checking
19:14 bricas "response":{"numFound":0,"start":0,"docs":[]
19:15 pdurbin ok, quite clear
19:16 pdurbin Is T3R1S0 in your dvobject table? It would be under identifier.
19:17 bricas 10.25545/T3R1S0
19:17 bricas is there
19:17 bricas a number of the other rows don't have the prefix though
19:18 pdurbin yeah, the prefix shouldn't be there
19:19 pdurbin What do you have in the publicationdate column for that dataset? I assume it's unpublished.
19:20 bricas there were two entries with the prefix, so i fixed them both.
19:20 bricas no diff on re-index
19:21 pdurbin If you run that `query` script, you probably see T3R1S0 in the output. Right?
19:21 bricas blank pub date
19:22 bricas yep, i see it in the solr output
19:22 pdurbin Does it look weird in the Solr output, with the slash after the colon?
19:23 bricas https://nopaste.xyz/?a3e44472f295c95f#L7R3nJF5OTR8dBkg92hanSFsn9Ux3lzZrybS+50YdhI=
19:23 pdurbin Oh, also, what's the database id of the dataset? "id" in the dvobject table?
19:24 bricas 206
19:25 pdurbin Ok, can you try a trick like this to navigate to the broken dataset by database id? https://dataverse.harvard.edu/dataset.xhtml?id=3035124
19:26 bricas yep. i'm in.
19:26 pdurbin Can you delete it? :)
19:27 pdurbin The Solr output is crazy, by the way. Stuff like this: "dsPersistentId":"doi:/T3R1S0",
19:27 bricas ah-ha. the person who created this is also the other user having the display issue.
19:28 pdurbin This dataset feels sad and broken from my perspective. Perhaps a fresh one is in order. As long as the researcher hasn't shared the DOI around and it's not too much work to recreate.
19:29 bricas will ask.
19:30 pdurbin ok. did you compare the "authority" column in the dvobject table?
19:30 bricas it's blank
19:30 pdurbin for non-broken datasets, it's not blank, right?
19:31 bricas seems like it.
19:31 bricas perhaps i can just add the prefix there
19:32 pdurbin worth a shot
19:33 bricas bingo
19:33 bricas had to fix the other bad entry too
19:33 pdurbin Dataset surgery like this shouldn't be necessary. I'm still wondering what happened.
19:34 bricas the other entry was a dataset created today
19:34 bricas by a different person
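
The check described above (dataset rows whose "authority" column is blank) can be sketched as a query. This is a minimal illustration only: it uses an in-memory sqlite3 table as a stand-in for the real database, models just the columns mentioned in the log (id, protocol, authority, identifier), assumes a "dtype" column for the "Dataset" value seen in the pasted row, and includes one hypothetical healthy row (id 101) for contrast.

```python
import sqlite3

# In-memory stand-in for the real database; not the actual dvobject schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dvobject (id INTEGER PRIMARY KEY, dtype TEXT, "
    "protocol TEXT, authority TEXT, identifier TEXT)"
)
conn.executemany(
    "INSERT INTO dvobject VALUES (?, ?, ?, ?, ?)",
    [
        (101, "Dataset", "doi", "10.25545", "AAAAAA"),    # hypothetical healthy row
        (206, "Dataset", "doi", "", "10.25545/T3R1S0"),   # blank authority, prefix in identifier
        (209, "Dataset", "doi", None, "10.25545/PDBZ7V"), # same pattern, NULL authority
    ],
)


def find_blank_authority(connection):
    """Return (id, identifier) for dataset rows with no authority recorded."""
    return connection.execute(
        "SELECT id, identifier FROM dvobject "
        "WHERE dtype = 'Dataset' AND (authority IS NULL OR authority = '') "
        "ORDER BY id"
    ).fetchall()


for row_id, identifier in find_blank_authority(conn):
    print(row_id, identifier)
```

A read-only query like this only locates the affected rows; as noted in the conversation, editing them by hand should not normally be necessary.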
19:35 pdurbin What if you create a new dataset right now? Same problem?
19:35 bricas will try. sec.
19:38 bricas so it creates, but goes to a page not found error
19:38 bricas dataset.xhtml?persistentId=doi%3A%2F10.25545%2FPDBZ7V&version=DRAFT
19:41 bricas 209 | Dataset   | 2019-01-16 15:38:25.481 | 2019-01-16 15:38:25.571 | 2019-01-16 15:38:25.481 | 2019-01-16 15:38:25.64  | 2019-01-16 15:38:25.481    | f                     |                         |          1 |        2 |                | file:///10.25545/PDBZ7V   |           |                         | f                    | 10.25545/PDBZ7V | doi
19:41 pdurbin no error on the demo site. I get " Success! – This dataset has been created " after creating one and landing at https://demo.dataverse.org/dataset.xhtml?persistentId=doi%3A10.5072%2FFK2%2FTQMT8Y&version=DRAFT
19:42 pdurbin bricas: but maybe I'm doing something different. I'd be very curious to hear if you can reproduce the bug on the demo site, which is also running Dataverse 4.10.1.
19:44 bricas nope. can't reproduce there
19:44 pdurbin Ok, so what's different about your installation vs demo?
19:44 pameyer which provider is demo using?
19:44 bricas i'm seeing an extra slash in our doi
19:45 bricas Cassidy, Brian, 2019, "Test Dataverse for Brian", https://doi.org//10.25545/PDBZ7V, UNB, DRAFT VERSION
19:45 bricas is this a config issue
19:45 pdurbin I don't know. Possibly?
19:46 pdurbin I'm seeing 'Please note that the authority cannot have a slash (“/”) in it' at http://guides.dataverse.org/en/4.10.1/installation/config.html#authority
19:48 pdurbin bricas: does yours have a slash? I'm looking at https://github.com/IQSS/dataverse/issues/4718 and https://github.com/IQSS/dataverse/issues/5057
19:49 bricas {"status":"ERROR","message":"Setting Authority not found"}⏎
19:50 pdurbin that's no good
19:50 pdurbin when I run curl http://localhost:8080/api/admin/settings/:Authority
19:50 pdurbin I get {"status":"OK","data":{"message":"10.5072"}}
19:51 bricas added one. will re-try my dataset.
19:51 pameyer yeah - `:Authority` , not `Authority`
19:51 pdurbin pameyer: I had the same thought but try it. It's weird.
19:51 bricas i did it with the :
19:51 bricas anyway, it's there now -- retesting!
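
The two response bodies quoted above (one "OK", one "ERROR") can be told apart with a few lines of parsing. This is a minimal sketch, assuming the response body has already been fetched as text (for example with the curl command shown in the log); the helper name read_setting is hypothetical and not part of Dataverse.

```python
import json


def read_setting(body):
    """Return the setting value from a settings response body, or None if it is not set."""
    payload = json.loads(body)
    if payload.get("status") != "OK":
        # e.g. {"status":"ERROR","message":"Setting Authority not found"}
        return None
    return payload["data"]["message"]


print(read_setting('{"status":"OK","data":{"message":"10.5072"}}'))
print(read_setting('{"status":"ERROR","message":"Setting Authority not found"}'))
```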
19:53 bricas presto change-o. works.
19:53 bricas except! now i get: https://doi.org/10.25545/10.25545/8MVOXK,
19:54 pdurbin yuck
19:54 bricas shoulder: {"status":"OK","data":{"message":"10.25545/"}}⏎
19:55 pdurbin good find
19:57 bricas i've no clue what that is :)
19:58 pdurbin bricas: I believe you're suffering from https://github.com/IQSS/dataverse/issues/5057 . See also the linked post which is probably a little more human readable: https://groups.google.com/d/msg/dataverse-community/Ivh2137nESc/6-kiMU7GBwAJ
20:00 bricas yep. exactly.
20:00 bricas shoulder deleted.
20:00 pdurbin Once things are sorted, can you please leave a comment on that issue?
20:01 bricas on 5057?
20:01 pdurbin yes, please
20:02 pdurbin please feel free to warm up that google groups thread as well
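
The malformed identifiers seen in this session are consistent with the persistent ID being assembled as protocol, authority, shoulder, and a generated suffix. The sketch below is an illustration of that reading, not Dataverse's actual code: the function names are hypothetical, and the joining rule is inferred from the identifiers quoted in the log.

```python
def compose_persistent_id(protocol, authority, shoulder, suffix):
    """Join the parts the way the identifiers in the log appear to be built (assumed rule)."""
    return f"{protocol}:{authority}/{shoulder}{suffix}"


def check_settings(authority, shoulder):
    """Flag the two misconfigurations that came up in the conversation."""
    problems = []
    if not authority:
        problems.append("authority is not set")
    elif "/" in authority:
        problems.append("authority must not contain a slash")
    if authority and shoulder.startswith(authority):
        problems.append("shoulder repeats the authority")
    return problems


# Authority missing, shoulder set to "10.25545/": slash right after the colon.
print(compose_persistent_id("doi", "", "10.25545/", "PDBZ7V"))
# Authority added, shoulder unchanged: the prefix appears twice.
print(compose_persistent_id("doi", "10.25545", "10.25545/", "8MVOXK"))
# Shoulder deleted: a well-formed identifier.
print(compose_persistent_id("doi", "10.25545", "", "8MVOXK"))
print(check_settings("10.25545", "10.25545/"))
```

Under this reading, the first state explains "doi:/10.25545/..." and the second explains "10.25545/10.25545/...".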
20:09 bricas as an aside, i had to make sure our template was xml compliant for an actual error message to show up :)
20:10 pdurbin huh, ok :)
20:14 pdurbin I guess it's because Dataverse is a JSF app and JSF requires XHTML? Not sure.
20:24 bricas *shrug*
20:25 pdurbin :)
21:52 jri joined #dataverse
22:52 jri joined #dataverse
23:52 jri joined #dataverse
