IRC log for #dataverse, 2018-04-17

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:08 donsizemore joined #dataverse
04:02 tyrel joined #dataverse
07:00 jri joined #dataverse
07:03 jri joined #dataverse
10:45 donsizemore joined #dataverse
11:26 donsizemore joined #dataverse
11:30 pdurbin donsizemore: mornin. Finger on the trigger?
11:31 donsizemore @pdurbin hay hay. i'm all set. mandy's going to sit with me during the upgrade so i'm waiting for her
11:31 pdurbin No one else is awake this early.
11:32 donsizemore if i were doing it myself i'd have started around 0530 at my coffee table. i can no longer sleep in my old age
11:33 donsizemore i also meant to ask them if they had notified our users. nothing like a surprise upgrade ;)
11:33 pdurbin the best kind
11:35 pdurbin Where you put "Welcome to the UNC Dataverse" in orange is where we put upgrade notices.
11:35 donsizemore i just have a "check back soon" 503 page
11:35 pdurbin You can make it clickable with a more detailed message in a popup.
11:35 donsizemore i'll point out your notices to Mandy. poop we should've used that
12:30 donsizemore joined #dataverse
12:31 donsizemore @pdurbin and i'm glad i didn't start the upgrade early. the heating system triggered the fire alarm in our building!
12:44 pdurbin_m joined #dataverse
12:44 pdurbin_m saved by the bell
13:17 donsizemore @pdurbin we're on 4.8.6 =)
13:52 cdsp-rmo joined #dataverse
14:02 cdsp-rmo hello world :)
14:06 cdsp-rmo I'm looking for someone who knows if it IS possible to import xml ddi files into dataverse or not. I've seen subjects about it on the google group, but I don't understand the result of these discussions. It seems some people wanted to write a script to do it, but there is no more information on that ... I've also seen a way to do it with dataverse, something with dataverse 3 to 4 exports, but I can't figure out how to make it work (I simple get a
14:07 cdsp-rmo Given that I've found nothing, I intend to do my own ddi to dataverse import script, but before that, I'd like to know if I'm doing this for "nothing" :S
14:08 pdurbin cdsp-rmo: importing DDI XML is the supported method for migrating from DVN 3 to Dataverse 4.
14:09 pdurbin One of your messages got cut off at "I simple get a"
14:10 cdsp-rmo *I simply get a request answer like "WORKFLOW_IN_PROGRESS", and nothing happening
14:11 pdurbin hmm
14:11 cdsp-rmo hum ... maybe there is something I do wrong here :S
14:13 cdsp-rmo I used an api function, api/batch/migrate thingy
14:13 pdurbin cdsp-rmo: this commit may or may not be helpful or interesting: https://github.com/IQSS/dataverse/commit/b8090f0
14:14 pdurbin It's where I got my old "roundTripDdi" method to be not completely broken.
14:14 cdsp-rmo yes, the api I used is in there
14:15 cdsp-rmo it wasn't an apitoken or parentDataverse problem
14:15 pdurbin ok
14:15 cdsp-rmo I suspect the path given for files
14:15 pdurbin Can we back up and talk more about what your goal is?
14:15 cdsp-rmo but the "response" is not very helpful
14:15 cdsp-rmo yep !
14:16 cdsp-rmo I'm looking to import datasets in dataverse, with NESSTAR ddi export files
14:16 pdurbin ah, ok
14:18 pdurbin wow, there is a super long thread on this topic at https://groups.google.com/d/msg/dataverse-community/qsY8swD9Hh0/dnYi61g1AAAJ
14:18 pdurbin 38 posts on that thread
14:18 cdsp-rmo yes, I've read this
14:18 pdurbin better you than me ;)
14:18 cdsp-rmo the thing is, they talk about writing some "scripts" to do their import
14:19 cdsp-rmo so I guessed there was a problem for importing into dataverse directly
14:19 cdsp-rmo but it was in 2017, and the 2 "newer" posts don't mention that
14:20 pdurbin I'm not sure what their scripts do. Once the XML file is ready it should just be a curl command.
14:20 cdsp-rmo ah
14:20 cdsp-rmo I've seen you gave them json samples of datasets
14:21 cdsp-rmo I guessed they wanted to convert their xml to json then import to dataverse
14:21 pdurbin oh
14:21 cdsp-rmo I think I missed the xml import thing on dataverse
14:22 pdurbin again, XML is the supported way to migrate data into Dataverse
14:22 pdurbin and check this out
14:22 pdurbin I'm expecting "WORKFLOW_IN_PROGRESS" in my test: https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/BatchImportIT.java#L71
14:22 cdsp-rmo hum
14:22 pdurbin cdsp-rmo: but you're saying you see "WORKFLOW_IN_PROGRESS" but then nothing happens?
14:23 cdsp-rmo yes
14:23 cdsp-rmo if the api token is wrong, I do have an error
14:23 cdsp-rmo if the parent dataverse is wrong, I also do have an error
14:23 pdurbin It looks like I then sleep for a couple seconds and then export the dataset I just imported.
14:23 cdsp-rmo for the file path, I can write anything, I always get Workflow_in_progress
14:24 cdsp-rmo I've seen a doc on this api, can't find it anymore
14:24 cdsp-rmo the process for exporting dataverse 3 to 4
14:24 pdurbin then I check to make sure the title of the exported dataset (the one I just created by migrating it) matches the title in the DDI XML I used to migrate it.
14:25 pdurbin exportDatasetAsDdi.then().assertThat()
14:25 pdurbin .body("codeBook.docDscr.citation.titlStmt.titl", CoreMatchers.equalTo("Black Professional Women, 1969"))
14:26 pdurbin so it should work
14:26 pdurbin but I'm not sure if this API is actually documented
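
The title check pdurbin describes can be approximated outside the Java test suite. The following is a minimal Python sketch, not code from the Dataverse repository: `ddi_title` and `SAMPLE_DDI` are hypothetical names, and the sample document is a stripped-down stand-in that contains only the element path quoted above (codeBook.docDscr.citation.titlStmt.titl).

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed DDI document used only for illustration.
SAMPLE_DDI = """<codeBook>
  <docDscr>
    <citation>
      <titlStmt>
        <titl>Black Professional Women, 1969</titl>
      </titlStmt>
    </citation>
  </docDscr>
</codeBook>"""


def ddi_title(ddi_xml):
    """Return the text at codeBook/docDscr/citation/titlStmt/titl, or None."""
    root = ET.fromstring(ddi_xml)
    # "{*}" matches the element in any namespace or in none (Python 3.8+),
    # so the lookup also works when the export declares a default namespace.
    node = root.find("{*}docDscr/{*}citation/{*}titlStmt/{*}titl")
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

Comparing `ddi_title(original_xml)` with `ddi_title(exported_xml)` gives a rough equivalent of the round-trip assertion in the test, assuming both documents carry the title at that path.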
14:31 cdsp-rmo joined #dataverse
14:31 cdsp-rmo sorry, power failure :(
14:32 pdurbin !
14:32 pdurbin I assumed it was this memory leak in Shout: https://github.com/IQSS/chat.dataverse.org/issues/3 :)
14:34 cdsp-rmo sorry what ?
14:34 cdsp-rmo :D
14:34 cdsp-rmo we really had a power failure ^^
14:34 pdurbin yikes
14:35 cdsp-rmo ah nevermind, I misunderstood
14:35 cdsp-rmo sorry :D
14:35 pdurbin no worries
14:35 pdurbin did you see what I was saying about the title?
14:35 cdsp-rmo yes !
14:35 pdurbin good
14:35 pdurbin my point is that something is working from my tests
14:35 pdurbin the dataset is imported at least
14:36 pdurbin able to be exported
14:36 pdurbin with the same title :)
14:37 pdurbin cdsp-rmo: if you can crack this nut you should definitely start a new thread on the google group saying how you got nesstar datasets imported into Dataverse. There seems to be a lot of interest in this.
14:39 cdsp-rmo well, in a kinda ugly way, with python scripts and converting ddi xml to json dataset format
14:40 cdsp-rmo but yes, that may interest some people
14:40 pdurbin oh, I thought you were using XML
14:40 cdsp-rmo well, I tried actually
14:40 cdsp-rmo but nothing got imported, that's the thing
14:40 cdsp-rmo so I started to do my own script to do so
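
For readers curious what the XML-to-JSON route mentioned here might involve, below is a minimal Python sketch that wraps a title in the shape of Dataverse's native dataset JSON. It is not cdsp-rmo's script. `native_json_for_title` and `dataset_json_text` are hypothetical helpers, the field layout is assumed to follow Dataverse's sample dataset JSON, and a real dataset would need further citation fields (author, contact, description, subject) before it could be created.

```python
import json


def native_json_for_title(title):
    """Build a minimal native-JSON skeleton that carries only the dataset title."""
    title_field = {
        "typeName": "title",
        "typeClass": "primitive",
        "multiple": False,
        "value": title,
    }
    # Only the citation block is filled in; other required fields are omitted.
    return {
        "datasetVersion": {
            "metadataBlocks": {
                "citation": {"fields": [title_field]}
            }
        }
    }


def dataset_json_text(title):
    """Serialize the skeleton as indented JSON text, ready to save to a file."""
    return json.dumps(native_json_for_title(title), indent=2)
```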
14:41 pdurbin huh, I wonder why my "import with XML" test worked
14:41 cdsp-rmo I think it's related to the path to the files
14:41 cdsp-rmo I'm trying to figure out what this path "should" be
14:41 pdurbin ok
14:42 cdsp-rmo (but the strange thing is that there is no "errors" when the path is wrong)
14:42 pdurbin :(
14:42 pdurbin Would you rather import using DDI XML or (Dataverse native) JSON?
14:43 cdsp-rmo given that I have DDI XML right now, it would be DDI
14:43 pdurbin ok, again, should work
14:43 pdurbin any weird errors in server.log?
14:43 cdsp-rmo it seems not
14:43 cdsp-rmo but I'm not the admin, so I can't access it again right now
14:44 pdurbin I'd suggest creating an example DDI XML file you don't mind sharing and attaching it to a new GitHub issue you create.
14:44 cdsp-rmo okay
14:45 pdurbin If you mention NESSTAR in the issue title, people will be extra interested. :)
14:45 cdsp-rmo ^^
14:49 cdsp-rmo ah, I've found the doc I used
14:49 cdsp-rmo https://github.com/IQSS/dataverse/blob/develop/scripts/migration/migration_instructions.txt#L31
14:49 pdurbin thanks
14:49 cdsp-rmo and it was said that "the status of the job is viewable in the import-log file"
14:49 cdsp-rmo file we couldn't find
14:50 cdsp-rmo :(
14:50 pdurbin oh! one sec
14:50 * cdsp-rmo is not moving a bit
14:51 pdurbin I'm looking around in glassfish4/glassfish/domains/domain1/logs
14:51 pdurbin validationLog2018...
14:51 pdurbin cleanupLog2018
14:55 pdurbin cdsp-rmo: do you have any files that begin with that? validation or cleanup?
14:56 pdurbin The validationLog is created here, for example: https://github.com/IQSS/dataverse/blob/v4.8.6/src/main/java/edu/harvard/iq/dataverse/api/BatchServiceBean.java#L50
14:56 cdsp-rmo I have to ask that, I don't have the permissions for that
14:58 pdurbin ok
14:58 pdurbin here's an example validation log entry on my laptop: Import Exception processing file batchImportDv/version1.xml, msg:VersionNumber 1 already exists in dataset hdl:1902.1/00012
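
Locating these files can be scripted. This is a minimal Python sketch, assuming the two file-name prefixes mentioned in the chat (validationLog and cleanupLog). `find_import_logs` is a hypothetical helper, and the exact file names and extensions on a given server may differ.

```python
from pathlib import Path

# Prefixes taken from the file names quoted in the conversation.
LOG_PREFIXES = ("validationLog", "cleanupLog")


def find_import_logs(log_dir):
    """Return (path, size_in_bytes) for each import-related log, sorted by name."""
    found = []
    for path in sorted(Path(log_dir).iterdir()):
        # str.startswith accepts a tuple, so both prefixes are checked at once.
        if path.is_file() and path.name.startswith(LOG_PREFIXES):
            found.append((path, path.stat().st_size))
    return found
```

Calling `find_import_logs("glassfish4/glassfish/domains/domain1/logs")` on the server would list each matching file with its size, which makes empty logs easy to spot.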
14:59 cdsp-rmo what was the curl used ?
14:59 cdsp-rmo and where was the file, and which path was given ?
14:59 pdurbin all of the cleanupLog files on my laptop are empty
15:00 cdsp-rmo :S
15:00 pdurbin I probably wasn't using curl. I was probably running "UtilIT.migrateDataset": https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/BatchImportIT.java#L68
15:01 cdsp-rmo oh right
15:02 pdurbin here's migrateDataset: https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java#L316
15:07 cdsp-rmo I ended up here
15:07 cdsp-rmo https://github.com/IQSS/dataverse/blob/ffc91db0280ba9cac21d97db15d3a2dab8acc085/src/main/java/edu/harvard/iq/dataverse/api/BatchImport.java
15:08 cdsp-rmo ah, it seems something happened
15:08 pdurbin \o/
15:08 cdsp-rmo we have some cleanupLog and validationLog files
15:09 pdurbin jri is giving you access? :)
15:09 cdsp-rmo yep
15:09 pdurbin jri++
15:09 cdsp-rmo Unexpected Error in handleFile(), file:temp/export.xmlmessage: temp/export.xml, caused by: null at line:
15:09 pdurbin huh, maybe the path is wrong?
15:10 pdurbin that's just a first thought
15:10 cdsp-rmo right now, it's completely right because we removed the test files from the server
15:10 cdsp-rmo ^^
15:10 pdurbin ah, ok
15:10 pdurbin standup in 2 minutes
15:10 cdsp-rmo gonna try some things to match with a "good" path
15:11 cdsp-rmo hope it will be good
15:14 pdurbin me too. no standup today, it turns out
15:17 pameyer joined #dataverse
15:24 cdsp-rmo the hardest thing is to understand what is the root of all of this
15:24 cdsp-rmo the basedir for the path
15:24 cdsp-rmo :(
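
One way to narrow down the base directory question is to test a relative path against several candidate directories on the server. This is only a sketch: `existing_candidates` is a hypothetical helper, and which directory the batch import actually resolves relative paths against is not established in this conversation.

```python
from pathlib import Path


def existing_candidates(path, base_dirs):
    """Return every resolved location of `path` that exists on disk.

    Absolute paths are checked as-is; relative paths are joined to each
    base directory in turn.
    """
    target = Path(path)
    if target.is_absolute():
        return [target] if target.exists() else []
    hits = []
    for base in base_dirs:
        candidate = Path(base) / target
        if candidate.exists():
            hits.append(candidate)
    return hits
```

Passing a path such as "temp/export.xml" with a list of guessed base directories returns only the combinations that exist, so an empty result suggests none of the guesses matches where the file actually lives.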
15:25 pdurbin :(
15:26 pdurbin cdsp-rmo: I think you should create a GitHub issue so we can at least document what to do.
15:26 pdurbin It's so confusing.
15:26 cdsp-rmo yes, sorry
15:27 cdsp-rmo I'm so much into it I forgot how strange the subject is ^^
15:47 donsizemore @pdurbin tell leonid i've nearly completed re-indexing odum's production dataverse in place and i'm only using 11G of active memory!
15:51 cdsp-rmo posted my issue here : https://github.com/IQSS/dataverse/issues/4593
15:51 cdsp-rmo shall I mention our chat and link the irc log too ?
15:58 pdurbin cdsp-rmo: please do!
15:58 pdurbin donsizemore: awesome. He's on vacation though.
16:19 cdsp-rmo well, we did some tests, and this time we have nothing in the log files
16:19 cdsp-rmo and nothing on the dataverse
16:19 cdsp-rmo :(
16:20 cdsp-rmo gotta go, so I stop the tests, but it's really strange
16:23 cdsp-rmo have a nice day ! o/
16:39 jri joined #dataverse
16:53 donsizemore joined #dataverse
16:57 pameyer joined #dataverse
17:00 pameyer cdsp-rmo: a few months ago, I took a look into importing datasets from ddi xml as part of some migration work.  For what it's worth, I wasn't able to get that approach working.
17:01 pameyer since the datasets I was working on migrating weren't in ddi/xml, I switched approaches
17:02 pameyer donsizemore: congrats on 4.8.6
17:10 pdurbin jri: you'll have to thank rmo for opening https://github.com/IQSS/dataverse/issues/4593 for me
18:00 pameyer joined #dataverse
18:23 donsizemore joined #dataverse
19:50 jri joined #dataverse
20:29 pameyer joined #dataverse
20:30 pdurbin pameyer: when you have a moment, please look at https://github.com/IQSS/dataverse/issues/4396
20:32 pameyer pdurbin: will do
20:32 pdurbin thanks!
23:28 jri joined #dataverse
