Time
S
Nick
Message
00:08
donsizemore joined #dataverse
04:02
tyrel joined #dataverse
07:00
jri joined #dataverse
07:03
jri joined #dataverse
10:45
donsizemore joined #dataverse
11:26
donsizemore joined #dataverse
11:30
pdurbin
donsizemore: mornin. Finger on the trigger?
11:31
donsizemore
@pdurbin hay hay. i'm all set. mandy's going to sit with me during the upgrade so i'm waiting for her
11:31
pdurbin
No one else is awake this early.
11:32
donsizemore
if i were doing it myself i'd have started around 0530 at my coffee table. i can no longer sleep in my old age
11:33
donsizemore
i also meant to ask them if they had notified our users. nothing like a surprise upgrade ;)
11:33
pdurbin
the best kind
11:35
pdurbin
Where you put "Welcome to the UNC Dataverse" in orange is where we put upgrade notices.
11:35
donsizemore
i just have a "check back soon" 503 page
11:35
pdurbin
You can make it clickable with a more detailed message in a popup.
11:35
donsizemore
i'll point out your notices to Mandy. poop we should've used that
12:30
donsizemore joined #dataverse
12:31
donsizemore
@pdurbin and i'm glad i didn't start the upgrade early. the heating system triggered the fire alarm in our building!
12:44
pdurbin_m joined #dataverse
12:44
pdurbin_m
saved by the bell
13:17
donsizemore
@pdurbin we're on 4.8.6 =)
13:52
cdsp-rmo joined #dataverse
14:02
cdsp-rmo
hello world :)
14:06
cdsp-rmo
I'm looking for someone who knows if it IS possible to import xml ddi files into dataverse or not. I've seen subjects about it on the google group, but I don't understand the outcome of these discussions. It seems some people wanted to write a script to do it, but there is no more information on that ... I've also seen a way to do it with dataverse, something with dataverse 3 to 4 exports, but I can't figure out how to make it work (I simple get a
14:07
cdsp-rmo
Given that I've found nothing, I intend to do my own ddi to dataverse import script, but before that, I'd like to know if I'm doing this for "nothing" :S
14:08
pdurbin
cdsp-rmo: importing DDI XML is the supported method for migrating from DVN 3 to Dataverse 4.
14:09
pdurbin
One of your messages got cut off at "I simple get a"
14:10
cdsp-rmo
*I simply get a request answer like "WORKFLOW_IN_PROGRESS", and nothing happening
14:11
pdurbin
hmm
14:11
cdsp-rmo
hum ... maybe there is something I do wrong here :S
14:13
cdsp-rmo
I used an api function, the api/batch/migrate thingy
14:13
pdurbin
cdsp-rmo: this commit may or may not be helpful or interesting: https://github.com/IQSS/dataverse/commit/b8090f0
14:14
pdurbin
It's where I got my old "roundTripDdi" method to be not completely broken.
14:14
cdsp-rmo
yes, the api I used in in there
14:14
cdsp-rmo
*is
14:15
cdsp-rmo
it wasn't an apitoken or parentDataverse problem
14:15
pdurbin
ok
14:15
cdsp-rmo
I suspect the path given for files
14:15
pdurbin
Can we back up and talk more about what your goal is?
14:15
cdsp-rmo
but the "response" is not very helpful
14:15
cdsp-rmo
yep !
14:16
cdsp-rmo
I'm looking to import datasets in dataverse, with NESSTAR ddi export files
14:16
pdurbin
ah, ok
14:18
pdurbin
wow, there is a super long thread on this topic at https://groups.google.com/d/msg/dataverse-community/qsY8swD9Hh0/dnYi61g1AAAJ
14:18
pdurbin
38 posts on that thread
14:18
cdsp-rmo
yes, I've read this
14:18
pdurbin
better you than me ;)
14:18
cdsp-rmo
the thing is, they talk about writing some "scripts" to do their import
14:19
cdsp-rmo
so I guessed there was a problem for importing into dataverse directly
14:19
cdsp-rmo
but it was in 2017, and the 2 "newer" posts don't mention that
14:20
pdurbin
I'm not sure what their scripts do. Once the XML file is ready it should just be a curl command.
14:20
cdsp-rmo
ah
14:20
cdsp-rmo
I've seen you gave them json samples of datasets
14:21
cdsp-rmo
I guessed they wanted to convert their xml to json then import to dataverse
14:21
pdurbin
oh
14:21
cdsp-rmo
I think I missed the xml import thing on dataverse
14:22
pdurbin
again, XML is the supported way to migrate data into Dataverse
14:22
pdurbin
and check this out
14:22
pdurbin
I'm expecting "WORKFLOW_IN_PROGRESS" in my test: https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/BatchImportIT.java#L71
14:22
cdsp-rmo
hum
14:22
pdurbin
cdsp-rmo: but you're saying you see "WORKFLOW_IN_PROGRESS" but then nothing happens?
14:23
cdsp-rmo
yes
14:23
cdsp-rmo
if the api token is wrong, I do have an error
14:23
cdsp-rmo
if the parent dataverse is wrong, I also do have an error
14:23
pdurbin
It looks like I then sleep for a couple seconds and then export the dataset I just imported.
14:23
cdsp-rmo
for the file path, I can write anything, I always get "WORKFLOW_IN_PROGRESS"
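For reference, the request under discussion can be sketched in Python. This is a minimal sketch, not the documented API: the endpoint `api/batch/migrate` comes from the chat, while the query parameter names `path`, `dv`, and `key` are assumptions based on the `BatchImport.java` resource linked in this log and should be checked against the running version. `build_migrate_url` and `start_migration` are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_migrate_url(base_url, path, parent_alias, api_token):
    """Build the batch migrate URL.

    Assumed parameter names (verify against BatchImport.java):
      path - file or directory of DDI XML, as seen by the server
      dv   - alias of the parent dataverse
      key  - API token of the importing user
    """
    query = urlencode({"path": path, "dv": parent_alias, "key": api_token})
    return f"{base_url.rstrip('/')}/api/batch/migrate?{query}"


def start_migration(base_url, path, parent_alias, api_token):
    """Send the GET request and return the raw response body as text."""
    url = build_migrate_url(base_url, path, parent_alias, api_token)
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")
```

As reported in the chat, a "WORKFLOW_IN_PROGRESS" response came back even when the path was wrong, so the response body alone does not confirm that anything was imported.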
14:24
cdsp-rmo
I've seen a doc on this api, can't find it anymore
14:24
cdsp-rmo
the process for exporting dataverse 3 to 4
14:24
pdurbin
then I check to make sure the title of the exported dataset (the one I just created by migrating it) matches the title in the DDI XML I used to migrate it.
14:25
pdurbin
exportDatasetAsDdi.then().assertThat()
14:25
pdurbin
.body("codeBook.docDscr.citation.titlStmt.titl", CoreMatchers.equalTo("Black Professional Women, 1969"))
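The same title check can be sketched outside the Java test suite. This is a minimal Python sketch, assuming the exported DDI has already been saved as text; the element path `codeBook/docDscr/citation/titlStmt/titl` is taken from the assertion in the chat, while `ddi_title` and `find_by_local_path` are hypothetical helper names. Namespaces are ignored on purpose, since the exact DDI namespace of a given export is an assumption here.

```python
import xml.etree.ElementTree as ET

TITLE_PATH = ["codeBook", "docDscr", "citation", "titlStmt", "titl"]


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag, if present."""
    return tag.rsplit("}", 1)[-1]


def find_by_local_path(root, path):
    """Follow a list of local element names starting at the root element."""
    if local_name(root.tag) != path[0]:
        return None
    node = root
    for name in path[1:]:
        node = next((c for c in node if local_name(c.tag) == name), None)
        if node is None:
            return None
    return node


def ddi_title(xml_text):
    """Return the title text from a DDI codebook, or None if it is missing."""
    node = find_by_local_path(ET.fromstring(xml_text), TITLE_PATH)
    if node is None or node.text is None:
        return None
    return node.text.strip()
```

Comparing `ddi_title(original_xml)` with `ddi_title(exported_xml)` gives the same kind of round-trip check the test describes.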
14:26
pdurbin
so it should work
14:26
pdurbin
but I'm not sure if this API is actually documented
14:31
cdsp-rmo joined #dataverse
14:31
cdsp-rmo
sorry, power failure :(
14:32
pdurbin
!
14:32
pdurbin
I assumed it was this memory leak in Shout: https://github.com/IQSS/chat.dataverse.org/issues/3 :)
14:34
cdsp-rmo
sorry what ?
14:34
cdsp-rmo
:D
14:34
cdsp-rmo
we really had a power failure ^^
14:34
pdurbin
yikes
14:35
cdsp-rmo
ah nevermind, I misunderstood
14:35
cdsp-rmo
sorry :D
14:35
pdurbin
no worries
14:35
pdurbin
did you see what I was saying about the title?
14:35
cdsp-rmo
yes !
14:35
pdurbin
good
14:35
pdurbin
my point is that something is working from my tests
14:35
pdurbin
the dataset is imported at least
14:36
pdurbin
able to be exported
14:36
pdurbin
with the same title :)
14:37
pdurbin
cdsp-rmo: if you can crack this nut you should definitely start a new thread on the google group saying how you got nesstar datasets imported into Dataverse. There seems to be a lot of interest in this.
14:39
cdsp-rmo
well, in a kinda ugly way, with python scripts and converting ddi xml to json dataset format
14:40
cdsp-rmo
but yes, that may interest some people
14:40
pdurbin
oh, I thought you were using XML
14:40
cdsp-rmo
well, I tried actually
14:40
cdsp-rmo
but nothing got imported, that's the thing
14:40
cdsp-rmo
so I started to do my own script to do so
14:41
pdurbin
huh, I wonder why my "import with XML" test worked
14:41
cdsp-rmo
I think it's related to the path to the files
14:41
cdsp-rmo
I'm trying to figure out what this path "should" be
14:41
pdurbin
ok
14:42
cdsp-rmo
(but the strange thing is that there is no "errors" when the path is wrong)
14:42
pdurbin
:(
14:42
pdurbin
Would you rather import using DDI XML or (Dataverse native) JSON ?
14:43
cdsp-rmo
given that I have DDI XML right now, it would be DDI
14:43
pdurbin
ok, again, should work
14:43
pdurbin
any weird errors in server.log?
14:43
cdsp-rmo
it seems not
14:43
cdsp-rmo
but I'm not the admin, so I can't access it again right now
14:44
pdurbin
I'd suggest creating an example DDI XML file you don't mind sharing and attaching it to a new GitHub issue you create.
14:44
cdsp-rmo
okay
14:45
pdurbin
If you mention NESSTAR in the issue title, people will be extra interested. :)
14:45
cdsp-rmo
^^
14:49
cdsp-rmo
ah, I've found the doc I used
14:49
cdsp-rmo
https://github.com/IQSS/dataverse/blob/develop/scripts/migration/migration_instructions.txt#L31
14:49
pdurbin
thanks
14:49
cdsp-rmo
and it was said that "the status of the job is viewable in the import-log file"
14:49
cdsp-rmo
file we couldn't find
14:50
cdsp-rmo
:(
14:50
pdurbin
oh! one sec
14:50
* cdsp-rmo
is not moving a bit
14:51
pdurbin
I'm looking around in glassfish4/glassfish/domains/domain1/logs
14:51
pdurbin
validationLog2018...
14:51
pdurbin
cleanupLog2018
14:55
pdurbin
cdsp-rmo: do you have any files that begin with that? validation or cleanup?
14:56
pdurbin
The validationLog is created here, for example: https://github.com/IQSS/dataverse/blob/v4.8.6/src/main/java/edu/harvard/iq/dataverse/api/BatchServiceBean.java#L50
14:56
cdsp-rmo
I have to ask that, I don't have the permissions for that
14:58
pdurbin
ok
14:58
pdurbin
here's an example validation log entry on my laptop: Import Exception processing file batchImportDv/version1.xml, msg:VersionNumber 1 already exists in dataset hdl:1902.1/00012
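Since the API response does not report import failures, checking these log files is the practical way to see what happened. A minimal Python sketch, assuming read access to the Glassfish logs directory mentioned in the chat; the file name prefixes `validationLog` and `cleanupLog` come from the chat, and `non_empty_batch_logs` is a hypothetical helper name.

```python
from pathlib import Path

# Prefixes mentioned in the chat; the rest of each file name is not assumed.
LOG_PREFIXES = ("validationLog", "cleanupLog")


def non_empty_batch_logs(logs_dir):
    """Return {file name: contents} for non-empty batch import logs."""
    found = {}
    for path in sorted(Path(logs_dir).iterdir()):
        if not path.is_file():
            continue
        if path.name.startswith(LOG_PREFIXES) and path.stat().st_size > 0:
            found[path.name] = path.read_text(errors="replace")
    return found
```

Calling `non_empty_batch_logs("glassfish4/glassfish/domains/domain1/logs")` would list only the logs that actually contain entries and skip the empty ones.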
14:59
cdsp-rmo
what was the curl used ?
14:59
cdsp-rmo
and where was the file, and which path was given ?
14:59
pdurbin
all of the cleanupLog files on my laptop are empty
15:00
cdsp-rmo
:S
15:00
pdurbin
I probably wasn't using curl. I was probably running "UtilIT.migrateDataset": https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/BatchImportIT.java#L68
15:01
cdsp-rmo
oh right
15:02
pdurbin
here's migrateDataset: https://github.com/IQSS/dataverse/blob/b8090f078c28b7c571f76127e7d149f6b4ff73b1/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java#L316
15:07
cdsp-rmo
I ended up here
15:07
cdsp-rmo
https://github.com/IQSS/dataverse/blob/ffc91db0280ba9cac21d97db15d3a2dab8acc085/src/main/java/edu/harvard/iq/dataverse/api/BatchImport.java
15:08
cdsp-rmo
ah, it seems something happened
15:08
pdurbin
\o/
15:08
cdsp-rmo
we have some cleanupLog and validationLog files
15:09
pdurbin
jri is giving you access? :)
15:09
cdsp-rmo
yep
15:09
pdurbin
jri++
15:09
cdsp-rmo
Unexpected Error in handleFile(), file:temp/export.xmlmessage: temp/export.xml, caused by: null at line:
15:09
pdurbin
huh, maybe the path is wrong?
15:10
pdurbin
that's just a first thought
15:10
cdsp-rmo
right now, it's completely right because we removed the test files from the server
15:10
cdsp-rmo
^^
15:10
pdurbin
ah, ok
15:10
pdurbin
standup in 2 minutes
15:10
cdsp-rmo
gonna try some things to match with a "good" path
15:11
cdsp-rmo
hope it will be good
15:14
pdurbin
me too. no standup today, it turns out
15:17
pameyer joined #dataverse
15:24
cdsp-rmo
the hardest thing is to understand what is the root of all of this
15:24
cdsp-rmo
the basedir for the path
15:24
cdsp-rmo
:(
15:25
pdurbin
:(
15:26
pdurbin
cdsp-rmo: I think you should create a GitHub issue so we can at least document what to do.
15:26
pdurbin
It's so confusing.
15:26
cdsp-rmo
yes, sorry
15:27
cdsp-rmo
I'm so much into it I forgot how strange the subject is ^^
15:47
donsizemore
@pdurbin tell leonid i've nearly completed re-indexing odum's production dataverse in place and i'm only using 11G of active memory!
15:51
cdsp-rmo
posted my issue here : https://github.com/IQSS/dataverse/issues/4593
15:51
cdsp-rmo
shall I mention our chat and link the irc log too ?
15:58
pdurbin
cdsp-rmo: please do!
15:58
pdurbin
donsizemore: awesome. He's on vacation though.
16:19
cdsp-rmo
well, we did some tests, and this time we have nothing in the log files
16:19
cdsp-rmo
and nothing on the dataverse
16:19
cdsp-rmo
:(
16:20
cdsp-rmo
gotta go, so I stop the tests, but it's really strange
16:23
cdsp-rmo
have a nice day ! o/
16:39
jri joined #dataverse
16:53
donsizemore joined #dataverse
16:57
pameyer joined #dataverse
17:00
pameyer
cdsp-rmo: a few months ago, I took a look into importing datasets from ddi xml as part of some migration work. For what it's worth, I wasn't able to get that approach working.
17:01
pameyer
since the datasets I was working on migrating weren't in ddi/xml, I switched approaches
17:02
pameyer
donsizemore: congrats on 4.8.6
17:10
pdurbin
jri: you'll have to thank rmo for opening https://github.com/IQSS/dataverse/issues/4593 for me
18:00
pameyer joined #dataverse
18:23
donsizemore joined #dataverse
19:50
jri joined #dataverse
20:29
pameyer joined #dataverse
20:30
pdurbin
pameyer: when you have a moment, please look at https://github.com/IQSS/dataverse/issues/4396
20:32
pameyer
pdurbin: will do
20:32
pdurbin
thanks!
23:28
jri joined #dataverse