Time
S
Nick
Message
00:23
djbrooke joined #dataverse
06:38
djbrooke joined #dataverse
07:38
jri joined #dataverse
11:09
kzisme joined #dataverse
12:37
djbrooke joined #dataverse
12:56
pameyer joined #dataverse
13:20
sekmiller joined #dataverse
13:31
pdurbin
sekmiller: good morning. You're off the hook: https://github.com/IQSS/dataverse/issues/3326#issuecomment-244365474 :)
13:32
bsilverstein joined #dataverse
13:37
pdurbin
bsilverstein: oh good, you're here. I having trouble replicating the math challenge bug.
13:38
pdurbin
I'm*
13:38
bsilverstein
pdurbin: is it appearing correctly for you?
13:38
pdurbin
bsilverstein: can you please swing by?
13:39
bsilverstein
I actually didn't even know about the math challenge until sitting in on QA and Kevin kind of arbitrarily looked to see if it was working and it happened to not be
13:39
bsilverstein
did not repeatedly recreate that one
13:39
bsilverstein
pdurbin: of course!
13:39
pdurbin
it'll be easier to explain if you look over my shoulder
13:40
pdurbin
pameyer: if bsilverstein and I can finish this up maybe I can swing by your office this afternoon.
13:50
pameyer
pdurbin: bmckinney's off-site today, so you might get a better preview next week
13:57
djbrooke joined #dataverse
14:09
djbrooke joined #dataverse
14:12
pdurbin
bsilverstein: always remember http://s3.amazonaws.com/giles/demons_010609/wtfm.jpg
14:23
nicholas_ joined #dataverse
14:36
pdurbin
nicholas_: oh, hey, were your ears burning this morning? :)
14:41
romainM joined #dataverse
14:41
romainM
hello ?
14:42
pdurbin
romainM: good morning
14:43
romainM
hey, long time no see !
14:43
romainM
but I'm back with a problem with our "pre-release" dataverse
14:44
pdurbin
pre-release? you mean you haven't gone live yet?
14:44
romainM
something with the xml generated and sent to datacite when I try to publish a dataset
14:44
romainM
well, the dataverse is live but not deployed
14:44
romainM
there is still the big data import to do
14:45
romainM
and given that there are several labs to deal with, it was kinda slow to process all the data
14:45
romainM
now I can import all the data in draft mode, but cannot publish it
14:45
romainM
but I can publish datasets created manually
14:45
romainM
:S
14:46
pdurbin
I'm confused. Let me read that again.
14:46
romainM
when the publish fails, I get some logs
14:46
romainM
especially this one : Response code: 400, [xml] xml error: The entity "nbsp" was referenced, but not declared.
14:46
romainM
oh sorry
14:46
romainM
I'm just getting a little bit crazy with all my imports probs ...
14:46
pdurbin
romainM: which version are you running? 4.5?
14:47
romainM
4.4
14:47
romainM
we haven't updated yet
14:48
pdurbin
romainM: I haven't upgraded https://apitest.dataverse.org to 4.5 yet if you'd like to try to reproduce the bug there.
14:48
pdurbin
It's still running 4.4.
14:48
romainM
ok
14:49
pdurbin
please let me know if you see the bug there too
14:50
pdurbin
romainM: were you saying you think the bug has to do with using DataCite rather than EZID? That apitest server is configured to use EZID.
14:50
romainM
well
14:50
romainM
the problem seems to come from an xml file
14:51
romainM
and given that, from what I understood, publishing includes sending an xml to datacite (with metadata)
14:51
romainM
I thought it was due to that
14:53
romainM
I'm importing data
14:53
romainM
will try to publish in 1-2 mins
14:53
pdurbin
romainM: oh, ok. Yes, makes sense. You're right that publishing involves sending metadata to DataCite regardless of if you're using EZID or DataCite for DOIs. From what I understand. sekmiller is the expert in this area.
14:53
romainM
I was trying to get that xml
14:53
romainM
but didn't find any way yet
14:54
romainM
the only solution I had was to try to intercept datas sent by dataverse server but ...
14:54
romainM
would take some time and skill :D
14:54
romainM
I'm gonna try a publish now
14:55
romainM
ok, it published
14:55
romainM
how does EZID work with publishing ?
14:56
pdurbin
Dataverse registers the DOI with EZID. (Out of the box.)
14:56
pdurbin
And Dataverse won't let you publish if the DOI hasn't been registered.
14:57
romainM
oh, and the error message for the publish error is this:
14:57
romainM
Error – This dataset may not be published because the DataCite Service is currently inaccessible. Please try again. Does the issue continue to persist? Please contact Dataverse Support for assistance.
14:58
pdurbin
romainM: are you seeing that error on the apitest server?
14:58
romainM
given that datacite is accessible
14:58
romainM
nono
14:58
romainM
it worked on the api server
14:58
pdurbin
huh
14:58
romainM
this message comes from our dataverse
14:59
pdurbin
looks like it's coming from dataset.publish.error.datacite
15:00
romainM
the full "event" log
15:00
romainM
2764 INFO retreived version: id: 257, state: DRAFT(details) edu.harvard.iq.dataverse.DatasetPage 2 sept. 2016 16:22:49.849 {levelValue=800, timeMillis=1472826169849} 2761 SEVERE This dataset may not be published because the <a href="http://status.datacite.org/ " title="DataCite ... (details) edu.harvard.iq.dataverse.DatasetPage 2 sept. 2016 16:22:49.729 {levelValue=1000, timeMillis=1472826169729} 2760 WARNING javax.ejb.TransactionRolledb
15:00
romainM
hum
15:00
romainM
not really readable ...
15:01
pdurbin
romainM: can you please attach your server.log in an email to support@dataverse.org ?
15:01
romainM
yes
15:05
jri joined #dataverse
15:06
pdurbin
please mention that you're running 4.4
15:08
romainM
done
15:09
romainM
I mentioned the version, datacite use and other things (scripted imported datasets, etc)
15:12
pdurbin
https://help.hmdc.harvard.edu/Ticket/Display.html?id=240514
15:12
pdurbin
romainM: thanks
15:12
pdurbin
it says this: Caused by: java.lang.RuntimeException: Response code: 400, [xml] xml error: The entity "nbsp" was referenced, but not declared... at edu.harvard.iq.dataverse.DataCiteRESTfullClient.postMetadata(DataCiteRESTfullClient.java:183)
15:12
romainM
thank you
15:13
romainM
yes, that's what makes me say it's an xml problem
15:16
pdurbin
here's where the RuntimeException is thrown: https://github.com/IQSS/dataverse/blob/v4.4/src/main/java/edu/harvard/iq/dataverse/DataCiteRESTfullClient.java#L183
15:16
pdurbin
romainM: how did you create this dataset? You imported it somehow?
15:17
romainM
I create a "simple" dataset with the python api for dataverse
15:17
romainM
then I update the metadata of the dataset with a json
15:18
pdurbin
interesting
15:18
romainM
(the update is also made with python script)
15:18
romainM
the data comes from different xlsx files
15:19
pdurbin
romainM: and when you tested against https://apitest.dataverse.org just now you also used these scripts?
15:19
romainM
given by the labs
15:19
romainM
yes
15:19
pdurbin
huh
15:19
romainM
testGrim Dataverse
15:19
pdurbin
"works on my machine" ;)
15:19
romainM
it's the testing dataverse I created
15:19
romainM
^^
15:19
djbrooke joined #dataverse
15:20
pdurbin
it looks like you were able to publish a dataset there: https://apitest.dataverse.org/dataverse/testgrim
15:20
romainM
just wondering about the difference between the ezid and datacite systems
15:20
romainM
yes
15:20
pdurbin
but not on your server. hmm
15:20
pdurbin
I wonder what's different.
15:20
romainM
datacite seems to "publish" a doi only when you post metadata
15:21
romainM
a xml file
15:21
romainM
(if I understood correctly what I tried)
15:21
romainM
and it seems the xml generated for this post is ... problematic
15:21
romainM
maybe a bad encoding or something
15:21
romainM
I ran into multiple problems with my imports
15:21
romainM
and I even could do some "strange" things
15:22
pdurbin
I'm not quite sure how it works. sekmiller would know. And there's a pull request, a refactoring I think, being worked on at https://github.com/IQSS/dataverse/pull/3146
15:22
romainM
I made a github case
15:22
romainM
for one case
15:22
pdurbin
a github case? a github issue?
15:22
romainM
issue sorry
15:22
pdurbin
just now?
15:22
romainM
nono
15:23
romainM
some weeks ago
15:23
pdurbin
which number please?
15:24
romainM
searching for it
15:24
romainM
(forgot password, had to reset, blablabla ^^")
15:24
romainM
https://github.com/IQSS/dataverse/issues/3186
15:24
romainM
I could "duplicate" some fields
15:24
romainM
in the metadata
15:25
romainM
as you can see in the last picture, 3 "kind of data" titles appear
15:25
romainM
should not be possible, no ?
15:48
pdurbin
romainM: this reminds me of a different bug. related
15:49
pdurbin
romainM: but the "can do strange things" issue is different than the one we've been talking about, right?
15:50
romainM
yes yes
15:51
romainM
it's just that, maybe, some "bugs" could pass the metadata validation for a dataset
15:51
romainM
and cause some problems with the xml generation for datacite
15:51
pdurbin
romainM: but the new issue cannot be reproduced on the apitest server running 4.4
15:51
romainM
yes, but it seems related to the datacite use
15:52
romainM
you don't use any datacite dataverse ?
15:52
pdurbin
right. apitest uses EZID instead of datacite for DOIs
15:52
pdurbin
romainM: are you asking if we have any servers configured to use DataCite instead of EZID? I don't know.
15:52
romainM
that's why I'm asking
15:53
romainM
yes :)
15:53
romainM
*what I'm asking
15:53
romainM
I don't see any other option to test this
15:53
romainM
or
15:53
romainM
if there is a way to get the xml generated for datacite
15:54
romainM
don't see other ways to find out :S
15:55
pdurbin
it's something with &nbsp; right? I'm looking at http://stackoverflow.com/questions/9126999/how-to-handle-html-entity-nbsp-in-xslt-without-changing-the-input-file
15:56
djbrooke joined #dataverse
15:56
romainM
something with " " right ? missing something or ?..
15:56
romainM
(looking at the issue)
15:57
romainM
ah
15:57
romainM
"&" nbsp ?
15:57
romainM
those are the posts I saw concerning the problem
15:58
romainM
but if that's it, it happens during the metadata => xmlForDatacite step
15:58
romainM
maybe with the "entity" thing it could work ...
15:58
pdurbin
romainM: try it :)
15:58
romainM
but I don't have control over that
15:59
romainM
but I'll try something
15:59
romainM
I can access the datacite api
16:00
romainM
I'll make a script testing with and without an nbsp entity in the xml, with or without this option "entity" thing
16:00
romainM
should trigger the error
16:00
pdurbin
cool
16:00
romainM
I won't be able to do it now or this week end, I'll try monday
16:00
pdurbin
monday is a holiday for us anyway :) labor day
16:00
romainM
will give you the results
16:00
romainM
ah
16:00
romainM
not for me
16:00
pdurbin
so take your time :)
16:01
romainM
yep
16:01
romainM
:D
16:05
garnett joined #dataverse
16:05
pameyer joined #dataverse
16:16
pdurbin
nicholas_: still there?
16:20
romainM
well, finally had time to do it
16:20
romainM
I reproduced the xml error
16:20
romainM
I just had to add a &nbsp; in the xml to mess it up
16:22
pdurbin
romainM: you can reproduce it on apitest?
16:23
romainM
hum
16:23
romainM
I can't really do that
16:23
romainM
given that I used the datacite api for this
16:23
pdurbin
oh, oh
16:23
pdurbin
makes sense
16:24
romainM
the only thing that blocks me
16:24
pdurbin
but if you could get similar data into Dataverse...
16:24
romainM
is that I can't declare &nbsp;
16:24
romainM
because of a "Message: Content is not allowed in prolog"
16:24
romainM
as if I couldn't define things in the xml sent :S
16:24
romainM
trying to figure out why
16:26
romainM
oh
16:27
romainM
it passed
16:27
romainM
got an xml with the "nbsp
16:27
romainM
&
16:27
romainM
nbsp
16:27
romainM
had to add <!DOCTYPE space[ <!ENTITY nbsp " "> ]>
16:27
romainM
at the beginning of the xml file
16:27
romainM
arf
16:28
romainM
between the ""
16:28
pdurbin
romainM: can you reproduce a bug on apitest?
16:28
romainM
there is & nbsp
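[Editor's note] The behaviour described above can be reproduced without touching DataCite at all. The following is a minimal sketch in Python 3 using only the standard library parser; the `<title>` element and its text are made up for illustration, and `&#160;` is used as the entity value here (the Stack Overflow approach), which may differ from what was actually typed in the channel. It only shows that a generic XML parser rejects an undeclared `&nbsp;` and accepts it once a DOCTYPE declares the entity; it says nothing about how DataCite or Dataverse parse XML internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real metadata XML sent to DataCite is not shown in the log.
BODY = "<title>Kind&nbsp;of&nbsp;data</title>"

# Internal DTD subset that declares nbsp as the non-breaking space character.
DOCTYPE = '<!DOCTYPE title [ <!ENTITY nbsp "&#160;"> ]>'

# Same text using a numeric character reference, which needs no declaration.
NUMERIC_BODY = "<title>Kind&#160;of&#160;data</title>"


def parses(xml_text):
    """Return True if the text is accepted by the parser, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


undeclared_ok = parses(BODY)            # &nbsp; used but never declared
declared_ok = parses(DOCTYPE + BODY)    # &nbsp; declared in the DOCTYPE
numeric_ok = parses(NUMERIC_BODY)       # &#160; works without any DOCTYPE

print(undeclared_ok, declared_ok, numeric_ok)
```

Only the five entities `&lt;`, `&gt;`, `&amp;`, `&apos;` and `&quot;` are predefined in XML, which is why an HTML entity such as `&nbsp;` has to be declared or replaced by a numeric reference.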
16:28
romainM
I can't do that
16:28
romainM
if you don't use datacite
16:28
pdurbin
right, right. bummer
16:28
romainM
^^
16:29
pdurbin
sekmiller: do we have any servers set up with DataCite for DOI?
16:30
pdurbin
romainM: you could provide us with some JSON to reproduce the bug on a server configured for DataCite rather than EZID? JSON to create a dataset?
16:30
romainM
yes
16:31
romainM
should I send it to the same email ?
16:31
pdurbin
romainM: maybe a GitHub issue would be better.
16:32
romainM
ok
16:32
pdurbin
thanks!
16:32
romainM
should I re-explain the problem ?
16:32
pdurbin
romainM: yes, please!
16:32
romainM
(asking cause I have to go in a few)
16:32
romainM
ok
16:32
pdurbin
romainM: no rush. we're off monday :)
16:33
romainM
well
16:33
pdurbin
pameyer: next week is fine
16:36
pameyer
pdurbin: great
16:37
pameyer
romainM: quick question - are you seeing the problem generating the xml, or when it gets sent to datacite?
16:37
romainM
the generation
16:37
romainM
when I toggle an nbsp element in the xml
16:37
romainM
I get an error with it, no error without
16:38
romainM
and the error message is the one in the dataverse log
16:38
romainM
(the nbsp not defined thing)
16:38
romainM
I think the problem is a combination of imported data
16:38
pameyer
ok - I'd been wondering if there was a difference in validation between datacite and ezid, but it looks like this is unrelated
16:38
romainM
(dunno how dataverse keeps it, encoding, etc)
16:38
romainM
because it's only when data is imported
16:39
romainM
that this problem happens
16:39
romainM
with datasets created "by hand", it works
16:39
romainM
maybe ezid doesn't need metadata related to the dataset ?
16:39
romainM
and only do a redirection ?
16:39
romainM
dunno how ezid works
16:39
romainM
for datacite, it needs metadata first
16:40
romainM
it won't make a redirection link entry if no metadata is given
16:40
romainM
and there goes the xml file
16:40
romainM
for the metadata
16:41
pameyer
when I've used ezid, I've always passed metadata along with the creation request
16:43
romainM
if you have an xml example for ezid
16:43
romainM
could be useful to compare, in case the structures differ
16:47
pameyer
I don't have an example from dataverse - but ezid uses (or can accept) datacite xml metadata
16:47
pdurbin
romainM: this issue is a good start but we need more details, I think: Publishing fails for script-added dataset with a Dataverse using Datacite · Issue #3328 · IQSS/dataverse - https://github.com/IQSS/dataverse/issues/3328
16:47
romainM
I'm adding
16:47
romainM
the validation was a mistake
16:47
pdurbin
romainM: ok, great. We need to know how to reproduce it, etc.
16:48
romainM
adding the files too
16:48
romainM
with log
16:48
romainM
json
16:48
pdurbin
perfect. thanks!
16:49
romainM
hum
16:49
romainM
if you want to reproduce it
16:49
romainM
you need scripts for the json import ?
16:49
romainM
or json will be enough ?
16:50
pdurbin
romainM: meh, just the JSON is fine. The JSON will have nbsp stuff in it?
16:50
romainM
well, "hidden" nbsp
16:50
pdurbin
ok
16:50
romainM
because my datasets were uploaded with exactly this kind of json
16:51
romainM
this exact json, actually
16:51
romainM
it's the output of one of them
16:51
pdurbin
cool. should be enough to reproduce the bug
16:51
romainM
ok
16:51
romainM
I updated
16:51
romainM
there is the log
16:51
romainM
and the json
16:51
romainM
I added the xml line
16:52
romainM
oh
16:52
romainM
now that I think about it, the "space" was interpreted in the post ><
16:53
pdurbin
romainM: I don't see "nbsp" in the JSON .
16:53
romainM
because there is none
16:53
romainM
what I upload
16:53
romainM
doesn't have this
16:53
romainM
but when the xml is made
16:53
romainM
I don't know why, dataverse seems to add it
16:53
pdurbin
huh
16:54
romainM
that's the point I don't understand
16:54
romainM
now sorry, I really have to go
16:54
romainM
but I can be on my phone
16:54
pdurbin
romainM: have a good weekend! thanks!
16:54
romainM
(I won't have access to computer, that's my point)
16:55
RomainMPhone joined #dataverse
16:56
RomainMPhone
So like I said, there is no nbsp in my file, maybe an encoding problem at some point ?
16:58
RomainMPhone joined #dataverse
16:58
RomainMPhone
Sorry, connection switch :S
17:00
djbrooke joined #dataverse
17:01
pdurbin
RomainMPhone: heh, no worries. I'm impressed by your dedication! :)
17:07
djbrooke joined #dataverse
17:07
metamattj joined #dataverse
17:17
djbrooke joined #dataverse
17:28
RomainMPhone joined #dataverse
17:29
RomainMPhone
It's also because I want it to work :D
17:29
RomainMPhone
There is a pseudo deadline for dataset publication around September 13
17:31
pdurbin
!
17:31
pdurbin
djbrooke: not enough time to roll a fix into 4.5.1
17:32
RomainMPhone95 joined #dataverse
17:33
RomainMPhone95
Well
17:33
RomainMPhone95
I got a solution for that
17:33
RomainMPhone95
"In case"
17:34
RomainMPhone95
But I would prefer a clean way, yeah :D
17:37
pdurbin
RomainMPhone95: is your solution a pull request? :)
17:45
pdurbin
pameyer: dunno if you remember what was on http://dataverse.org/releases-roadmap but we just removed one of the tabs and now we're promising nothing with regard to large scale data or whatever by the fall
17:52
pameyer
4.6 and 4.7 are fall?
17:58
pdurbin
hmm, good question
17:58
pdurbin
I imagine we'll ship *something* in the fall!
18:03
RomainMPhone joined #dataverse
18:03
RomainMPhone
No, my solution is not a pull request ... it's an ugly python script :D
18:04
RomainMPhone
For a pull request, I should do clean code and tests ... omg ! :D
18:05
RomainMPhone
And my java skills are kinda rusty now :S
18:05
pameyer
RomainMPhone - I think you're on the right track about it being an encoding issue
18:05
RomainMPhone
But if this is my last resort (tudu :D), i'll give it a look
18:06
RomainMPhone
I'll try to analyse my data first
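[Editor's note] One way such an analysis might look, as a rough Python 3 sketch. It rests on an assumption the log does not confirm: that the "hidden" nbsp is either a literal U+00A0 character or the literal text `&nbsp;` inside a JSON string value (for example, carried over from the xlsx files). The function name `find_nbsp` and the sample document are hypothetical and are not the JSON attached to issue #3328.

```python
import json

# Assumed suspects: the non-breaking space character and the HTML entity text.
SUSPECTS = ("\u00a0", "&nbsp;")


def find_nbsp(node, path="$"):
    """Walk parsed JSON and return the paths of string values containing a suspect."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_nbsp(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_nbsp(value, f"{path}[{index}]"))
    elif isinstance(node, str):
        if any(suspect in node for suspect in SUSPECTS):
            hits.append(path)
    return hits


# Hypothetical sample, not real Dataverse metadata.
sample = json.loads(
    '{"title": "Kind\\u00a0of data", "notes": ["plain", "a&nbsp;b"], "count": 3}'
)
hits = find_nbsp(sample)
print(hits)
```

A text search for "nbsp" in the JSON file would miss the U+00A0 case, since that character looks like an ordinary space in most editors.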
18:06
donsizemore joined #dataverse
18:07
RomainMPhone
Just in case (even if it's clearly the case,99% chance given all we got now)
18:07
RomainMPhone
Do you know what encoding dataverse uses ?
18:08
pameyer
from a very quick look, everything was utf-8 internally
18:08
RomainMPhone
I mean, do you have a "clear" encoding used like utf8 ?
18:08
RomainMPhone
Ok
18:08
pameyer
I'm not the best person to have an opinion on it though
18:08
RomainMPhone
My python output is utf8 ...
18:08
pameyer
are you using python2?
18:09
RomainMPhone
2.7
18:09
pameyer
this sounds similar to something I ran across a while back....
18:09
pameyer
this may not be your problem, but I ran into issues with utf8 and python2 urllib
18:10
pameyer
from memory (aka - exact details may be off), with python2 utf8 encoded strings sent to remote urls were translated to ascii inside one of the libraries, and that was causing problems
18:11
pameyer
maybe python3 or python "requests" library would help you?
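[Editor's note] Independent of which HTTP library is used, one way to rule out implicit encoding conversions is to build the request body as explicit UTF-8 bytes before sending it. This is a minimal Python 3 sketch using only the standard `json` module; the metadata dictionary is made up, and no request is actually sent. It is not taken from the import scripts discussed here.

```python
import json

# Hypothetical metadata containing an accented letter and a non-breaking space.
metadata = {"title": "Donn\u00e9es\u00a0brutes"}

# Option 1: keep the characters as-is and encode the body explicitly as UTF-8 bytes.
utf8_payload = json.dumps(metadata, ensure_ascii=False).encode("utf-8")

# Option 2: the default (ensure_ascii=True) escapes every non-ASCII character
# as \uXXXX, so the body is plain ASCII and survives any ASCII-only step.
ascii_payload = json.dumps(metadata)

# Both forms decode back to the same data.
round_trip_utf8 = json.loads(utf8_payload.decode("utf-8"))
round_trip_ascii = json.loads(ascii_payload)

print(utf8_payload)
print(ascii_payload)
```

Either body can then be passed to the HTTP client as raw data together with an explicit `Content-Type: application/json; charset=utf-8` header.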
18:11
pdurbin
oh, there's a chance this isn't a bug in Dataverse?
18:11
pameyer
but if you're on your phone - probably not something you can check
18:11
pameyer
pdurbin: potentially, it could be something else
18:12
* pdurbin
puts his feet up
18:12
djbrooke_ joined #dataverse
18:13
RomainMPhone
I'll take a look
18:13
RomainMPhone
Dinner time :S
18:13
RomainMPhone
Ty anyway
18:14
pdurbin
thanks
18:16
pdurbin
pameyer: I love https://github.com/IQSS/dataverse/pull/3329 but I hate that there are no tests for you to change to make sure they continue to pass. :/
18:16
RomainMPhone joined #dataverse
18:17
RomainMPhone
Couldn't restrain myself : I do use requests
18:17
RomainMPhone
(Go really away)
18:19
pdurbin
donsizemore: ping!
18:35
djbrooke joined #dataverse
18:51
djbrooke joined #dataverse
19:02
djbrooke_ joined #dataverse
19:03
djbrooke_ joined #dataverse
19:04
djbrooke_ joined #dataverse
19:20
jri joined #dataverse
19:29
jri_ joined #dataverse
19:45
jri joined #dataverse
20:12
pdurbin
pameyer: what time should I show up on Tuesday for the pre-demo demo?
20:14
pameyer
did bmckinney email you?
20:14
pdurbin
we've been slacking or whatever
20:14
pameyer
ah - gotcha
20:14
pdurbin
perhaps it's time to start a channel
20:17
djbrooke joined #dataverse
20:19
jri joined #dataverse
20:22
djbrooke joined #dataverse
20:26
jri_ joined #dataverse
20:27
garnett joined #dataverse
20:35
djbrooke joined #dataverse
23:03
agarnett joined #dataverse
23:17
djbrooke joined #dataverse