Time
S
Nick
Message
01:23
pdurbin
woo hoo! Dataverse 4.0 is here! | Data Science - http://datascience.iq.harvard.edu/blog/dataverse-40-here
01:42
garnett joined #dataverse
03:12
balo joined #dataverse
03:19
axfelix joined #dataverse
03:26
axfelix joined #dataverse
04:38
garnett joined #dataverse
05:27
bencomp joined #dataverse
05:35
bencomp joined #dataverse
07:36
bencomp joined #dataverse
11:32
bencomp joined #dataverse
11:56
pdurbin
bencomp: https://github.com/IQSS/dataverse/issues/900#issuecomment-92781932
12:41
bencomp
pdurbin: pasting GitHub links, eh? https://github.com/IQSS/dataverse/issues/2003
12:48
pdurbin
commented!
12:49
bencomp
thanks :)
12:50
pdurbin
we apparently love to put the word "dataverse" everywhere :)
13:20
bencomp
pdurbin: did you notice that the breadcrumbs use a different source for "dataverse"? It's not "DataverseNL" in the screenshot
13:23
pdurbin
I didn't.
13:23
pdurbin
bencomp: feel free to upload more screenshots
13:32
pdurbin
bencomp: you might like this: https://apitest.dataverse.org/guides/developers/database/schemaspy/relationships.html
13:33
pdurbin
run schemaSpy on every apitest build #775 · IQSS/dataverse 184e687 - https://github.com/IQSS/dataverse/commit/184e687a33ac3ff99b8622ef858f563e61d1e393
13:35
bencomp
that's an impressively large database schema
13:37
pdurbin
I think the DVN 3.6 one may be bigger: http://dvn-vm1.hmdc.harvard.edu/schemaspy/3/3.6/schemaspy.out/relationships.html
13:37
pdurbin
yeah 74 vs 117 tables
13:40
pdurbin
jeffspies_ and LyndsySimon I should pick your brains about SHARE sometime: https://github.com/IQSS/dataverse/issues/900
13:44
LyndsySimon
pdurbin: I'm not terribly involved in SHARE.
13:47
pdurbin
LyndsySimon: ok, no worries. One of us got a demo of it a while back.
13:50
rliebz joined #dataverse
13:53
rliebz
pdurbin: Am I correct in assuming that no route from dataverse.harvard.edu should redirect to thedata.harvard.edu?
13:53
pdurbin
rliebz: hi! um.. I'm not sure
13:53
pdurbin
rliebz: did you find one?
13:54
rliebz
pdurbin: I'm trying to get the service document using https://dataverse.harvard.edu/dvn/api/data-deposit/v1.1/swordv2/service-document , but it redirects me to the old host and gives me a 404, since api v1.1 doesn't exist over there
13:55
pdurbin
huh
13:56
pdurbin
lemme try
14:00
pdurbin
rliebz: right. a 302 then a 404. uh oh.
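Editor's note: the "302 then a 404" sequence above can be checked one hop at a time. The following is a minimal sketch using only the Python standard library, not the method anyone in the channel used. The names `fetch_once` and `trace_redirects` are hypothetical helpers introduced here for illustration.

```python
import urllib.error
import urllib.parse
import urllib.request

REDIRECT_CODES = (301, 302, 303, 307, 308)


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def fetch_once(url):
    """Issue one GET and return (status, Location header or None)."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=30) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


def trace_redirects(url, fetch=fetch_once, max_hops=5):
    """Follow redirects manually and return a list of (status, url) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((status, url))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urllib.parse.urljoin(url, location)
    return hops
```

Calling `trace_redirects(...)` on the service-document URL quoted above would list each hop with its status code. The `fetch` parameter exists only so the loop can be exercised without network access.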
14:03
rliebz
pdurbin: Our 4.0 changes are still pending review right now, so this isn't barring us from making the switch, but I believe this effectively prevents the SWORD API from being used
14:04
pdurbin
rliebz: well, http://thedata.harvard.edu/dvn/ (the old site) should be in read only mode right now
14:06
rliebz
pdurbin: Yes. It seems to be working without any issues as read-only.
14:06
pdurbin
ok, good
14:06
pdurbin
but we need to get rid of that redirect, it sounds like
14:06
rliebz
pdurbin: Yes. It looks like it isn't limited to just the service document, I'm also getting it from https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
14:07
pdurbin
my guess is that it's a broad redirect
14:10
rliebz
pdurbin: Writing up an issue now
14:11
pdurbin
rliebz: that's fine. I'm also sending an email. if you're almost done I'll link the issue
14:13
rliebz
pdurbin: https://github.com/IQSS/dataverse/issues/2005
14:17
pdurbin
rliebz: commented and email sent. thanks
14:19
pdurbin
rliebz: hey, while you're here, maybe you can look at the SWORD docs a bit. I tried to clean them up last week: http://guides.dataverse.org/en/latest/api/sword.html
14:19
pdurbin
I feel like there are more known issues I should link to. Issues that you've opened.
14:20
pdurbin
rliebz: and I'd like to link to the code you're using. Maybe we can set up that new repo we talked about.
14:20
rliebz
pdurbin: Absolutely
14:21
pdurbin
rliebz: this is your latest, right? https://github.com/rliebz/dataverse-client-python/tree/4.0
14:24
rliebz
pdurbin: That is the most recent
14:32
pdurbin
rliebz: so how do you feel about me forking it to be under https://github.com/IQSS and then giving you push access? (so that all future dev can happen there)
14:33
pdurbin
(I think this is what we had talked about.)
14:34
rliebz
pdurbin: Sounds good to me. I might end up making a couple small changes before we take it into production, but the code at the 4.0 branch is pretty much ready
14:37
pdurbin
oh, interesting
14:37
pdurbin
rliebz: github knows that your fork is a fork of our old fork
14:38
pdurbin
so really I guess I should rename the old fork, as was suggested the other day
14:38
rliebz
pdurbin: Oh right
14:39
rliebz
That's probably better. GitHub is nice enough to redirect the old URLs, too
14:40
pdurbin
ok, renamed: https://github.com/IQSS/dataverse-client-python
14:48
* skay
notices a codersquid branch
14:48
pdurbin
:)
14:48
skay
forks all the way down
14:48
pdurbin
skay: you had some good ideas in there!
14:48
axfelix joined #dataverse
15:27
pdurbin
rliebz: ok, I merged your 4.0 branch to master here: https://github.com/IQSS/dataverse-client-python/commits/master
15:27
pdurbin
please take a look and let me know what you think
15:28
garnett joined #dataverse
15:29
rliebz
pdurbin: Looks good to me!
15:30
pdurbin
rliebz: so you think you'll be able to switch to developing over there? Do you want to develop on master? I'm not sure what your workflow is like.
15:34
metamattj joined #dataverse
15:35
rliebz
kpdurbin: What I plan to do is make working changes on a feature branch (4.0 was one such feature branch) and submit pull requests to IQSS/master when those features are finished. I can do those either on my own repo or IQSS, if you have any preference
15:36
rliebz
pdurbin: *
15:36
pdurbin
rliebz: feature branches are great. Can you please make them in the IQSS repo?
15:37
rliebz
pdurbin: Sure thing.
15:54
pdurbin
rliebz: I might make some pull requests for you to merge in
16:02
rliebz
pdurbin: Should be fine. Just so you know, I'm currently working on changes that will allow files to be retrieved from the native API for any version, rather than hardcoding the native route to use 'latest-published'
16:02
pdurbin
rliebz: awesome. not the files themselves but their names and descriptions etc.
16:03
rliebz
pdurbin: Right, just the list of files. The files themselves I don't think take versions
16:03
pdurbin
nope. not that I know of
16:13
pdurbin
rliebz: I have good news and bad news about SWORD.
16:16
pdurbin
http://guides.dataverse.org/en/latest/api/sword.html#list-datasets-in-a-dataverse is working for me now.
16:16
pdurbin
but I still can't retrieve the Service Document
16:18
rliebz
pdurbin: Strange
16:18
pdurbin
well...
16:18
pdurbin
I wonder if it's this: https://github.com/IQSS/dataverse/issues/784
16:20
rliebz
pdurbin: It's hanging for me now (eventually a 503), instead of the redirect. But getting the service document is something I've definitely been able to do on apitest and dataverse-demo
16:21
pdurbin
right but there were many fewer dataverses ("collections") on those servers
16:21
pdurbin
garnett: hey, are you around? we're talking SWORD
16:21
garnett
yep
16:22
garnett
thanks for the ping
16:23
garnett
what's going on, can't retrieve the service document from the new 4.0 harvard dataverse?
16:24
pdurbin
can't
16:24
pdurbin
I'm going to try to reproduce it locally.
16:24
garnett
I just reported another bug to eleni this AM
16:24
pdurbin
via github issues?
16:25
garnett
nope, just email, but I can open an issue -- thought it might just be with the one instance rather than 4.0
16:25
garnett
some of the direct download links seem to be mixed up here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/11992 (with CSV retrieving PDF and vice versa)
16:25
garnett
any chance you're having memory issues? I'm always happy to point the finger there for weird java bugs :)
16:25
pdurbin
huh
16:26
pdurbin
well, we did just boost the memory for glassfish
16:26
garnett
if you can't isolate the cause of either the mismatched links or the SWORD issues ... maybe check to see if it's leaking somewhere
16:28
pdurbin
garnett: are you able to run any SWORD tests against https://dataverse.harvard.edu ?
16:29
garnett
can I test w/o an API handshake? I don't think I've signed up for the 4.0 site yet
16:29
pdurbin
garnett: your account should have been migrated from the old site, if you had one
16:29
garnett
k, give me a sec
16:31
garnett
remind me how to retrieve token in 4.0?
16:35
pdurbin
garnett: would you like the API call or the GUI way?
16:35
garnett
API call is fine, I just went hunting for a GUI way and couldn't find it :)
16:35
pdurbin
hmm. that sounds like a usability issue
16:36
pdurbin
garnett: http://guides.dataverse.org/en/latest/user/account.html#generate-your-api-token
16:36
garnett
that worked, thanks
16:37
pdurbin
sweet
16:38
garnett
axfelix shoebox:~$ curl -u "8b2284f0-3d07-46c1-afe5-d3e93f78c967" https://dataverse.harvard.edu/dvn/api/data-deposit/v1.1/swordv2/service-document
16:38
garnett
Enter host password for user '8b2284f0-3d07-46c1-afe5-d3e93f78c967':
16:39
garnett
tried submitting a blank password or my actual DVN password, which shouldn't be necessary; got a 503 after about a minute's delay both times
16:46
pdurbin
garnett: you'll need to add a colon per http://guides.dataverse.org/en/latest/api/sword.html#new-features-as-of-v1-1
16:46
pdurbin
at least, that's how I got it to work with curl
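Editor's note: the trailing colon tells curl that the password is empty, so it stops prompting. In HTTP Basic authentication terms the API token is the username and the password is an empty string. A minimal sketch of the equivalent header in Python follows; `sword_auth_header` is a hypothetical helper name and the token shown is a placeholder, not a real credential.

```python
import base64


def sword_auth_header(api_token):
    """Build a Basic auth header with the token as username and an empty password."""
    # "token:" mirrors curl's -u "token:" form: username, colon, empty password.
    credentials = f"{api_token}:".encode("utf-8")
    encoded = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": "Basic " + encoded}


headers = sword_auth_header("example-token")
print(headers["Authorization"])
```

The resulting dictionary can be passed as request headers to whichever HTTP client is in use.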
16:48
garnett
that got rid of the password prompts but still getting the 503
16:48
bencomp joined #dataverse
16:48
pdurbin
and I'm seeing similar behavior locally
16:49
pdurbin
rliebz or garnett: would either of you care to file a bug about this?
16:50
garnett
I'm actually getting a 503 trying to hit the site now in a browser...
16:51
garnett
so I suspect it's not actually a sword issue
16:55
pdurbin
as I was saying to rliebz I think it's this: https://github.com/IQSS/dataverse/issues/784
16:55
pdurbin
"Inefficiency in constructing the Service Document" http://guides.dataverse.org/en/latest/api/sword.html#known-issues
16:57
garnett
even though I was having issues hitting the site in a browser right afterward?
16:57
rliebz
pdurbin: That could explain why it would fail on the production site but not on the test servers
17:01
pdurbin
rliebz: yeah. I'm able to reproduce the bug locally with data similar to production (close to 3000 dataverses)
17:01
pdurbin
it's just super slow to iterate over that many
17:02
pdurbin
garnett: for all I know we are killing production by trying to get the service document. maybe we should stop
17:03
garnett
geez, I sure hope these calls aren't /that/ expensive
17:03
garnett
I'm not sure it's us, but I think your prod is definitely having some growing pains right now
17:03
pdurbin
they weren't in DVN 3.6 but we have an entirely new permissions system
17:04
pdurbin
yeah, unfortunately
17:49
metamattj joined #dataverse
17:53
pdurbin
real 24m24.143s
17:55
pdurbin
so it took 24 minutes to iterate through 2901 dataverses. and I never got the Service Document. instead it said "curl: (56) SSLRead() return error -9806"
19:13
metamattj joined #dataverse
19:45
garnett joined #dataverse
19:59
bencomp joined #dataverse
20:03
garnett joined #dataverse
20:17
garnett joined #dataverse
20:23
pdurbin
rliebz and garnett I made a new ticket, pushed a quick fix, and passed it to QA: SWORD: retrieving the Service Document is not performant with ~3000 dataverses · Issue #2012 · IQSS/dataverse - https://github.com/IQSS/dataverse/issues/2012
20:23
pdurbin
I don't know if you ever rely on group assignments or not. I'm hoping not.
20:27
rliebz
pdurbin: 4.0 is still an upgrade from what our users could do before!
20:28
pdurbin
rliebz: yeah? :)
20:29
pdurbin
anyway, instead of 24 minutes it's now taking less than a second: real 0m0.505s
20:30
pdurbin
on my machine anyway :)
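Editor's note: the two `real` timings quoted in the log can be compared with a little arithmetic. This is a rough sketch only. It assumes both figures come from comparable runs, which the log does not confirm, and `parse_real` is a hypothetical helper for the `NmS.SSSs` format printed by the shell's `time`.

```python
import re


def parse_real(value):
    """Convert a `time`-style duration such as '24m24.143s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


before = parse_real("24m24.143s")  # slow run quoted in the log
after = parse_real("0m0.505s")     # run after the quick fix
dataverses = 2901                  # dataverse count quoted in the log

print(f"before: {before:.3f} s, about {before / dataverses:.3f} s per dataverse")
print(f"after:  {after:.3f} s, roughly {before / after:.0f}x faster")
```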
20:32
rliebz
pdurbin: Those are some impressive numbers
20:33
pdurbin
rliebz: we're hoping to push a new build in a few hours
20:45
pdurbin
potatoesaretasty: welcome!
20:46
pdurbin
skay: argh! shauna tricked me
20:46
skay
haha
20:47
pdurbin
the thing is, they are tasty
20:58
garnett joined #dataverse
22:47
garnett joined #dataverse