IRC log for #dataverse, 2015-04-14

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
01:23 pdurbin woo hoo! Dataverse 4.0 is here! | Data Science - http://datascience.iq.harvard.edu/blog/dataverse-40-here
01:42 garnett joined #dataverse
03:12 balo joined #dataverse
03:19 axfelix joined #dataverse
03:26 axfelix joined #dataverse
04:38 garnett joined #dataverse
05:27 bencomp joined #dataverse
05:35 bencomp joined #dataverse
07:36 bencomp joined #dataverse
11:32 bencomp joined #dataverse
11:56 pdurbin bencomp: https://github.com/IQSS/dataverse/issues/900#issuecomment-92781932
12:41 bencomp pdurbin: pasting GitHub links, eh? https://github.com/IQSS/dataverse/issues/2003
12:48 pdurbin commented!
12:49 bencomp thanks :)
12:50 pdurbin we apparently love to put the word "dataverse" everywhere :)
13:20 bencomp pdurbin: did you notice that the breadcrumbs use a different source for "dataverse"? It's not "DataverseNL" in the screenshot
13:23 pdurbin I didn't.
13:23 pdurbin bencomp: feel free to upload more screenshots
13:32 pdurbin bencomp: you might like this: https://apitest.dataverse.org/guides/developers/database/schemaspy/relationships.html
13:33 pdurbin run schemaSpy on every apitest build #775 · IQSS/dataverse@184e687 - https://github.com/IQSS/dataverse/commit/184e687a33ac3ff99b8622ef858f563e61d1e393
13:35 bencomp that's an impressively large database schema
13:37 pdurbin I think the DVN 3.6 one may be bigger: http://dvn-vm1.hmdc.harvard.edu/schemaspy/3/3.6/schemaspy.out/relationships.html
13:37 pdurbin yeah 74 vs 117 tables
13:40 pdurbin jeffspies_ and LyndsySimon I should pick your brains about SHARE sometime: https://github.com/IQSS/dataverse/issues/900
13:44 LyndsySimon pdurbin: I'm not terribly involved in SHARE.
13:47 pdurbin LyndsySimon: ok, no worries. One of us got a demo of it a while back.
13:50 rliebz joined #dataverse
13:53 rliebz pdurbin: Am I correct in assuming that no route from dataverse.harvard.edu should redirect to thedata.harvard.edu?
13:53 pdurbin rliebz: hi! um.. I'm not sure
13:53 pdurbin rliebz: did you find one?
13:54 rliebz pdurbin: I'm trying to get the service document using https://dataverse.harvard.edu/dvn/api/data-deposit/v1.1/swordv2/service-document, but it redirects me to the old host and gives me a 404, since api v1.1 doesn't exist over there
13:55 pdurbin huh
13:56 pdurbin lemme try
14:00 pdurbin rliebz: right. a 302 then a 404. uh oh.
14:03 rliebz pdurbin: Our 4.0 changes are still pending review right now, so this isn't barring us from making the switch, but I believe this effectively prevents the SWORD API from being used
14:04 pdurbin rliebz: well, http://thedata.harvard.edu/dvn/ (the old site) should be in read only mode right now
14:06 rliebz pdurbin: Yes. It seems to be working without any issues as read-only.
14:06 pdurbin ok, good
14:06 pdurbin but we need to get rid of that redirect, it sounds like
14:06 rliebz pdurbin: Yes. It looks like it isn't limited to just the service document, I'm also getting it from https://$HOSTNAME/dvn/api/data-deposit/v1.1/swordv2/collection/dataverse/$DATAVERSE_ALIAS
14:07 pdurbin my guess is that it's a broad redirect
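
The 302-then-404 behaviour described above comes down to a redirect whose target is on a different host than the one requested. A minimal sketch of that check, assuming only Python's standard library: the helper name `redirect_leaves_host` and the example `Location` value are hypothetical (the log does not show the actual header), and the code only compares two URLs rather than making a request.

```python
from urllib.parse import urljoin, urlsplit


def redirect_leaves_host(request_url: str, location: str) -> bool:
    """Return True if a redirect's Location header points at a different host."""
    # Location may be relative, so resolve it against the requested URL first.
    target = urljoin(request_url, location)
    return urlsplit(target).hostname != urlsplit(request_url).hostname


service_document = (
    "https://dataverse.harvard.edu"
    "/dvn/api/data-deposit/v1.1/swordv2/service-document"
)
# Hypothetical Location value; the real header is not quoted in the log.
old_host_location = (
    "http://thedata.harvard.edu"
    "/dvn/api/data-deposit/v1.1/swordv2/service-document"
)

print(redirect_leaves_host(service_document, old_host_location))
print(redirect_leaves_host(service_document, "/dvn/api/some/other/path"))
```

A relative `Location` stays on the requested host, so only the first call reports a cross-host redirect.
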
14:10 rliebz pdurbin: Writing up an issue now
14:11 pdurbin rliebz: that's fine. I'm also sending an email. if you're almost done I'll link the issue
14:13 rliebz pdurbin: https://github.com/IQSS/dataverse/issues/2005
14:17 pdurbin rliebz: commented and email sent. thanks
14:19 pdurbin rliebz: hey, while you're here, maybe you can look at the SWORD docs a bit. I tried to clean them up last week: http://guides.dataverse.org/en/latest/api/sword.html
14:19 pdurbin I feel like there are more known issues I should link to. Issues that you've opened.
14:20 pdurbin rliebz: and I'd like to link to the code you're using. Maybe we can set up that new repo we talked about.
14:20 rliebz pdurbin: Absolutely
14:21 pdurbin rliebz: this is your latest, right? https://github.com/rliebz/dataverse-client-python/tree/4.0
14:24 rliebz pdurbin: That is the most recent
14:32 pdurbin rliebz: so how do you feel about me forking it to be under https://github.com/IQSS and then giving you push access? (so that all future dev can happen there)
14:33 pdurbin (I think this is what we had talked about.)
14:34 rliebz pdurbin: Sounds good to me. I might end up making a couple small changes before we take it into production, but the code at the 4.0 branch is pretty much ready
14:37 pdurbin oh, interesting
14:37 pdurbin rliebz: github knows that your fork is a fork of our old fork
14:38 pdurbin so really I guess I should rename the old fork, as was suggested the other day
14:38 rliebz pdurbin: Oh right
14:39 rliebz That's probably better. GitHub is nice enough to redirect the old URLs, too
14:40 pdurbin ok, renamed: https://github.com/IQSS/dataverse-client-python
14:48 * skay notices a codersquid branch
14:48 pdurbin :)
14:48 skay forks all the way down
14:48 pdurbin skay: you had some good ideas in there!
14:48 axfelix joined #dataverse
15:27 pdurbin rliebz: ok, I merged your 4.0 branch to master here: https://github.com/IQSS/dataverse-client-python/commits/master
15:27 pdurbin please take a look and let me know what you think
15:28 garnett joined #dataverse
15:29 rliebz pdurbin: Looks good to me!
15:30 pdurbin rliebz: so you think you'll be able to switch to developing over there? Do you want to develop on master? I'm not sure what your workflow is like.
15:34 metamattj joined #dataverse
15:35 rliebz kpdurbin: What I plan to do is make working changes on a feature branch (4.0 was one such feature branch) and submit pull requests to IQSS/master when those features are finished. I can do those either on my own repo or IQSS, if you have any preference
15:36 rliebz pdurbin: *
15:36 pdurbin rliebz: feature branches are great. Can you please make them in the IQSS repo?
15:37 rliebz pdurbin: Sure thing.
15:54 pdurbin rliebz: I might make some pull requests for you to merge in
16:02 rliebz pdurbin: Should be fine. Just so you know, I'm currently working on changes that will allow files to be retrieved from the native API for any version, rather than hardcoding the native route to use 'latest-published'
16:02 pdurbin rliebz: awesome. not the files themselves but their names and descriptions etc.
16:03 rliebz pdurbin: Right, just the list of files. The files themselves I don't think take versions
16:03 pdurbin nope. not that I know of
16:13 pdurbin rliebz: I have good news and bad news about SWORD.
16:16 pdurbin http://guides.dataverse.org/en/latest/api/sword.html#list-datasets-in-a-dataverse is working for me now.
16:16 pdurbin but I still can't retrieve the Service Document
16:18 rliebz pdurbin: Strange
16:18 pdurbin well...
16:18 pdurbin I wonder if it's this: https://github.com/IQSS/dataverse/issues/784
16:20 rliebz pdurbin: It's hanging for me now (eventually a 503), instead of the redirect. But getting the service document is something I've definitely been able to do on apitest and dataverse-demo
16:21 pdurbin right but there were many fewer dataverses ("collections") on those servers
16:21 pdurbin garnett: hey, are you around? we're talking SWORD
16:21 garnett yep
16:22 garnett thanks for the ping
16:23 garnett what's going on, can't retrieve the service document from the new 4.0 harvard dataverse?
16:24 pdurbin can't
16:24 pdurbin I'm going to try to reproduce it locally.
16:24 garnett I just reported another bug to eleni this AM
16:24 pdurbin via github issues?
16:25 garnett nope, just email, but I can open an issue -- thought it might just be with the one instance rather than 4.0
16:25 garnett some of the direct download links seem to be mixed up here: https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/11992 (with CSV retrieving PDF and vice versa)
16:25 garnett any chance you're having memory issues? I'm always happy to point the finger there for weird java bugs :)
16:25 pdurbin huh
16:26 pdurbin well, we did just boost the memory for glassfish
16:26 garnett if you can't isolate the cause of either the mismatched links or the SWORD issues ... maybe check to see if it's leaking somewhere
16:28 pdurbin garnett: are you able to run any SWORD tests against https://dataverse.harvard.edu ?
16:29 garnett can I test w/o an API handshake? I don't think I've signed up for the 4.0 site yet
16:29 pdurbin garnett: your account should have been migrated from the old site, if you had one
16:29 garnett k, give me a sec
16:31 garnett remind me how to retrieve token in 4.0?
16:35 pdurbin garnett: would you like the API call or the GUI way?
16:35 garnett API call is fine, I just went hunting for a GUI way and couldn't find it :)
16:35 pdurbin hmm. that sounds like a usability issue
16:36 pdurbin garnett: http://guides.dataverse.org/en/latest/user/account.html#generate-your-api-token
16:36 garnett that worked, thanks
16:37 pdurbin sweet
16:38 garnett axfelix@shoebox:~$ curl -u "8b2284f0-3d07-46c1-afe5-d3e93f78c967" https://dataverse.harvard.edu/dvn/api/data-deposit/v1.1/swordv2/service-document
16:38 garnett Enter host password for user '8b2284f0-3d07-46c1-afe5-d3e93f78c967':
16:39 garnett tried submitting a blank password or my actual DVN password, which shouldn't be necessary; got a 503 after about a minute's delay both times
16:46 pdurbin garnett: you'll need to add a colon per http://guides.dataverse.org/en/latest/api/sword.html#new-features-as-of-v1-1
16:46 pdurbin at least, that's how I got it to work with curl
16:48 garnett that got rid of the password prompts but still getting the 503
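
The "add a colon" advice reflects how HTTP Basic authentication is encoded: the credentials are `username:password`, so a token followed by a bare colon means "token as username, empty password". A rough sketch of the same request built in Python, assuming the standard library only: the function names and the `YOUR-API-TOKEN` placeholder are made up for illustration, and the request is constructed but deliberately not sent.

```python
import base64
import urllib.request


def basic_auth_header(api_token: str) -> str:
    """Encode the token as the Basic auth username with an empty password."""
    credentials = f"{api_token}:".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def build_service_document_request(base_url: str, api_token: str) -> urllib.request.Request:
    """Build (but do not send) a request for the SWORD Service Document."""
    url = base_url.rstrip("/") + "/dvn/api/data-deposit/v1.1/swordv2/service-document"
    request = urllib.request.Request(url)
    request.add_header("Authorization", basic_auth_header(api_token))
    return request


# Placeholder token: substitute a real API token before sending anything.
request = build_service_document_request("https://dataverse.harvard.edu", "YOUR-API-TOKEN")
print(request.full_url)
print(request.get_header("Authorization"))
```

Passing the returned object to `urllib.request.urlopen` would actually perform the call. Given the load concerns raised later in this log, that step is left out here.
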
16:48 bencomp joined #dataverse
16:48 pdurbin and I'm seeing similar behavior locally
16:49 pdurbin rliebz or garnett: would either of you care to file a bug about this?
16:50 garnett I'm actually getting a 503 trying to hit the site now in a browser...
16:51 garnett so I suspect it's not actually a sword issue
16:55 pdurbin as I was saying to rliebz I think it's this: https://github.com/IQSS/dataverse/issues/784
16:55 pdurbin "Inefficiency in constructing the Service Document" http://guides.dataverse.org/en/latest/api/sword.html#known-issues
16:57 garnett even though I was having issues hitting the site in a browser right afterward?
16:57 rliebz pdurbin: That could explain why it would fail on the production site but not on the test servers
17:01 pdurbin rliebz: yeah. I'm able to reproduce the bug locally with data similar to production (close to 3000 dataverses)
17:01 pdurbin it's just super slow to iterate over that many
17:02 pdurbin garnett: for all I know we are killing production by trying to get the service document. maybe we should stop
17:03 garnett geez, I sure hope these calls aren't /that/ expensive
17:03 garnett I'm not sure it's us, but I think your prod is definitely having some growing pains right now
17:03 pdurbin they weren't in DVN 3.6 but we have an entirely new permissions system
17:04 pdurbin yeah, unfortunately
17:49 metamattj joined #dataverse
17:53 pdurbin real 24m24.143s
17:55 pdurbin so it took 24 minutes to iterate through 2901 dataverses. and I never got the Service Document. instead it said "curl: (56) SSLRead() return error -9806"
19:13 metamattj joined #dataverse
19:45 garnett joined #dataverse
19:59 bencomp joined #dataverse
20:03 garnett joined #dataverse
20:17 garnett joined #dataverse
20:23 pdurbin rliebz and garnett I made a new ticket, pushed a quick fix, and passed it to QA: SWORD: retrieving the Service Document is not performant with ~3000 dataverses · Issue #2012 · IQSS/dataverse - https://github.com/IQSS/dataverse/issues/2012
20:23 pdurbin I don't know if you ever rely on group assignments or not. I'm hoping not.
20:27 rliebz pdurbin: 4.0 is still an upgrade from what our users could do before!
20:28 pdurbin rliebz: yeah? :)
20:29 pdurbin anyway, instead of 24 minutes it's now taking less than a second: real 0m0.505s
20:30 pdurbin on my machine anyway :)
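
For a sense of scale, the two `real` timings quoted in this log can be compared directly. This is a small sketch, assuming the `NmS.SSSs` format shown above; `parse_real` is a hypothetical helper, and the per-dataverse figure assumes the whole 24 minutes went into iterating over the 2901 dataverses mentioned earlier.

```python
import re


def parse_real(value: str) -> float:
    """Convert a shell `time` value such as '24m24.143s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


before = parse_real("24m24.143s")  # timing reported before the fix
after = parse_real("0m0.505s")     # timing reported after the fix
dataverses = 2901                  # count quoted in the log

print(f"before: {before:.3f} s, after: {after:.3f} s")
print(f"roughly {before / dataverses:.2f} s per dataverse before the fix")
print(f"speedup of roughly {before / after:.0f}x")
```
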
20:32 rliebz pdurbin: Those are some impressive numbers
20:33 pdurbin rliebz: we're hoping to push a new build in a few hours
20:45 pdurbin potatoesaretasty: welcome!
20:46 pdurbin skay: argh! shauna tricked me
20:46 skay haha
20:47 pdurbin the thing is, they are tasty
20:58 garnett joined #dataverse
22:47 garnett joined #dataverse
