IRC log for #dataverse, 2016-04-04

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:10 garnett joined #dataverse
01:00 axfelix joined #dataverse
01:14 axfelix joined #dataverse
02:51 garnett joined #dataverse
03:46 garnett joined #dataverse
04:54 axfelix joined #dataverse
07:37 pdurbin joined #dataverse
07:54 jri joined #dataverse
09:09 bencomp joined #dataverse
12:58 yoh Good morning!  so how should I provide API token for search query? http://guides.dataverse.org/en/4.3/api/search.html  seems to be lacking any pointer for the option to use etc
12:58 yoh atm getting just  urllib2.HTTPError: HTTP Error 401: Unauthorized
13:01 yoh ok -- on http://guides.dataverse.org/en/4.3/api/dataaccess.html  found at the end "key" but seems not good enough -- the same error
13:02 yoh see e.g. http://pastebin.com/RyWTHE4F
13:03 yoh stupid me should have used proper server I guess (https://dataverse.harvard.edu)... so it works!
13:13 pdurbin yoh: you're all set? Yeah, API tokens are per-server.
13:14 yoh question:  since some persistent IDs (i.e. DOIs) are the reused generic DOI for the dataset (not pointing to dataverse), I wondered what could be the logical layout on the filesystem..
13:15 yoh i.e. originally I thought to reuse the within DOI identifier
13:15 pdurbin we only mint DOIs at the dataset level
13:15 yoh e.g. for doi:10.7910/DVN/ARKOTI  to have DVN/ARKOTI
13:15 yoh but then some are like doi:10.3886/ICPSR28683.v2
13:15 yoh so now I wonder how to organize it all
13:16 pdurbin that ICPSR one looks "harvested"...
13:17 pdurbin yeah https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.3886/ICPSR28683.v2 redirects (properly) to http://www.icpsr.umich.edu/icpsrweb/NACDA/studies/28683/version/2
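The layout idea yoh describes above (reuse the part of the persistent ID after the authority prefix as a relative directory) can be sketched as follows. This is only an illustration of that idea, not anything Dataverse provides; the function name `persistent_id_to_relpath` is hypothetical.

```python
def persistent_id_to_relpath(persistent_id):
    """Map a persistent ID such as 'doi:10.7910/DVN/ARKOTI' to a relative path.

    Hypothetical helper: drops the scheme ('doi:') and the authority
    prefix ('10.7910') and keeps the rest as the directory path.
    """
    scheme, sep, rest = persistent_id.partition(":")
    if not sep or not scheme:
        raise ValueError(f"no scheme in persistent ID: {persistent_id!r}")
    authority, sep, identifier = rest.partition("/")
    if not sep or not identifier:
        raise ValueError(f"no identifier after authority: {persistent_id!r}")
    return identifier


local_path = persistent_id_to_relpath("doi:10.7910/DVN/ARKOTI")
harvested_path = persistent_id_to_relpath("doi:10.3886/ICPSR28683.v2")
```

The two IDs from the conversation come out at different depths (`DVN/ARKOTI` versus `ICPSR28683.v2`), which is the irregularity yoh is pointing at: harvested datasets keep the DOI of their original repository, so the identifier alone does not give a uniform layout.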
13:19 yoh so -- do you see any logical way to reflect datasets "hierarchy" in a local directory structure from what I receive in json queries upon search?
13:19 pdurbin https://dataverse.harvard.edu/dataverse/icpsr is a child of https://dataverse.harvard.edu/dataverse/harvested
13:21 pdurbin yoh: in Solr I persist the position of each dataverse, dataset and file within the tree but I don't think we expose this to API users.
13:21 pdurbin If I didn't persist the position in Solr I wouldn't be able to support the "subtree" query parameter at http://guides.dataverse.org/en/4.3/api/search.html
13:22 yoh so the logical one for me would have been subtree if it was exposed to API users (like myself), correct? ;)
13:23 pdurbin Everything in Dataverse 4 is organized into a tree, so yes. I think. :)
13:24 pdurbin Under the "Dataverses" API there's a "contents" endpoint documented at http://guides.dataverse.org/en/4.3/api/native-api.html but it's a little buggy: https://github.com/IQSS/dataverse/issues/2122
13:28 yoh sorry for bombardment...  in http://guides.dataverse.org/en/4.3/api/native-api.html  can't find how to get a list of known dataverses and their ids so I could use id in $id/contents query
13:29 yoh somehow DOI's leading "group" could be used?
13:31 pdurbin yoh: the Search API has an (undocumented) query parameter (show_entity_ids=true) that I hoped would be less necessary these days.
13:31 pdurbin yoh: but for a dataverse id you can use the "alias" in addition to the database id
13:32 pdurbin the Search API refers to a dataverse alias as an "identifier" ... "chestnuttrees" at http://guides.dataverse.org/en/4.3/api/search.html
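A minimal sketch of how the query parameters mentioned so far (`key`, `subtree`, and the undocumented `show_entity_ids`) could be combined into a Search API URL. It assumes the `/api/search` endpoint with a `q` parameter as described in the Search API guide linked above; the helper name `build_search_url` and the placeholder token are hypothetical. It only builds the URL and sends no request, and it uses Python 3's `urllib.parse` rather than the `urllib2` module from yoh's traceback.

```python
from urllib.parse import urlencode


def build_search_url(server, query, api_token, subtree=None, show_entity_ids=False):
    """Build a Search API URL; the token must come from the same server."""
    params = {"q": query, "key": api_token}
    if subtree is not None:
        # Dataverse alias, which the Search API calls an "identifier".
        params["subtree"] = subtree
    if show_entity_ids:
        # Undocumented parameter mentioned by pdurbin above.
        params["show_entity_ids"] = "true"
    return f"{server.rstrip('/')}/api/search?{urlencode(params)}"


url = build_search_url(
    "https://dataverse.harvard.edu",
    "trees",
    "YOUR-API-TOKEN",  # placeholder, not a real token
    subtree="cfa",
    show_entity_ids=True,
)
```

Because the token ends up in the URL, it is worth stripping it before pasting a failing request into an issue or a pastebin.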
13:35 yoh btw tried  "https://dataverse.harvard.edu/api/dataverses/icpsr/contents?key=..." and got ERROR 500: Internal Server Error ;-)
13:36 yoh as for entity_id -- seems to be unique per dataset so I guess those are analogous to persistent_ids for datasets, not smth I could use to identify which dataverse it comes from ... just thought to avoid querying dataset's info itself at that stage
13:37 pdurbin yoh: do you get a 500 error for a dataverse that doesn't have harvested datasets?
13:37 yoh gimme any id to try ;)
13:37 pdurbin yoh: uh. "cfa"
13:38 yoh I am all newbee here
13:38 yoh for cfa got smth
13:38 pdurbin smth?
13:38 pdurbin searchbot: lucky smth
13:38 searchbot pdurbin: http://www.urbandictionary.com/define.php?term=smth
13:39 pdurbin no 500 error for cfa. good
13:39 pdurbin yoh: if you would open a github issue about icpsr and the contents API I'd appreciate it
13:41 pdurbin yoh: the entity id and the persistent id live on the same database table (dataset) so, yes, they are analogous. technically the persistent id is multiple fields but whatever :) http://phoenix.dataverse.org/schemaspy/latest/tables/dataset.html
13:41 yoh done master https://github.com/IQSS/dataverse/issues/3058
13:42 pdurbin yoh: thanks! but please regenerate your key
13:42 yoh why?
13:43 pdurbin because I see it on line 2 of your issue
13:43 yoh ha ha
13:43 yoh missed that
13:43 yoh sure
13:46 pdurbin thanks
13:48 yoh and please don't hesitate to restrain me if you feel I am placing too much of a dummy load on your servers ;)  e.g. now trying to query contents for  id 1 ;)
13:49 yoh btw -- persistentUrl (returned in /contents for a dataset) is not the same as persistentId (for datasets), right?  just that I don't see persistentId in the output
13:52 yoh indeed sounds like that #2122 ...
13:52 pameyer joined #dataverse
13:53 yoh I wonder if /contents query could have some additional option to e.g. list only dataverses or datasets, so then I could query each dataverse separately, and separately the "root" dataverse itself
14:03 yoh so some persistent ids are doi: and some hdl: ...  I guess I better wait until I get some 'subtree' entry exposed and then proceed from there... ?
14:20 pdurbin yoh: sorry. in and out. Do you have a question?
14:33 yoh pdurbin:  no problem... monday is monday... I should probably do smth else as well atm ;)  question is pretty much the same...
14:34 yoh what should be an ideal workflow for me to interact with dataverse to establish local filesystem hierarchy of dataverses/datasets, which would point to dataverse contents
14:35 yoh and I thought that if subtree was exposed, it would be most logical
14:35 yoh atm I would need to use search API, which doesn't provide any information about subtree or from which dataverse given dataset came
14:41 pdurbin yoh: maybe sleep on it if you need to but please feel free to open a GitHub issue if there's anything you need API-wise. Or email the list: https://groups.google.com/forum/#!forum/dataverse-community
14:42 pdurbin really there is no filesystem hierarchy of dataverses and datasets. it's all just expressed as who your "parent" is in the database.
14:43 pdurbin well, in the database we call it an "owner" rather than a parent: http://phoenix.dataverse.org/schemaspy/latest/tables/dvobject.html
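Since the tree exists only as "owner" links, a local directory layout would have to be derived by following those links up to the root. The sketch below shows that derivation on made-up records; the ids, the `(name, owner_id)` record shape, and the assumption that `harvested` sits directly under the root are hypothetical, and nothing here reflects how the API actually returns this information.

```python
def build_paths(objects):
    """Turn {id: (name, owner_id)} records into {id: 'root/child/...'} paths.

    owner_id is None for the root object.
    """
    paths = {}

    def path_of(obj_id, seen=()):
        if obj_id in paths:
            return paths[obj_id]
        if obj_id in seen:
            raise ValueError(f"ownership cycle at id {obj_id}")
        name, owner_id = objects[obj_id]
        if owner_id is None:
            path = name
        else:
            # Walk up to the owner first, then append this object's name.
            path = path_of(owner_id, seen + (obj_id,)) + "/" + name
        paths[obj_id] = path
        return path

    for obj_id in objects:
        path_of(obj_id)
    return paths


# Hypothetical records; only "icpsr is a child of harvested" is from the log.
example_objects = {
    1: ("root", None),
    2: ("harvested", 1),
    3: ("icpsr", 2),
}
example_paths = build_paths(example_objects)
```

With these records, `example_paths[3]` is `"root/harvested/icpsr"`, which is the kind of path a local mirror could use if the owner of each dataverse and dataset were available to API users.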
14:56 axfelix joined #dataverse
14:57 yoh pdurbin: ;) ok -- will sleep on it for a bit ;)  cheers and thanks
15:19 pdurbin yoh: you're quite welcome. Again, I was excited to see you open https://github.com/datalad/datalad/issues/393 and I'm happy to help.
17:21 cnk joined #dataverse
17:35 pdurbin pameyer: ping
17:35 pameyer pdurbin: pong
17:35 pdurbin pameyer: I'm planning on skipping this afternoon's meeting.
17:36 pameyer ok
18:02 jri joined #dataverse
20:49 pdurbin pameyer: hope you had a good meeting. Time to bike in the snow!
20:54 axfelix joined #dataverse
20:56 pameyer pdurbin: enjoy the spring-time biking
