
IRC log for #dataverse, 2016-10-10

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
07:06 jri joined #dataverse
07:54 romainM joined #dataverse
08:49 bricas_ joined #dataverse
11:34 romainM pdurbin: hello! Any news about my OAI-PMH pull request?
11:34 romainM arf, not there, nvm
11:36 romainM18 joined #dataverse
12:00 romainM joined #dataverse
12:29 donsizemore joined #dataverse
17:15 jri joined #dataverse
17:26 donsizemore joined #dataverse
18:27 jri joined #dataverse
20:51 galileo_ joined #dataverse
20:51 galileo_ I have a very basic question about how to use the API
20:52 galileo_ I'd like to write a script that fetches a specific file on the Harvard Dataverse
20:52 galileo_ But I haven't found any examples in the documentation that show, from beginning to end, how to do this
20:52 galileo_ The main problem seems to be authentication
20:53 galileo_ I haven't found any guide that explains how API/session keys work, what they are, etc.
20:53 galileo_ Can anyone shed light on this?
20:54 pdurbin joined #dataverse
20:54 pdurbin galileo_: hi! Yes, you'll probably need an API token.
20:55 galileo_ Okay, I have a really basic question: what is an API token?
20:56 pdurbin romainM: can you please leave a comment on https://github.com/IQSS/dataverse/issues/3307 about it?
20:58 pdurbin galileo_: well, we didn't want people to use their usernames and passwords with the API. So we have people use a "token" instead, which looks something like this: 54b143b5-d001-4254-afc0-a1c0f6a5b5a7
20:58 pdurbin galileo_: here's how to create one: http://guides.dataverse.org/en/4.5.1/user/account.html#create-your-api-token
20:58 galileo_ So the token is tied to me, the user?
20:59 galileo_ I'm trying to write a tool that anyone - even people who don't know what the dataverse is - can use.
20:59 galileo_ It needs access to a number of specific datasets
21:03 pdurbin galileo_: ok, so you'll probably want to make an account for your tool
21:04 pdurbin if you want your tool to be able to use the Search API, for example
21:05 galileo_ If the API key is publicly known (it'll be in the git repo for the tool), is it theoretically possible for anyone else to use the same API key for their tool?
21:09 pdurbin Oh, I see. Yeah, that's not ideal. The API token should be treated like a password and not checked into a git repo. It should be an environment variable or in a config file or something.
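[Editor's note] The advice above (keep the token out of the repository and read it from an environment variable) could look roughly like the following in Python. This is a minimal sketch, not something from the conversation: the variable name `DATAVERSE_API_TOKEN` and the helper `auth_headers` are hypothetical, and the sketch assumes the token is sent in the `X-Dataverse-key` request header.

```python
import os


def auth_headers(env=os.environ, var_name="DATAVERSE_API_TOKEN"):
    """Build request headers from a token stored outside the repository.

    `var_name` is a hypothetical name chosen for this sketch. If the
    variable is unset or empty, no auth header is returned, which is
    enough for unrestricted data.
    """
    token = env.get(var_name, "").strip()
    # Assumption: the token is passed in the X-Dataverse-key header.
    return {"X-Dataverse-key": token} if token else {}
```

Each user of the tool would set the variable in their own shell, so no token ever needs to be committed.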
21:10 pdurbin galileo_: I've struggled with this myself at https://github.com/IQSS/dataverse-android/issues/1
21:11 galileo_ Does that mean that there's no safe way to distribute a tool that downloads datasets from the Dataverse?
21:13 pdurbin galileo_: well, the API token should go in a config file
21:13 pdurbin galileo_: I take it you're not building a web app. This is some sort of command line client or GUI client?
21:14 galileo_ It's a Python package, which will download the necessary datasets upon installation - so, basically, a command-line client
21:14 pdurbin ok
21:14 pdurbin galileo_: if the data is CC0 you probably don't need an API token at all
21:15 pdurbin galileo_: you'd start with the DOI and get a list of files out of the JSON. For example: https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=doi:10.7910/DVN/ARKOTI
21:16 pdurbin galileo_: and once you have the file IDs you'd download each file (again, assuming the data isn't restricted)
21:17 galileo_ The data is CC0, so that works.
21:17 pdurbin cool
21:20 pdurbin galileo_: have you created the git repo already?
21:24 galileo_ Yes, but it doesn't automatically download the data yet
21:25 pdurbin galileo_: gotcha. Well, please give us a link to the repo whenever you're ready.
21:26 galileo_ pdurbin: It's at https://github.com/gregreen/dustmaps
21:27 galileo_ pdurbin: And it's going to access this dataset: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/40C44C
21:28 pdurbin galileo_: ok, so we'll get the file ids from https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=doi:10.7910/DVN/40C44C
21:29 pdurbin curl 'https://dataverse.harvard.edu/api/datasets/:persistentId?persistentId=doi:10.7910/DVN/40C44C' | jq '.data.latestVersion.files[].dataFile.id'
21:29 pdurbin 2692345 and 2689340
21:30 pdurbin https://dataverse.harvard.edu/api/access/datafile/2692345
21:31 pdurbin https://dataverse.harvard.edu/api/access/datafile/2689340
21:31 pdurbin galileo_: no API token required
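[Editor's note] Since the tool under discussion is a Python package, the same steps (look up the dataset by DOI, pull the file IDs out of the JSON, build one download URL per file) might be sketched as follows. The function names are hypothetical, and the JSON path simply mirrors the jq filter in the log; treat this as an illustration for unrestricted data, not a definitive client.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://dataverse.harvard.edu"


def dataset_url(doi, base_url=BASE_URL):
    """URL of the dataset JSON for a DOI such as 'doi:10.7910/DVN/40C44C'."""
    # Keep ':' and '/' unescaped so the URL matches the one in the log.
    query = urllib.parse.urlencode({"persistentId": doi}, safe=":/")
    return f"{base_url}/api/datasets/:persistentId?{query}"


def file_ids(dataset_json):
    """Same path as the jq filter: .data.latestVersion.files[].dataFile.id"""
    files = dataset_json["data"]["latestVersion"]["files"]
    return [entry["dataFile"]["id"] for entry in files]


def download_url(file_id, base_url=BASE_URL):
    """Download URL for a single file ID."""
    return f"{base_url}/api/access/datafile/{file_id}"


def fetch_dataset(doi):
    """Fetch and parse the dataset JSON (performs a network request)."""
    with urllib.request.urlopen(dataset_url(doi)) as response:
        return json.load(response)
```

A caller would run `file_ids(fetch_dataset("doi:10.7910/DVN/40C44C"))` and then fetch each `download_url(...)`, for example with `urllib.request.urlretrieve`.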
21:31 galileo_ Great, that works
21:31 pdurbin :)
21:31 galileo_ pdurbin: Thanks a lot for your help!
21:32 pdurbin galileo_: oh sure. What do you think would help the next person? A FAQ about Dataverse APIs?
21:33 galileo_ pdurbin: A very simple example in the documentation, which explains what API keys are, when they are and aren't needed, and how to download a file.
21:33 galileo_ Maybe that would go well in a FAQ
21:38 pdurbin galileo_: makes sense. Thanks.
