IRC log for #dataverse, 2019-08-09

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
11:57 donsizemore joined #dataverse
12:14 donsizemore @pdurbin on your dataverse talk: as Dmitri says, fleshed-out examples of which identifiers for which endpoint would be very useful for newbies and oldies alike
12:35 pdurbin_m joined #dataverse
12:36 pdurbin_m donsizemore: instead of database IDs for datasets? Show DOI examples throughout?
12:53 pdurbin_m What do you mean by "identifiers"?
13:22 pdurbin donsizemore: I have at least two new crazy ideas about API docs if you'd like to hear them. :)
13:31 donsizemore @pdurbin hey hey, back from coffee and three minor crises
13:32 pdurbin so four crises, really... no coffee
13:33 donsizemore @pdurbin when cheryl and i fiddle with file upload api for instance, native vs sword vs zipupload may want dbid or persistentId, sometimes at the end, and yes consistent usage and documentation would be killer
13:33 pdurbin Right. Are you still in favor of full curl examples throughout?
13:34 donsizemore @pdurbin yes and i think i volunteered to do that, didn't i. but even then the usage is different for different endpoints
13:34 donsizemore what are your not-so-crazy API doc ideas?
13:34 pdurbin Huh. How is the usage different?
13:36 donsizemore in http://guides.dataverse.org/en/latest/api/native-api.html#add-a-file-to-a-dataset for instance the two examples conflict
13:39 pdurbin woof, they sure do
13:40 pdurbin This is a case where someone wrote the short example and instead of deleting it I added a full curl example below it.
13:40 pdurbin Should I just delete the short one? I don't think it adds much value, to be honest.
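[Editor's note] The identifier confusion discussed above (database ID vs. persistent ID, and where the identifier goes) can be illustrated with a small sketch. It only builds the two URL forms the Native API guide documents for "add a file to a dataset"; the server name, the database ID 24, and the DOI are made-up placeholders, and the helper name `add_file_url` is hypothetical, not part of Dataverse or pyDataverse.

```python
from urllib.parse import urlencode


def add_file_url(server, dataset_id=None, persistent_id=None):
    """Build the native API URL for adding a file to a dataset.

    Exactly one of dataset_id (database ID) or persistent_id (e.g. a DOI)
    must be given. Hypothetical helper for illustration only.
    """
    if (dataset_id is None) == (persistent_id is None):
        raise ValueError("pass exactly one of dataset_id or persistent_id")
    base = server.rstrip("/") + "/api/datasets"
    if dataset_id is not None:
        # Database ID form: the ID sits in the path.
        return f"{base}/{dataset_id}/add"
    # Persistent ID form: a literal ":persistentId" in the path,
    # with the actual identifier passed as a query parameter.
    return f"{base}/:persistentId/add?" + urlencode({"persistentId": persistent_id})


# Placeholder values, not real datasets.
print(add_file_url("https://dataverse.example.edu", dataset_id=24))
print(add_file_url("https://dataverse.example.edu",
                   persistent_id="doi:10.5072/FK2/EXAMPLE"))
```

The request itself would still need the API token header and the multipart file body shown in the guide's full curl example; those are left out here.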
13:40 donsizemore but that's just me being whiny. now that i ran to the restroom and thought about it,
13:41 donsizemore the real issue for us with the API is that our archivists are on Windows. so you open up bash for Win10 and they're all "whut"
13:41 donsizemore not a real issue, but depending on the audience for your talk, you might try to suss out their starting points early on?
13:41 pdurbin Hmm. Does pyDataverse work on Windows?
13:42 donsizemore i'm thinking i'd like to test that
13:42 pdurbin I know it shells out to curl for file upload right now so I'll bet a beer that doesn't "just work" for most people on Windows. :)
13:42 pdurbin I think I promised Stefan a pull request. :)
13:43 donsizemore the syntax was a little different but it was last week so i don't remember exactly how. i can pester cheryl for her cheat sheet we started
13:43 pdurbin I bet I can get it working from requests.
13:43 donsizemore because we did get the file upload API working from her windows box, just ran into network timeouts
13:43 pdurbin That's awesome that you got it to work.
13:43 pdurbin Don't forget about Jim's DVUploader.
13:43 pdurbin He just put out a new release yesterday.
13:43 donsizemore that's a java jar, right?
13:43 pdurbin yeah
13:44 donsizemore equal discomfort for an archivist used to a windows desktop
13:44 pdurbin Sure.
13:44 pdurbin What do they want? An exe with a GUI?
13:44 donsizemore so, i'm just talking off the top of my head about your talk, but what's actionable from that?
13:45 donsizemore a java GUI might be a pain in the butt but would be quite popular
13:45 donsizemore you've already got the web interface for most of it
13:45 pdurbin I'm going to act on a lot of this conversation. You're good. I don't need anything from you. Except that I do still have those two crazy ideas in my head and they want to escape.
13:46 donsizemore um, i forget if we had a "full curl examples" issue, but checking the Native API page for correct, full examples against 4.15 wouldn't be a bad use of maybe an intern?
13:46 pdurbin Yeah, I'm not sure I want to take all that on right now.
13:47 donsizemore we have a smart info science student who's about to start for our archive group, i can try to harangue her into taking that on
13:47 pdurbin And I doubt Kevin wants to test it all. :)
13:47 donsizemore i'd have to offer thu-mai pastries or something
13:47 pdurbin That would be great!
13:47 donsizemore i'll ask. that much i _can_ promise =)
13:48 pdurbin :)
13:49 pdurbin Ready for the first crazy idea?
13:49 donsizemore yes yes
13:50 pdurbin Are you familiar with Swagger?
13:51 pdurbin It also has a newer name these days.
13:51 pdurbin The newer name is OpenAPI.
13:51 pdurbin Here's the issue about it: https://github.com/IQSS/dataverse/issues/5794
13:54 donsizemore ooh, and it's in payara5
13:55 pdurbin Maybe you see where I'm going with this. :)
14:03 donsizemore i like it
14:03 pdurbin I'm thinking a Jenkins job.
14:05 donsizemore we can easily build off that branch and test its API
14:06 pdurbin Well, it should work on any newish branch.
14:06 pdurbin We've merged enough fixes to get the core of Dataverse to deploy to the latest Payara version.
14:07 pdurbin Oh, you probably mean "test the /openapi" from Payara, right?
14:07 andrewSC joined #dataverse
14:14 donsizemore excellent. i did, but what do i know
14:16 pdurbin To be clear, what I really want for everybody is the nice HTML output from Swagger. Are you familiar with how this looks?
14:20 pdurbin I linked to an example in the GitHub issue above.
14:20 pdurbin A pretty old example. It might look different these days.
14:33 donsizemore i remember it from miniverse
14:36 pdurbin Cool. So the crazy idea is to deploy Dataverse to Payara 5. Then hit that /openapi. Then build the HTML for the Swagger docs. I don't have a strong preference for where the HTML is hosted right now. It could be hosted on an EC2 instance spun up with dataverse-ansible. Or it could be dumped somewhere on the Jenkins server. Whatever is easier.
14:44 pdurbin Does that make sense?
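[Editor's note] As a rough sketch of the "hit that /openapi" step, the snippet below fetches the OpenAPI document from a running server and lists the operations it describes, which is the raw material a Swagger HTML build would start from. It assumes the document is served at `/openapi` (as mentioned above) and that the server returns JSON when asked via the `Accept` header; both function names and the base URL are hypothetical. Nothing here reflects an actual Jenkins job.

```python
import json
from urllib.request import Request, urlopen

# Keys of an OpenAPI path item that are HTTP operations
# (other keys, such as "parameters", are skipped).
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def fetch_openapi(base_url):
    """Download the OpenAPI document as JSON (assumes /openapi honors Accept)."""
    req = Request(base_url.rstrip("/") + "/openapi",
                  headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)


def list_operations(doc):
    """Return 'METHOD /path' strings for every operation in the document."""
    ops = []
    for path, item in sorted(doc.get("paths", {}).items()):
        for method in sorted(item):
            if method in HTTP_METHODS:
                ops.append(f"{method.upper()} {path}")
    return ops


# Usage against a live server (hypothetical URL):
#   for op in list_operations(fetch_openapi("http://localhost:8080")):
#       print(op)
```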
15:15 xarthisius pdurbin: Is there a way to check whether a dataset I'm accessing through the API is public or private (i.e. I can only access it cause I provided my API key) ?
15:22 donsizemore @pdurbin sounds good to me
15:23 donsizemore @xarthisius have you checked the versions endpoint without your token? http://$SERVER/api/datasets/$id/versions
15:24 donsizemore although i suppose that's published/unpublished
15:34 xarthisius donsizemore: the trick is, if our user provides API key, we plan on always sending them in the headers. We'd like to avoid a situation when we expose something that we shouldn't
15:35 xarthisius so I need some sort of indication that user only sees these resources because they provided API key
15:36 xarthisius if published datasets cannot be private, that'd be enough. We could use 'latestVersion.versionState' == "RELEASED" as a check
15:56 pdurbin xarthisius: I think what you want is ":latest-published" at http://guides.dataverse.org/en/4.15/api/native-api.html#datasets
15:57 pdurbin This is timely for me because I'm writing a talk about getting started with Dataverse APIs. All should please feel free to add ideas for my talk on this thread: https://groups.google.com/d/msg/dataverse-community/V5WkMGDS4VI/maxXTdmzDwAJ
16:08 xarthisius pdurbin: just to make it perfectly clear for me: published dataset can only be public and accessible for anyone?
16:11 pdurbin xarthisius: the metadata for published datasets (description, etc) and their files (filenames, etc.) is always public. The *content* of published files can be restricted ("request access" button instead of a "download" button) but (again), metadata about published dataset versions and their files is always public (and included in a sitemap to be easily indexed by search engines).
16:14 xarthisius ok, thank you! that helps a lot!
16:16 pdurbin xarthisius: sure, please keep the questions coming. I'm happy to further clarify. The safest thing to do would be to make the API call without a token, I guess. But I think you'll be fine. :)
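[Editor's note] The two checks discussed above can be sketched as follows: looking at `latestVersion.versionState` in a dataset's JSON, and asking for `:latest-published` without sending any API token. This is a minimal sketch, not an official client. The function names are hypothetical, the JSON shape (`data.latestVersion.versionState`) is taken from xarthisius's message, and treating any HTTP error as "not publicly visible" is an assumption.

```python
import json
from urllib.error import HTTPError
from urllib.request import urlopen


def latest_published_url(server, dataset_id):
    """URL of the latest published version of a dataset (by database ID)."""
    return f"{server.rstrip('/')}/api/datasets/{dataset_id}/versions/:latest-published"


def is_released(dataset_json):
    """True if the dataset JSON reports latestVersion.versionState == RELEASED."""
    latest = dataset_json.get("data", {}).get("latestVersion", {})
    return latest.get("versionState") == "RELEASED"


def visible_without_token(server, dataset_id, fetch=urlopen):
    """Request :latest-published with no API token at all.

    Assumption: an unpublished dataset answers with an HTTP error status,
    which urlopen raises as HTTPError.
    """
    try:
        with fetch(latest_published_url(server, dataset_id)) as resp:
            return json.load(resp).get("status") == "OK"
    except HTTPError:
        return False
```

Note that neither check says anything about restricted files: as pdurbin explains, metadata of a published version is public, while file content may still require access.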
16:57 icarito[m] joined #dataverse
17:19 donsizemore @pdurbin i saw :latest-published but private could also mean restricted(?). but yes any call w/o a token
17:31 pdurbin donsizemore: well, we can get further clarification from xarthisius if necessary. It's certainly a good point that one could use an API token to download restricted files (sometimes we call these data files) that should absolutely not be exposed publicly.
17:32 pdurbin donsizemore: so are you thinking the swagger thing should go in dataverse-ansible or dataverse-jenkins?
17:40 pdurbin I'm asking because I'd be happy to create an issue in either repo. Or both if you want. :)
17:48 rigelk joined #dataverse
18:45 donsizemore both, and we'll close whichever one doesn't make sense?
19:10 pdurbin donsizemore: both will force me to think about which one I want more and when. Can I add either to your column on by board? :)
19:10 pdurbin my* board
19:19 donsizemore shore
19:21 pdurbin cool
19:21 pdurbin Do you want to hear the second new crazy idea? :)
19:22 donsizemore yes
19:27 pdurbin The second idea is to have https://jenkins.dataverse.org/job/IQSS-Dataverse-PullRequest/ build the guides.
19:33 pdurbin Does that make sense?
21:16 pdurbin We can talk about it next week or whenever. :)
21:17 pdurbin Have a good weekend, everyone!
21:17 pdurbin left #dataverse