Time
S
Nick
Message
11:57
donsizemore joined #dataverse
12:14
donsizemore
@pdurbin on your dataverse talk: as Dmitri says, fleshed-out examples of which identifiers for which endpoint would be very useful for newbies and oldies alike
12:35
pdurbin_m joined #dataverse
12:36
pdurbin_m
donsizemore: instead of database IDs for datasets? Show DOI examples throughout?
12:53
pdurbin_m
What do you mean by "identifiers"?
13:22
pdurbin
donsizemore: I have at least two new crazy ideas about API docs if you'd like to hear them. :)
13:31
donsizemore
@pdurbin hey hey, back from coffee and three minor crises
13:32
pdurbin
so four crises, really... no coffee
13:33
donsizemore
@pdurbin when cheryl and i fiddle with file upload api for instance, native vs sword vs zipupload may want dbid or persistentId, sometimes at the end, and yes consistent usage and documentation would be killer
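[Note: the dbid-versus-persistentId difference described here can be sketched as below. This is a minimal illustration, assuming the Native API convention of `/api/datasets/$id/...` for database IDs and `/api/datasets/:persistentId/...?persistentId=$PID` for DOIs. The helper name `dataset_endpoint`, the server URL, and the DOI are hypothetical and not from the log.]

```python
from urllib.parse import urlencode


def dataset_endpoint(server, action, dbid=None, persistent_id=None):
    """Build a dataset URL from either a database ID or a persistent ID.

    Hypothetical helper: assumes the Native API accepts the database ID in
    the path, or the literal ':persistentId' in the path plus a
    'persistentId' query parameter.
    """
    # Require exactly one kind of identifier so the caller's intent is clear.
    if (dbid is None) == (persistent_id is None):
        raise ValueError("pass exactly one of dbid or persistent_id")

    base = server.rstrip("/") + "/api/datasets"
    if dbid is not None:
        return f"{base}/{dbid}/{action}"

    # The DOI goes at the end of the URL, percent-encoded as a query parameter.
    query = urlencode({"persistentId": persistent_id})
    return f"{base}/:persistentId/{action}?{query}"


# Example values only; neither the host nor the DOI is real.
print(dataset_endpoint("https://demo.example.org", "add", dbid=42))
print(dataset_endpoint("https://demo.example.org", "add",
                       persistent_id="doi:10.5072/FK2/ABC123"))
```

[Keeping both forms behind one function makes it visible that only the identifier's position changes, which is the inconsistency newcomers tend to trip over.]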
13:33
pdurbin
Right. Are you still in favor of full curl examples throughout?
13:34
donsizemore
@pdurbin yes and i think i volunteered to do that, didn't i. but even then the usage is different for different endpoints
13:34
donsizemore
what are your not-so-crazy API doc ideas?
13:34
pdurbin
Huh. How is the usage different?
13:36
donsizemore
in http://guides.dataverse.org/en/latest/api/native-api.html#add-a-file-to-a-dataset for instance the two examples conflict
13:39
pdurbin
woof, they sure do
13:40
pdurbin
This is a case where someone wrote the short example and instead of deleting it I added a full curl example below it.
13:40
pdurbin
Should I just delete the short one? I don't think it adds much value, to be honest.
13:40
donsizemore
but that's just me being whiny. now that i ran to the restroom and thought about it,
13:41
donsizemore
the real issue for us with the API is that our archivists are on Windows. so you open up bash for Win10 and they're all "whut"
13:41
donsizemore
not a real issue, but depending on the audience for your talk, you might try to suss out their starting points early on?
13:41
pdurbin
Hmm. Does pyDataverse work on Windows?
13:42
donsizemore
i'm thinking i'd like to test that
13:42
pdurbin
I know it shells out to curl for file upload right now so I'll bet a beer that doesn't "just work" for most people on Windows. :)
13:42
pdurbin
I think I promised Stefan a pull request. :)
13:43
donsizemore
the syntax was a little different but it was last week so i don't remember exactly how. i can pester cheryl for her cheat sheet we started
13:43
pdurbin
I bet I can get it working from requests.
13:43
donsizemore
because we did get the file upload API working from her windows box, just ran into network timeouts
13:43
pdurbin
That's awesome that you got it to work.
13:43
pdurbin
Don't forget about Jim's DVUploader.
13:43
pdurbin
He just put out a new release yesterday.
13:43
donsizemore
that's a java jar, right?
13:43
pdurbin
yeah
13:44
donsizemore
equal discomfort for an archivist used to a windows desktop
13:44
pdurbin
Sure.
13:44
pdurbin
What do they want? An exe with a GUI ?
13:44
donsizemore
so, i'm just talking off the top of my head about your talk, but what's actionable from that?
13:45
donsizemore
a java GUI might be a pain in the butt but would be quite popular
13:45
donsizemore
you've already got the web interface for most of it
13:45
pdurbin
I'm going to act on a lot of this conversation. You're good. I don't need anything from you. Except that I do still have those two crazy ideas in my head and they want to escape.
13:46
donsizemore
um, i forget if we had a "full curl examples" issue, but checking the Native API page for correct, full examples against 4.15 wouldn't be a bad use of maybe an intern?
13:46
pdurbin
Yeah, I'm not sure I want to take all that on right now.
13:47
donsizemore
we have a smart info science student who's about to start for our archive group, i can try to harangue her into taking that on
13:47
pdurbin
And I doubt Kevin wants to test it all. :)
13:47
donsizemore
i'd have to offer thu-mai pastries or something
13:47
pdurbin
That would be great!
13:47
donsizemore
i'll ask. that much i _can_ promise =)
13:48
pdurbin
:)
13:49
pdurbin
Ready for the first crazy idea?
13:49
donsizemore
yes yes
13:50
pdurbin
Are you familiar with Swagger?
13:51
pdurbin
It also has a newer name these days.
13:51
pdurbin
The newer name is OpenAPI.
13:51
pdurbin
Here's the issue about it: https://github.com/IQSS/dataverse/issues/5794
13:54
donsizemore
ooh, and it's in payara5
13:55
pdurbin
Maybe you see where I'm going with this. :)
14:03
donsizemore
i like it
14:03
pdurbin
I'm thinking a Jenkins job.
14:05
donsizemore
we can easily build off that branch and test its API
14:06
pdurbin
Well, it should work on any newish branch.
14:06
pdurbin
We've merged enough fixes to get the core of Dataverse to deploy to the latest Payara version.
14:07
pdurbin
Oh, you probably mean "test the /openapi" from Payara, right?
14:07
andrewSC joined #dataverse
14:14
donsizemore
excellent. i did, but what do i know
14:16
pdurbin
To be clear, what I really want for everybody is the nice HTML output from Swagger. Are you familiar with how this looks?
14:20
pdurbin
I linked to an example in the GitHub issue above.
14:20
pdurbin
A pretty old example. It might look different these days.
14:33
donsizemore
i remember it from miniverse
14:36
pdurbin
Cool. So the crazy idea is to deploy Dataverse to Payara 5. Then hit that /openapi. Then build the HTML for the Swagger docs. I don't have a strong preference for where the HTML is hosted right now. It could be hosted on an EC2 instance spun up with dataverse-ansible. Or it could be dumped somewhere on the Jenkins server. Whatever is easier.
14:44
pdurbin
Does that make sense?
15:15
xarthisius
pdurbin: Is there a way to check whether a dataset I'm accessing through the API is public or private (i.e. I can only access it cause I provided my API key) ?
15:22
donsizemore
@pdurbin sounds good to me
15:23
donsizemore
@xarthisius have you checked the versions endpoint without your token? http://$SERVER/api/datasets/$id/versions
15:24
donsizemore
although i suppose that's published/unpublished
15:34
xarthisius
donsizemore: the trick is, if our user provides an API key, we plan on always sending it in the headers. We'd like to avoid a situation where we expose something that we shouldn't
15:35
xarthisius
so I need some sort of indication that the user only sees this resource because they provided an API key
15:36
xarthisius
if published datasets cannot be private, that'd be enough. We could use 'latestVersion.versionState' == "RELEASED" as a check
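[Note: the check proposed here can be sketched as below. This is a minimal illustration, assuming the dataset JSON has the shape `{"data": {"latestVersion": {"versionState": ...}}}`. The function name and the sample payloads are hypothetical.]

```python
def latest_version_is_released(dataset_json):
    """Return True if the latest version of the dataset is marked RELEASED.

    Assumes the response body nests the state under
    data -> latestVersion -> versionState; missing keys count as not released.
    """
    data = dataset_json.get("data", {})
    latest = data.get("latestVersion", {})
    return latest.get("versionState") == "RELEASED"


# Hand-written sample payloads, not real API output.
published = {"status": "OK",
             "data": {"latestVersion": {"versionState": "RELEASED"}}}
draft = {"status": "OK",
         "data": {"latestVersion": {"versionState": "DRAFT"}}}

print(latest_version_is_released(published))
print(latest_version_is_released(draft))
```

[Treating a missing key as "not released" errs on the side of not exposing anything.]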
15:56
pdurbin
xarthisius: I think what you want is ":latest-published" at http://guides.dataverse.org/en/4.15/api/native-api.html#datasets
15:57
pdurbin
This is timely for me because I'm writing a talk about getting started with Dataverse APIs. All should please feel free to add ideas for my talk on this thread: https://groups.google.com/d/msg/dataverse-community/V5WkMGDS4VI/maxXTdmzDwAJ
16:08
xarthisius
pdurbin: just to make it perfectly clear for me: a published dataset can only be public and accessible to anyone?
16:11
pdurbin
xarthisius: the metadata for published datasets (description, etc) and their files (filenames, etc.) is always public. The *content* of published files can be restricted ("request access" button instead of a "download" button) but (again), metadata about published dataset versions and their files is always public (and included in a sitemap to be easily indexed by search engines).
16:14
xarthisius
ok, thank you! that helps a lot!
16:16
pdurbin
xarthisius: sure, please keep the questions coming. I'm happy to further clarify. The safest thing to do would be to make the API call without a token, I guess. But I think you'll be fine. :)
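[Note: the "call without a token" approach can be sketched as below. This is a minimal illustration, assuming that an anonymous request for `:latest-published` returns HTTP 200 only when a published version exists, and that any other status means the dataset is not publicly visible. The function name is hypothetical, and `get` is any callable shaped like `requests.get`.]

```python
def visible_without_token(get, server, dataset_id):
    """Check whether a dataset's latest published version is visible anonymously.

    `get` is injected (for example requests.get) so the check can be tried
    without a network. The request deliberately sends no X-Dataverse-key
    header, so a 200 cannot be explained by the caller's API token.
    """
    url = (f"{server.rstrip('/')}/api/datasets/"
           f"{dataset_id}/versions/:latest-published")
    response = get(url, headers={})  # empty headers: no API token
    return response.status_code == 200


# Usage with the requests library (assumed installed; the host is made up):
#   import requests
#   visible_without_token(requests.get, "https://demo.example.org", 42)
```

[If this returns False while the same call with a token succeeds, the caller is seeing the dataset only because of the token.]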
16:57
icarito[m] joined #dataverse
17:19
donsizemore
@pdurbin i saw :latest-published but private could also mean restricted(?). but yes any call w/o a token
17:31
pdurbin
donsizemore: well, we can get further clarification from xarthisius if necessary. It's certainly a good point that one could use an API token to download restricted files (sometimes we call these data files) that should absolutely not be exposed publicly.
17:32
pdurbin
donsizemore: so are you thinking the swagger thing should go in dataverse-ansible or dataverse-jenkins?
17:40
pdurbin
I'm asking because I'd be happy to create an issue in either repo. Or both if you want. :)
17:48
rigelk joined #dataverse
18:45
donsizemore
both, and we'll close whichever one doesn't make sense?
19:10
pdurbin
donsizemore: both will force me to think about which one I want more and when. Can I add either to your column on by board? :)
19:10
pdurbin
my* board
19:19
donsizemore
shore
19:21
pdurbin
cool
19:21
pdurbin
Do you want to hear the second new crazy idea? :)
19:22
donsizemore
yes
19:27
pdurbin
The second idea is to have https://jenkins.dataverse.org/job/IQSS-Dataverse-PullRequest/ build the guides.
19:33
pdurbin
Does that make sense?
21:16
pdurbin
We can talk about it next week or whenever. :)
21:17
pdurbin
Have a good weekend, everyone!
21:17
pdurbin left #dataverse