
IRC log for #dataverse, 2019-06-05

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
01:06 jri joined #dataverse
04:06 jri joined #dataverse
07:25 jri joined #dataverse
07:32 juancorr joined #dataverse
08:04 rigelk m
08:08 rigelk sorry, I meant pdurbin : I am working on the base objects we could send over ActivityPub at https://framagit.org/synalp/olki/scifed -  I'm wondering what we should add to a basic Dataverse/Corpus object though. Something like a scientific version of https://schema.org/CreativeWork maybe?
09:42 stefankasberger joined #dataverse
09:55 pdurbin rigelk: hi! If you "view source" on a dataset in Dataverse you will see `{"@context":"http://schema.org","@type":"Dataset","@id"...` which is to say that we use https://schema.org/Dataset . Here's a dataset you can view source on and see that JSON-LD: https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/TJCLKP
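The "view source" step pdurbin describes can also be done programmatically. The following is a minimal sketch, assuming the page embeds its JSON-LD in a `<script type="application/ld+json">` block; the sample HTML, the DOI, and the dataset name are hypothetical stand-ins, not content fetched from Harvard Dataverse.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


def extract_jsonld(html):
    """Return every JSON-LD document embedded in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page: only the @context and @type values come from the chat.
SAMPLE_PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "http://schema.org", "@type": "Dataset",
 "@id": "https://doi.org/10.5072/FK2/EXAMPLE", "name": "Hypothetical dataset"}
</script>
</head><body></body></html>"""

datasets = [d for d in extract_jsonld(SAMPLE_PAGE) if d.get("@type") == "Dataset"]
print(datasets[0]["@id"])
```

To try this against a real dataset page, the HTML would first have to be downloaded (for example with `urllib.request`) and passed to `extract_jsonld`.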
09:56 rigelk Ah I see
09:56 rigelk Thanks!
09:57 pdurbin Sure. Are you familiar with Google Dataset Search? They asked us to use https://schema.org/Dataset
09:57 pdurbin this: https://toolbox.google.com/datasetsearch
09:57 rigelk I didn't know about it
09:58 pdurbin It's pretty new.
10:05 pdurbin Schema.org JSON-LD is certainly not the richest metadata format that Dataverse supports but it gets the job done for making datasets discoverable in Google Dataset Search.
10:07 rigelk pdurbin: objects on AP need to be subclasses of Object ( https://www.w3.org/TR/activitystreams-vocabulary/#dfn-object ), so we will have to have multi-type entities anyway.
10:09 rigelk ['as:Object', 'schema:Dataset'] sounds like a good start to me.
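A multi-type entity along the lines rigelk suggests might look like the sketch below. The two type names come from the chat; the function name, the identifier, the dataset name, and the choice to declare the `schema` prefix in the context are assumptions made for illustration and are not taken from the scifed draft.

```python
import json


def make_federated_dataset(identifier, name):
    """Build a JSON-LD object typed as both as:Object and schema:Dataset."""
    return {
        "@context": [
            # ActivityStreams 2.0 context, which defines the "as" prefix
            # and the "id"/"type" aliases.
            "https://www.w3.org/ns/activitystreams",
            # Assumption: schema.org terms are reached through a "schema" prefix.
            {"schema": "http://schema.org/"},
        ],
        "id": identifier,
        "type": ["as:Object", "schema:Dataset"],
        "name": name,
    }


# Hypothetical identifier and name.
obj = make_federated_dataset(
    "https://dataverse.example/dataset/1", "Hypothetical corpus"
)
print(json.dumps(obj, indent=2))
```

Keeping both types in one list lets an ActivityPub consumer treat the entity as a plain Object while a schema.org-aware consumer can still recognize it as a Dataset.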
10:24 pdurbin Great! Also, I'm trying to get in touch with Michael to let him know you're thinking about this but please feel free to comment again on that Dataverse issue.
10:34 pdurbin rigelk: ah, he just left this comment: https://github.com/IQSS/dataverse/issues/5883#issuecomment-499029417 :)
11:22 rigelk ah great :)
11:44 rigelk pdurbin: schema:Dataset is nice, but I'm not sure the common object type(s) we are defining are ever going to be used for SEO - they are just inter-instance dissemination formats after all. Do you think they could be used (as in the Dataverse page you linked) for SEO?
11:47 pdurbin rigelk: well, Google is a search engine. And schema:Dataset is what they asked us to use. So Dataverse has pretty good SEO from Google Dataset Search at least. :)
11:49 pdurbin I just found this and I'm happy to see that Harvard Dataverse is mentioned: https://searchengineland.com/google-dataset-search-a-new-search-service-to-find-data-from-sciences-government-some-news-organizations-304968 :)
11:50 pdurbin "Google announced Wednesday a new specialty search feature named Dataset Search that is powered partially by the dataset schema we covered a few months ago."
11:50 pdurbin Ah, Harvard Dataverse is mentioned in this longer article too: https://www.blog.google/products/search/making-it-easier-discover-datasets/
11:50 pdurbin "You’ll see data from NASA and NOAA, as well as from academic repositories such as Harvard's Dataverse and Inter-university Consortium for Political and Social Research (ICPSR)."
11:51 pdurbin rigelk: am I misunderstanding you? This is all SEO, right? Based on schema:Dataset.
11:55 rigelk pdurbin: my point is a bit different. We are trying to settle on an inter-instance exchange vocabulary (something not related to SEO), and I'm wondering whether it should be strictly identical to, or just influenced by, what SEO requires within pages (within <script type="application/ld+json"> blocks), as there is a potential case for re-use there.
11:59 pdurbin If you think we shouldn't limit ourselves to schema:Dataset, that's fine. Dataverse is a little more focused on just datasets than other repositories like Zenodo that have all sorts of "types" or whatever. Dataverse might expand to include some sort of code type in the future but we don't have any plans to host preprints or posters or whatever else as a first class type.
12:08 pdurbin I mean, if someone installed Dataverse and loaded it up with only posters and preprints we'd probably say they're sort of using Dataverse wrong. We like to see tabular files, data coming off scientific instruments, etc.
12:09 rigelk I see.
12:09 pdurbin I mean, it's a free country. You can upload any file you like to Dataverse. But Dataverse is designed for data primarily.
12:11 donsizemore joined #dataverse
12:13 Venki18 joined #dataverse
12:13 * dzho perks up seeing talk about federation
12:14 rigelk Then I guess we have to consider Dataverse-like data (schema:Dataset) a subtype of something more generic.
12:14 poikilotherm joined #dataverse
12:14 pdurbin dzho: out of the hundreds of IRC channels you're in... you found some interesting stuff here. Great! :)
12:14 dzho I'm mostly a recurrent tourist here in #dataverse rather than being more fully immersed
12:14 dzho haha
12:15 pdurbin Venki18: hi!
12:15 dzho so, does Zenodo do anything across instances?
12:15 pdurbin rigelk: sure, maybe creative work. I don't know.
12:15 Venki18 Hi pdurbin
12:16 pdurbin dzho: I don't know. Maybe Zenodo supports OAI-PMH? I haven't looked. Also known as "harvesting" (of metadata).
12:16 poikilotherm Mornin' guys
12:16 Venki18 May I know when different terms of use will be implemented? Any plans so far?
12:17 pdurbin poikilotherm: hi! I made the pull request you wanted at https://github.com/IQSS/dataverse/pull/5913 . Want to try it out? :)
12:17 dzho pdurbin: fair. I'm just wondering about the day when someone asks whether Zenodo instances and Dataverse instances federate with each other :-)
12:17 dzho not that I'm asking now of course
12:18 donsizemore @pdurbin phooey. none of UNC's datasets show up in the goog's dataset search
12:18 pdurbin Venki18: can you please read the long chat I had with thanh-thanh at http://irclog.iq.harvard.edu/dataverse/2019-05-17 ?
12:19 rigelk pdurbin: also, maybe not scratch our heads too hard :) There is no need to have just one type. Like Dataset, Review, and… yes why not CreativeWork. We just need a way to say when these are scientific or not.
12:19 donsizemore I thought I fixed our robots.txt =( also, since when does Google respect robots.txt directives?
12:19 pdurbin donsizemore: no sitemap at https://dataverse.unc.edu/sitemap.xml . You should set one up: http://guides.dataverse.org/en/4.14/installation/config.html#creating-a-sitemap-and-submitting-it-to-search-engines :)
12:20 pdurbin rigelk: sounds good
12:20 donsizemore @pdurbin i remember now. we must upgrade =(
12:21 pdurbin dzho: well, rigelk and I are dreaming about some federation using ActivityPub if you've heard of it. Please see https://github.com/IQSS/dataverse/issues/5883 (comments welcome!)
12:22 pdurbin donsizemore: I'm pretty sure someone is keeping dataverse-ansible up to date with the latest releases. :) And dataverse-kubernetes actually. Shout out to poikilotherm
12:22 * rigelk likes dreaming
12:22 donsizemore @pdurbin our next upgrade entails a move into a new vmware cluster on new storage. and we have to get our crappy new backup appliance working before then. won't happen until after the community meeting
12:23 pdurbin Even crappy backups are better than no backups. :)
12:24 donsizemore i say crappy because they continue to fail and their "engineers" can't figure out why their journaling process won't write the journal
12:24 dzho pdurbin: I have heard of it. Have you met cwebber or seen him speak at libreplanet?
12:25 pdurbin dzho: hi! We chatted at the most recent LibrePlanet and I'm a rabid fan of https://librelounge.org where he and Serge talk about ActivityPub all the time.
12:25 dzho excellent
12:25 Venki18 joined #dataverse
12:25 pdurbin donsizemore: ask to see the engineer's license.
12:26 dzho ok, answering my own question: https://developers.zenodo.org/#oai-pmh
12:26 dzho TIL OAI-PMH is a thing
12:26 Venki18 Thanks pdurbin will take a look at it.
12:26 pdurbin dzho: nice! So Zenodo and Dataverse installations can harvest from each other. Is there more than one installation of Zenodo?
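Harvesting over OAI-PMH comes down to HTTP GET requests with a `verb` parameter. The sketch below only builds a `ListRecords` request URL and makes no network call. The base URL is a hypothetical placeholder, and the `/oai` path is an assumption about where an installation exposes its endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a harvesting endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional: restrict harvesting to one named set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint; substitute the OAI-PMH base URL of a real repository.
url = list_records_url("https://dataverse.example/oai")
print(url)
```

`oai_dc` (unqualified Dublin Core) is the one metadata format every OAI-PMH repository is required to support, which makes it a reasonable default for a first harvest.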
12:26 donsizemore @pdurbin unless the train can take me to cancun i'm not interested
12:26 pdurbin Venki18: thanks! I'm happy to answer any questions!
12:27 donsizemore @pdurbin i do have an upgraded dataverse instance containing a recent-ish copy of our production data. suppose i used that to generate sitemap.xml then drop it into place in production?
12:27 dzho pdurbin: don't know, that's another TIL, that Zenodo is a thing. This is one of my touchstone phrases, after all: So much free software, so little time.
12:27 pdurbin an embarrassment of riches
12:28 dzho quite
12:28 poikilotherm pdurbin: did I miss a release?
12:29 * poikilotherm goes looking at #5913
12:29 dzho also, things that are not the same but of interest to academics: Zenodo & Zotero. Dataverse has got that nice portmanteau action going.
12:30 dzho anyway, I should let you all get back to it. Will be trying to keep repository federation questions in mind while I'm scrolling through my fediverse timelines.
12:31 pdurbin Dataverse supports Zotero too for what it's worth. :)
12:32 dzho but does Zotero support either JSON-LD or OAI-PMH?
12:32 dzho you see how this all spirals out so quickly :\
12:35 pdurbin dzho: I have no idea, sorry
12:36 poikilotherm pdurbin: will try later. Having a break at WissKom and writing up sth. for chat.dataverse.org
12:39 rigelk pdurbin: 'tag' property (already existing on 'as:Object') with a 'science' value (possibly using its WikiData entity for universal meaning)? I lack a better way of marking an object as scientific work.
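One way to read this proposal is sketched below: a `tag` entry whose `id` points at a Wikidata entity. The helper names are hypothetical, the shape of the tag entry is an assumption, and the entity URI used for "science" (Q336) should be verified against Wikidata before relying on it.

```python
# Assumption: Q336 is the Wikidata entity for "science"; verify before use.
SCIENCE_TAG = {"id": "http://www.wikidata.org/entity/Q336", "name": "science"}


def tag_as_scientific(obj):
    """Return a copy of obj whose 'tag' list includes the science tag.

    Assumes any existing 'tag' value is a list of dicts.
    """
    tagged = dict(obj)
    tags = list(tagged.get("tag", []))
    if SCIENCE_TAG not in tags:
        tags.append(SCIENCE_TAG)
    tagged["tag"] = tags
    return tagged


def is_scientific(obj):
    """Check whether obj carries the science tag."""
    return any(t.get("id") == SCIENCE_TAG["id"] for t in obj.get("tag", []))


# Hypothetical object used only to exercise the helpers.
review = {"type": "as:Object", "name": "Hypothetical review"}
tagged_review = tag_as_scientific(review)
print(is_scientific(review), is_scientific(tagged_review))
```

Matching on the entity URI rather than on the label keeps the check independent of language, which seems to be the "universal meaning" rigelk has in mind.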
12:50 pdurbin rigelk: hmm. Would you like to start a thread about this at https://ask.cyberinfrastructure.org ?
12:51 pdurbin poikilotherm: oh good. I'm looking forward to your chat requirements. :)
12:51 pdurbin donsizemore: I'm gonna run to the gym on my way in to work but I'm hoping to pick up on Jenkins adventures. Thanks for the pull request about creating jobs from the command line!
12:52 poikilotherm pdurbin: currently, it is more of a list of what I don't like about IRC :-D
12:52 poikilotherm that kind of is a list of requirements :-D
13:00 pdurbin Maybe if you're a child of the 80s like I am you don't mind IRC so much. :)
13:00 pdurbin Ok, really hopping on my bike now. Bye!
13:05 poikilotherm Ok, that should be it for now. No more things left on my brainstack. https://github.com/IQSS/chat.dataverse.org/issues/7
13:07 pkiraly joined #dataverse
13:10 stefankasberger @pdurbin: is there an example, which metadata can be added to a file added to a dataset via an api request?
13:17 poikilotherm joined #dataverse
13:28 pdurbin_m joined #dataverse
13:29 pdurbin_m stefankasberger: yes, please see the "more detailed example" at http://guides.dataverse.org/en/4.14/api/native-api.html#add-a-file-to-a-dataset
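For orientation, the request in that part of the guide is a multipart POST carrying the file plus a `jsonData` form field. The sketch below only assembles the URL, header, and `jsonData` string without sending anything. The field names (`description`, `categories`, `restrict`) are recalled from the guide and should be checked against the linked version; the server URL, DOI, and API token are hypothetical placeholders.

```python
import json
from urllib.parse import urlencode


def build_add_file_request(server, persistent_id, api_token,
                           description, categories, restrict=False):
    """Assemble URL, headers, and form fields for adding a file to a dataset."""
    query = urlencode({"persistentId": persistent_id})
    url = f"{server}/api/datasets/:persistentId/add?{query}"
    headers = {"X-Dataverse-key": api_token}
    # Assumption: the guide passes "restrict" as a string, not a JSON boolean.
    json_data = json.dumps({
        "description": description,
        "categories": categories,
        "restrict": "true" if restrict else "false",
    })
    return url, headers, {"jsonData": json_data}


# Hypothetical values throughout.
url, headers, form = build_add_file_request(
    "https://dataverse.example",
    "doi:10.5072/FK2/EXAMPLE",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    description="My description.",
    categories=["Data"],
)
print(url)
print(form["jsonData"])
```

An HTTP client would then send these together with the file itself as a multipart form upload, in the same way the guide's curl example uses one `-F` option per form field.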
13:29 pdurbin_m poikilotherm: great issue! Thanks!
13:29 pdurbin_m donsizemore: now I'm at the gym and I've put a picture of Glassfish 4.1 on a punching bag.
13:30 donsizemore @pdurbin_m you box? remind me never to ask pesky questions again
15:05 rigelk pdurbin: I discussed with some colleagues and updated https://synalp.frama.io/olki/scifed/#dfn-audiences to detail their proposal.
15:07 rigelk sadly the `audience` property of `schema:CreativeWork` only allows a single `schema:Audience` item as value, hence the custom property.
15:13 pdurbin rigelk: I'm a little out of my depth, to be honest, but let me try to pull in our metadata expert. What you and your colleagues have done looks interesting. I would suggest posting about it in the ActivityPub issue if you haven't already.
15:14 dan-drexel joined #dataverse
15:14 dan-drexel @pdurbin i'm giving the install one more try
15:15 dan-drexel quick question - where is scripts/r/rserve/rserve-setup.sh described here http://guides.dataverse.org/en/latest/installation/prerequisites.html#id24 ?
15:17 rigelk pdurbin: thanks! I'll post it there.
15:35 dan-drexel @pdurbin - I got dataverse to run finally!
15:35 dan-drexel It works inside the VM but for some reason it won't pass through to the host machine like it should. But good enough for now!
15:40 pdurbin dan-drexel: great! Way to stick with it.
15:40 dan-drexel Thanks!
15:41 pdurbin dan-drexel: also, we just discussed your pull request at standup. I thought I heard "merge that mofo". :)
15:41 dan-drexel Hahaha
15:48 pdurbin dan-drexel: Leonid left a comment (sort of a question) about half an hour ago in the pull request: https://github.com/IQSS/dataverse/pull/5906
15:49 pdurbin And I just left a comment on the issue: https://github.com/IQSS/dataverse/issues/5905#issuecomment-499141065
15:55 pdurbin pkiraly: welcome back. I hope my reply about shib and affiliation made sense: https://groups.google.com/d/msg/dataverse-community/7FwrzfIQZfY/4p5A3VFIBgAJ
16:07 pdurbin Ah, he just replied. Good. :)
16:20 pdurbin dan-drexel: is that dyaw-Drexel I see at https://github.com/IQSS/dataverse/graphs/contributors ? Thanks again!
17:28 jri joined #dataverse
17:37 jri joined #dataverse
18:32 donsizemore joined #dataverse
19:58 pdurbin donsizemore: ERROR: anonymous is missing the Job/Create permission
20:00 * pdurbin runs java -jar /opt/jenkins-cli.jar --help
20:10 pdurbin donsizemore: I think I need some plugins
20:29 pdurbin I just opened https://github.com/IQSS/dataverse-jenkins/pull/6
