
IRC log for #dataverse, 2020-08-07

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
10:59 donsizemore joined #dataverse
11:00 donsizemore @pameyer bringing compute to data was a big push in Discovery Environment - in their architecture they accomplished this over iRODS with Condor ClassAds
14:35 pdurbin I just posted some thoughts on that thread: https://groups.google.com/g/dataverse-big-data/c/FC7uF5uL7RY/m/SXS7dawDAwAJ
14:36 pdurbin And I highly recommend watching the Dataverse/Globus integration (via their app, Synapse) video: https://youtu.be/VYq8Fr_3dhU
15:22 pameyer joined #dataverse
15:24 pameyer @donsizemore - sounds interesting; the more compute the better :)
15:25 pameyer @pdurbin - you definitely do a better job of community wrangling than me ;)
15:26 pdurbin pameyer: ha! I'm quite grateful you posted yesterday. I tried to post but it didn't go through.
15:26 pameyer I hope I didn't sound too discouraging in that thread
15:26 pdurbin not at all
15:28 pdurbin did you watch the video yet?
15:28 pameyer nope, not yet
15:29 pameyer I have a weird aversion to videos when I'm trying to keep focused on stuff
15:32 pdurbin well, I'm glad you tolerate IRC
15:33 pameyer my irc client doesn't require headphones/audio ;)
15:33 pdurbin In the video I don't believe he demos file hierarchy.
15:33 pdurbin so now I'm wondering, since you brought it up
15:34 pameyer storage ids and in-place compute don't seem to play nicely together
15:37 pdurbin How does in-place compute work with S3?
15:38 pameyer not sure - I don't know enough about how that dataverse storage driver works
15:38 pameyer I'd guess about the same, but that's only a guess
15:38 pdurbin Oh, sorry, I mean outside the context of Dataverse. I'm sure compute happens on AWS.
15:39 pameyer ah, gotcha
15:40 pdurbin maybe S3 gets mounted as a POSIX filesystem or something
15:40 pameyer my (limited) understanding is that POSIX mounts of S3 hurt performance a reasonable amount
15:41 pdurbin sure
15:41 pdurbin so probably people look for tools that support S3, spark or whatever
15:41 pameyer if your pipelines can talk s3 directly, I think it uses bucket addresses/labels/etc
15:42 pdurbin probably
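The "bucket addresses" idea above can be illustrated with a minimal sketch. It assumes the common `s3://bucket/key` URI convention; the function name `split_s3_uri` and the example bucket and key are hypothetical, and nothing here reflects how the Dataverse S3 storage driver works.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key).

    Hypothetical helper for illustration only; it does no network I/O.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the "host" part; the object key is the path without
    # its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Placeholder bucket and key, not a real dataset.
    bucket, key = split_s3_uri("s3://example-bucket/datasets/run-01/table.csv")
    print(bucket, key)
```

A pipeline that speaks S3 natively would hand the bucket and key to its S3 client instead of opening a path on a POSIX mount.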
19:14 pdurbin I'm scared. On my new laptop, echo $SHELL shows /bin/zsh
19:15 donsizemore trade you
19:17 pdurbin heh, suddenly I feel less scared
19:20 pameyer pdurbin: csh is probably still there...
19:21 pdurbin probably, I was using bash before, on my old mac laptop. this new one is also a mac
19:22 pameyer yeah - they switched the default to zsh with 10.15
19:22 pameyer either because they don't like gpu, or they like annoying me
19:22 pameyer gpl, I mean
19:23 pdurbin gotcha, I thought I heard this is coming. I'm running 10.14 on the old one
19:25 donsizemore https://www.howtogeek.com/wp-content/uploads/2019/10/ximg_5da79219c40ee.jpg.pagespeed.gp+jp+jw+pj+ws+js+rj+rp+rw+ri+cp+md.ic.4V1XjxAqkP.jpg
19:25 donsizemore may I suggest /bin/false
19:28 pdurbin :)
19:30 pdurbin donsizemore: ah, thanks for showing me that advanced options exist
19:31 pameyer they dropped 32-bit executables too
19:32 mateolan joined #dataverse
19:33 mateolan Hi dataverse folks--we're looking for a SPARQL wrapper for dataverses--or is there some way we can leverage domain specific ontologies for datasets within Dataverse?
19:33 mateolan does such a thing exist--or if not, is it on a product roadmap?
19:35 pameyer I'm not authoritative, but as far as I know there isn't a SPARQL interface (or rdf exporter, for that matter)
19:38 pameyer I've had reasonable luck with the dataverse search APIs for going into metadata - but that might be different than what you're thinking for leveraging ontologies
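As a rough sketch of querying metadata through the Dataverse Search API mentioned above: the helper below only builds a request URL. The host `dataverse.example.edu` and the function name `search_url` are hypothetical, and the `q`, `type`, and `per_page` parameters should be checked against the API guide for your installation's version.

```python
from urllib.parse import urlencode


def search_url(base_url: str, query: str, item_type: str = "dataset",
               per_page: int = 10) -> str:
    """Build a Search API URL (hypothetical helper; sends no request)."""
    params = urlencode({"q": query, "type": item_type, "per_page": per_page})
    return f"{base_url.rstrip('/')}/api/search?{params}"


if __name__ == "__main__":
    # Placeholder host; substitute a real Dataverse installation.
    print(search_url("https://dataverse.example.edu", "maize yield"))
```

Fetching that URL with any HTTP client should return JSON search results, which could then be filtered on metadata fields.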
19:45 mateolan right--ultimately, I'd like to be able to reason over metadata, match column headers to ontology terms, and extract specific rows...without having to return a whole dataset...envisioning writing an interface to be able to do ontology annotation suggestions with some human curation...
19:45 mateolan seems like that would open up the dataverse in a big way
19:45 pdurbin mateolan: there is some subsetting functionality via API if that helps.
19:46 mateolan ooh lala @pdurbin, that sounds like a good place to start
19:46 pdurbin You can look for "subset" at http://guides.dataverse.org/en/4.20/api/dataaccess.html . Some external tools make use of it, I believe.
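A minimal sketch of what a subsetting request might look like, assuming the column-wise `variables` parameter on `/api/access/datafile/{id}` as described in the Data Access guide linked above. The host, file id, and variable ids are placeholders, and `subset_url` is a hypothetical helper; verify the parameter against the guide before relying on it.

```python
from typing import Iterable


def subset_url(base_url: str, file_id: int, variable_ids: Iterable[int]) -> str:
    """Build a column-wise subsetting URL for a tabular data file.

    Assumption: the Data Access API accepts a comma-separated list of
    variable ids in the `variables` query parameter.
    """
    ids = ",".join(str(v) for v in variable_ids)
    return f"{base_url.rstrip('/')}/api/access/datafile/{file_id}?variables={ids}"


if __name__ == "__main__":
    # Placeholder host, file id, and variable ids.
    print(subset_url("https://dataverse.example.edu", 6, [123, 127]))
```

This requests only the listed columns instead of downloading the whole file; it does not cover the row-level extraction mateolan describes.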
19:48 pdurbin I'd suggest starting a thread on the mailing list. Tell your story, etc. :)
19:48 pdurbin You're very welcome to chat here too, of course, but there are only a dozen people here. There are many more on the mailing list.
19:50 mateolan joined #dataverse
19:52 mateolan I'm only finding installation and security email addresses...and a basic twitter handle--guessing there is a specific developer email list?
19:53 pdurbin Well, there is a dev list but your story, your use cases, would be better told on the main list: https://groups.google.com/g/dataverse-community
19:54 mateolan ah, got it, thanks
19:55 pdurbin Sure. Here's the most recent thread on SPARQL, by the way: https://groups.google.com/g/dataverse-community/c/X004wd9ZBKM/m/PYdzYcGJAAAJ
19:56 pdurbin And it looks like SPARQL came up in a community call in May: https://docs.google.com/document/d/1Ve064T5ZnNMxOKLlP9Vu0I1o_pFtcN0vXXgbLXWekv8/edit
19:57 pdurbin But when I hear "domain specific ontologies for datasets within Dataverse" I think of custom metadata blocks: http://guides.dataverse.org/en/4.20/admin/metadatacustomization.html
19:58 dataverse-user joined #dataverse
20:01 pameyer me too - but I tend to forget most things csv or ddi related
20:04 pdurbin pameyer: is your domain specific ontology structural biology? Is that how you think of it?
20:17 pdurbin mateolan: very nice post! https://groups.google.com/g/dataverse-community/c/9T-2YO3czBI/m/rkLo0R21AwAJ
20:18 mateolan thanks pdurbin
20:19 pdurbin Often I try to reply right away but since we talked here, I'll let it sit and hope someone takes the bait. :)
20:20 pdurbin When I think about Dataverse installations that deal a lot with food, this one comes to mind: https://data.cimmyt.org
20:21 mateolan thanks for the pointer
20:22 mateolan looks cool, is that a white-labeled dataverse?
20:34 pameyer pdurbin: when I think domain specific ontologies I tend to immediately veer off into blue sky about reproducibility and probabilistic logic
20:35 pdurbin good
21:00 pdurbin Ok, folks. I'm out. Going on vacation for two weeks. See you on the other side. Have a great weekend!
21:00 pdurbin left #dataverse
21:04 pameyer have a good vacation!
