Time
S
Nick
Message
10:59
donsizemore joined #dataverse
11:00
donsizemore
@pameyer bringing compute to data was a big push in Discovery Environment - in their architecture they accomplished this over iRODS with Condor ClassAds
14:35
pdurbin
I just posted some thoughts on that thread: https://groups.google.com/g/dataverse-big-data/c/FC7uF5uL7RY/m/SXS7dawDAwAJ
14:36
pdurbin
And I highly recommend watching the Dataverse/Globus integration (via their app, Synapse) video: https://youtu.be/VYq8Fr_3dhU
15:22
pameyer joined #dataverse
15:24
pameyer
@donsizemore - sounds interesting; the more compute the better :)
15:25
pameyer
@pdurbin - you definitely do a better job of community wrangling than me ;)
15:26
pdurbin
pameyer: ha! I'm quite grateful you posted yesterday. I tried to post but it didn't go through.
15:26
pameyer
I hope I didn't sound too discouraging in that thread
15:26
pdurbin
not at all
15:28
pdurbin
did you watch the video yet?
15:28
pameyer
nope, not yet
15:29
pameyer
I have a weird aversion to videos when I'm trying to keep focused on stuff
15:32
pdurbin
well, I'm glad you tolerate IRC
15:33
pameyer
my irc client doesn't require headphones/audio ;)
15:33
pdurbin
In the video I don't believe he demos file hierarchy.
15:33
pdurbin
so now I'm wondering, since you brought it up
15:34
pameyer
storage ids and in-place compute don't seem to play nicely together
15:37
pdurbin
How does in-place compute work with S3?
15:38
pameyer
not sure - I don't know enough about how that dataverse storage driver works
15:38
pameyer
I'd guess about the same, but that's only a guess
15:38
pdurbin
Oh, sorry, I mean outside the context of Dataverse. I'm sure compute happens on AWS.
15:39
pameyer
ah, gotcha
15:40
pdurbin
maybe S3 gets mounted as a POSIX filesystem or something
15:40
pameyer
my (limited) understanding is that POSIX mounts of S3 hurt performance a reasonable amount
15:41
pdurbin
sure
15:41
pdurbin
so probably people look for tools that support S3, spark or whatever
15:41
pameyer
if your pipelines can talk s3 directly, I think it uses bucket addresses/labels/etc
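[Editor's sketch] To illustrate what "bucket addresses" might look like to a pipeline, here is a minimal Python example that splits an `s3://` URI into a bucket name and an object key. The URI, the bucket name, and the `split_s3_uri` helper are hypothetical and are not taken from the conversation. No S3 client library is used, and nothing is fetched.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an s3://bucket/key URI into (bucket, key).

    Hypothetical helper for illustration only; it parses the string
    and does not contact S3.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The netloc part is the bucket; the remaining path is the object key.
    return parsed.netloc, parsed.path.lstrip("/")


# Made-up example address.
bucket, key = split_s3_uri("s3://example-bucket/datasets/run42/reads.fastq")
print(bucket, key)
```

An S3-aware tool would typically take the bucket and key as inputs to its own client, instead of opening a path on a mounted filesystem.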
15:42
pdurbin
probably
19:14
pdurbin
I'm scared. On my new laptop, echo $SHELL shows /bin/zsh
19:15
donsizemore
trade you
19:17
pdurbin
heh, suddenly I feel less scared
19:20
pameyer
pdurbin: csh is probably still there...
19:21
pdurbin
probably, I was using bash before, on my old mac laptop. this new one is also a mac
19:22
pameyer
yeah - they switched the default to zsh with 10.15
19:22
pameyer
either because they don't like gpu, or they like annoying me
19:22
pameyer
gpl, I mean
19:23
pdurbin
gotcha, I thought I heard this is coming. I'm running 10.14 on the old one
19:25
donsizemore
https://www.howtogeek.com/wp-content/uploads/2019/10/ximg_5da79219c40ee.jpg.pagespeed.gp+jp+jw+pj+ws+js+rj+rp+rw+ri+cp+md.ic.4V1XjxAqkP.jpg
19:25
donsizemore
may I suggest /bin/false
19:28
pdurbin
:)
19:30
pdurbin
donsizemore: ah, thanks for showing me that advanced options exist
19:31
pameyer
they dropped 32-bit executables too
19:32
mateolan joined #dataverse
19:33
mateolan
Hi dataverse folks--we're looking for a SPARQL wrapper for dataverses--or is there some way we can leverage domain specific ontologies for datasets within Dataverse?
19:33
mateolan
does such a thing exist--or if not, is it on a product roadmap?
19:35
pameyer
I'm not authoritative, but as far as I know there isn't a SPARQL interface (or rdf exporter, for that matter)
19:38
pameyer
I've had reasonable luck with the dataverse search APIs for going into metadata - but that might be different than what you're thinking for leveraging ontologies
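[Editor's sketch] A minimal Python example of building a Dataverse Search API request, assuming the `/api/search` endpoint with its `q`, `type`, and `per_page` query parameters. The server (`https://demo.dataverse.org`), the query string, and the `build_search_url` helper are assumptions for illustration. The code only constructs the URL; it does not send a request.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, result_type="dataset", per_page=10):
    """Build a Search API URL (hypothetical helper, no network access)."""
    params = {"q": query, "type": result_type, "per_page": per_page}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"


# Assumed installation and made-up query.
url = build_search_url("https://demo.dataverse.org", "maize yield")
print(url)
```

The response is JSON, so any HTTP client could fetch this URL and inspect the returned metadata fields.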
19:45
mateolan
right--ultimately, I'd like to be able to reason over metadata, match column headers to ontology terms, and extract specific rows...without having to return a whole dataset...envisioning writing an interface to be able to do ontology annotation suggestions with some human curation...
19:45
mateolan
seems like that would open up the dataverse in a big way
19:45
pdurbin
mateolan: there is some subsetting functionality via API if that helps.
19:46
mateolan
ooh lala @pdurbin, that sounds like a good place to start
19:46
pdurbin
You can look for "subset" at http://guides.dataverse.org/en/4.20/api/dataaccess.html . Some external tools make use of it, I believe.
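[Editor's sketch] A minimal Python example of a subsetting request, assuming the form the linked Data Access guide describes as far as I recall it: `format=subset` plus a comma-separated `variables` list of variable database ids. The host, file id, variable ids, and the `build_subset_url` helper are placeholders; check the guide before relying on the parameter names. The code builds the URL and does not call a server.

```python
def build_subset_url(base_url, file_id, variable_ids):
    """Build a subsetting URL for a tabular data file.

    Hypothetical helper; the parameter names are assumed from the
    Data Access API guide and should be verified there.
    """
    ids = ",".join(str(v) for v in variable_ids)
    return (
        f"{base_url.rstrip('/')}/api/access/datafile/{file_id}"
        f"?format=subset&variables={ids}"
    )


# Placeholder host and ids.
subset_url = build_subset_url("http://localhost:8080", 6, [123, 127])
print(subset_url)
```

If the guide is read correctly, this subsetting selects columns (variables), so extracting specific rows would probably still mean filtering the returned data on the client side.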
19:48
pdurbin
I'd suggest starting a thread on the mailing list. Tell your story, etc. :)
19:48
pdurbin
You're very welcome to chat here too, of course, but there are only a dozen people here. There are many more on the mailing list.
19:50
mateolan joined #dataverse
19:52
mateolan
I'm only finding installation and security email addresses...and a basic twitter handle--guessing there is a specific developer email list?
19:53
pdurbin
Well, there is a dev list but your story, your use cases, would be better told on the main list: https://groups.google.com/g/dataverse-community
19:54
mateolan
ah, got it, thanks
19:55
pdurbin
Sure. Here's the most recent thread on SPARQL, by the way: https://groups.google.com/g/dataverse-community/c/X004wd9ZBKM/m/PYdzYcGJAAAJ
19:56
pdurbin
And it looks like SPARQL came up in a community call in May: https://docs.google.com/document/d/1Ve064T5ZnNMxOKLlP9Vu0I1o_pFtcN0vXXgbLXWekv8/edit
19:57
pdurbin
But when I hear "domain specific ontologies for datasets within Dataverse" I think of custom metadata blocks: http://guides.dataverse.org/en/4.20/admin/metadatacustomization.html
19:58
dataverse-user joined #dataverse
20:01
pameyer
me too - but I tend to forget most things csv or ddi related
20:04
pdurbin
pameyer: is your domain specific ontology structural biology? Is that how you think of it?
20:17
pdurbin
mateolan: very nice post! https://groups.google.com/g/dataverse-community/c/9T-2YO3czBI/m/rkLo0R21AwAJ
20:18
mateolan
thanks pdurbin
20:19
pdurbin
Often I try to reply right away but since we talked here, I'll let it sit and hope someone takes the bait. :)
20:20
pdurbin
When I think about Dataverse installations that deal a lot with food, this one comes to mind: https://data.cimmyt.org
20:21
mateolan
thanks for the pointer
20:22
mateolan
looks cool, is that a white-labeled dataverse?
20:34
pameyer
pdurbin: when I think domain specific ontologies I tend to immediately veer off into blue sky about reproducibility and probabilistic logic
20:35
pdurbin
good
21:00
pdurbin
Ok, folks. I'm out. Going on vacation for two weeks. See you on the other side. Have a great weekend!
21:00
pdurbin left #dataverse
21:04
pameyer
have a good vacation!