
IRC log for #dataverse, 2021-02-09

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
07:29 Virgile joined #dataverse
07:34 Virgile joined #dataverse
08:56 juancorr joined #dataverse
09:18 Virgile joined #dataverse
11:16 Virgile joined #dataverse
12:20 Virgile joined #dataverse
14:41 donsizemore joined #dataverse
14:57 pdurbin joined #dataverse
14:58 pdurbin bjonnh bricas donsizemore juancorr poikilotherm2 Virgile: community call starting in a few minutes (including a demo of Data Explorer v2): https://dataverse.org/community-calls
14:59 pdurbin poikilotherm2: I'm afraid to turn on my video.
14:59 poikilotherm2 Oh?
14:59 poikilotherm2 Why?
15:00 pdurbin maybe I'll look tired again :)
15:00 poikilotherm2 I support you :-D
15:00 poikilotherm2 I turned on my cam too :-D
15:02 pdurbin nice
15:20 poikilotherm2 pdurbin: don't worry any longer - you look just fine and perfectly awake!
15:20 donsizemore let me get a haircut and let my gym reopen and i'll think about it
15:22 pdurbin On a Zoom call yesterday I was told my hair is getting long. It's true. I'm pretty desperate for a haircut. :)
15:22 poikilotherm2 Same except for the gym part :-D
16:12 Virgile hey there - sorry I had another meeting... Have a nice day!
20:58 poikilotherm2 Oi pdurbin
20:58 poikilotherm2 U there?
20:58 poikilotherm2 Q?
21:00 pdurbin yep
21:00 poikilotherm2 Ok here we go. During today's meeting, the point of schema validation for JSON dataset ingest etc popped up
21:01 pdurbin yeah
21:01 poikilotherm2 Do you think it's time to create an issue to track this and the different approaches and ideas that exist for this?
21:02 pdurbin Let me check if there's already an issue.
21:02 poikilotherm2 I know that Slava has ideas, Stefan has some and I have some in my head, too. All from different perspectives etc
21:02 poikilotherm2 And I'm sure you folks out there have more of 'em
21:02 pdurbin created by you: https://github.com/IQSS/dataverse/issues/7173
21:02 poikilotherm2 No! Omg I forgot 😂
21:02 poikilotherm2 Too many issues
21:03 pdurbin This must mean you really want it.
21:03 poikilotherm2 Well as you and me know what project is ahead, this might hit us when doing automated depositing
21:04 pdurbin ah, nice
21:05 poikilotherm2 When a machine is reading metadata from a file, we need to make sure that we have everything the Dataverse collection where we want to deposit to obliges us to have in terms of required metadata
21:06 pdurbin Yeah, that's what this is about: https://github.com/IQSS/dataverse/issues/3060
21:06 poikilotherm2 Of course we could just try to deposit, but the machine would even have trouble creating the JSON dataset
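A pre-deposit check along the lines poikilotherm2 describes could be sketched roughly as follows. This is a minimal illustration, not Dataverse code: the `REQUIRED_FIELDS` set and the `missing_required_fields` helper are hypothetical, and the nested `datasetVersion` / `metadataBlocks` / `fields` / `typeName` layout is assumed to match the dataset JSON used for deposit.

```python
# Hypothetical pre-deposit check: find required fields that are missing
# from a dataset JSON document before trying to deposit it.

# Assumption: the target collection requires these citation fields.
# A real collection may require a different set.
REQUIRED_FIELDS = {"title", "author", "datasetContact", "dsDescription", "subject"}


def missing_required_fields(dataset, required=REQUIRED_FIELDS):
    """Return the sorted names of required fields absent from `dataset`."""
    blocks = dataset.get("datasetVersion", {}).get("metadataBlocks", {})
    present = set()
    for block in blocks.values():
        for field in block.get("fields", []):
            # Count a field as present only if it carries a non-empty value.
            if field.get("value"):
                present.add(field.get("typeName"))
    return sorted(set(required) - present)


# Example draft produced by a machine that only found a title and a subject.
draft = {
    "datasetVersion": {
        "metadataBlocks": {
            "citation": {
                "fields": [
                    {"typeName": "title", "value": "Example dataset"},
                    {"typeName": "subject", "value": ["Other"]},
                ]
            }
        }
    }
}

print(missing_required_fields(draft))
```

Running the check before the deposit attempt lets the depositing tool report every gap at once instead of failing on the first rejected request.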
21:07 poikilotherm2 Oh great
21:07 poikilotherm2 Lemme link from my issue to this one
21:07 pdurbin This repo is related too: https://github.com/IQSS/json-schema-test
21:08 poikilotherm2 I wonder if it's time not only to talk about making some schema available to create the datasets
21:08 pdurbin "I was wondering if there is a JSON schema or some doco to assist us in translating the DDI 2.5 to JSON format?" https://groups.google.com/g/dataverse-community/c/qsY8swD9Hh0/m/jOAI9UBLAAAJ
21:08 poikilotherm2 But what if we use Json schema to define metadata schemas?
21:09 poikilotherm2 TSV served us for a long time, but maybe it's time to think about refactoring?
21:10 pdurbin Yeah, to me, this issue is about refactoring away from TSVs: https://github.com/IQSS/dataverse/issues/4451
21:11 poikilotherm2 Should we bring this to the table for a metadata working group meeting?
21:12 pdurbin It certainly seems on topic for metadata meetings.
21:12 poikilotherm2 Maybe do a little presentation to show what ideas people already had and documented in issues
21:12 pdurbin How interested are you in more metadata at the file level?
21:13 poikilotherm2 Not very much... 🤠
21:13 poikilotherm2 We don't do files. Primarily serving as a registry
21:14 pdurbin Heh. Ok. For a while I was thinking that adding more metadata at the file level would be an opportunity to rethink metadata blocks. To try a newer, better system there. And then apply it to the dataset level if it works out.
21:14 poikilotherm2 And most people around here are at the very early steps of rdm
21:14 poikilotherm2 Er
21:15 poikilotherm2 Maybe I got you wrong
21:15 poikilotherm2 Metadata in files or metadata about files?
21:15 pdurbin I mean like EXIF data for photos or whatever.
21:15 pdurbin But it would need to be generic.
21:15 poikilotherm2 So more metadata about the content of files
21:16 pdurbin Yeah. Some file formats are quite rich.
21:16 poikilotherm2 Sure
21:16 pdurbin We do a fair amount with Stata files, for example.
21:17 poikilotherm2 Some of those might benefit when Dataverse Software writes metadata from the dataset into them
21:17 poikilotherm2 Like shipping authors, dois, license etc
21:17 poikilotherm2 When reusing the file it gets easier to track where it came from etc etc
21:18 poikilotherm2 But you wanted to read metadata from the files and present on the dataset level, right?
21:19 pdurbin It probably depends on the file.
21:19 pdurbin For FITS files in astronomy, we take metadata from the file and put it in the dataset.
21:20 poikilotherm2 Nice. Sounds a bit like a trade-off between full text index and dataset metadata only
21:20 poikilotherm2 Might hit the sweet spot
21:21 pdurbin Anyway, every once in a while someone says, "We should have more metadata at the file level! With custom metadata blocks for each file type!"
21:23 poikilotherm2 Uh fancy
21:23 poikilotherm2 Ok I see that would require rethinking how metadata blocks work
21:24 pdurbin It's been a while since anyone has gotten excited about this.
21:24 pdurbin Interest for this or that feature comes in waves. :)
21:24 pdurbin like JSON Schema :)
21:27 poikilotherm2 We should take care that we don't miss adding more next generation repository technologies 😎
21:29 pdurbin During the call, when SAS support came up I was thinking that there's an endless list of ideas from our users. It's a good thing. They want the platform to get better. They're sticking with us. :)
21:30 poikilotherm2 Hell yeah 💪
21:31 pdurbin At least SAS support is something that could be worked on by contributors. That code is fairly modularized.
21:32 poikilotherm2 To be honest I have no idea what that SAS is...
21:32 poikilotherm2 Got a link for me?
21:34 pdurbin It's a stats package (like Stata): https://en.wikipedia.org/wiki/SAS_(software)
21:34 poikilotherm2 Ok. Highly domain specific, but I see why this matters in terms of adding the metadata
21:35 poikilotherm2 We might have to think about the metadata storage though
21:36 pdurbin You're tempting me to go look at your board to see what's in focus. :)
21:36 poikilotherm2 Oh don't rely on it
21:36 poikilotherm2 I barely update it
21:36 poikilotherm2 Just stupid ideas floating in my head while doing construction work
21:36 pdurbin lol
21:37 pdurbin Any thoughts on disabling users? That's what I'm working on.
21:37 poikilotherm2 Did you know you can merge/combine json schemas?
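One standard way to combine JSON Schemas is the `allOf` keyword, which requires an instance to satisfy every listed subschema. The sketch below is a toy illustration only: both schemas are invented for the example, and `missing_keys` is a hypothetical helper that understands nothing beyond `required` and `allOf`, so it is no substitute for a real validator.

```python
# Toy illustration of combining JSON Schemas with "allOf".
# Both schemas and the checker are hypothetical examples.

base_schema = {"type": "object", "required": ["title", "author"]}
collection_schema = {"type": "object", "required": ["license"]}

# "allOf" is a standard JSON Schema keyword: an instance is valid only
# if it is valid against every subschema in the list.
combined_schema = {"allOf": [base_schema, collection_schema]}


def missing_keys(schema, instance):
    """Collect required keys missing from `instance`.

    Handles only "required" and "allOf"; a real validator covers far more.
    """
    missing = [key for key in schema.get("required", []) if key not in instance]
    for subschema in schema.get("allOf", []):
        missing.extend(missing_keys(subschema, instance))
    return missing


print(missing_keys(combined_schema, {"title": "Example", "license": "CC0"}))
```

Combined this way, a base schema and a collection-specific schema can be maintained separately and applied together.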
21:37 poikilotherm2 Disabling users...
21:37 poikilotherm2 Well sounds like adding a flag
21:37 poikilotherm2 Disabled/enabled
21:38 pdurbin Yeah, I added a boolean called "disabled" to the authenticateduser table.
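The flag approach can be sketched with an in-memory SQLite table. Only the table name `authenticateduser` and the column name `disabled` come from the conversation; the other columns, the helper functions, and the use of SQLite and Python are assumptions made purely for illustration and do not reflect the actual Dataverse implementation.

```python
import sqlite3

# In-memory database, used only for this illustration.
conn = sqlite3.connect(":memory:")

# Assumption: a simplified table. SQLite stores booleans as integers,
# so "disabled" is 0 (enabled) or 1 (disabled).
conn.execute(
    "CREATE TABLE authenticateduser ("
    "id INTEGER PRIMARY KEY, "
    "useridentifier TEXT UNIQUE, "
    "disabled INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO authenticateduser (useridentifier) VALUES (?)",
    [("alice",), ("bob",)],
)


def set_disabled(conn, useridentifier, disabled):
    """Hypothetical helper: flip the disabled flag for one user."""
    conn.execute(
        "UPDATE authenticateduser SET disabled = ? WHERE useridentifier = ?",
        (int(disabled), useridentifier),
    )


def can_log_in(conn, useridentifier):
    """Hypothetical helper: unknown or disabled users may not log in."""
    row = conn.execute(
        "SELECT disabled FROM authenticateduser WHERE useridentifier = ?",
        (useridentifier,),
    ).fetchone()
    return row is not None and row[0] == 0


set_disabled(conn, "bob", True)
print(can_log_in(conn, "alice"), can_log_in(conn, "bob"))
```

Keeping the row and flipping a flag, rather than deleting the user, preserves the account's history and makes the change easy to reverse.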
21:40 poikilotherm2 Maybe someday I find the time to extend the OIDC support that might use that flag...
21:41 pdurbin nice
21:56 poikilotherm2 I posted to general on the community slack, let's see what people think
21:57 pdurbin Yeah, I saw. If you're volunteering to lead the discussion, you should probably say so. :)
22:01 pdurbin I'd happily attend, at least.
22:01 pdurbin Anyway, heading out. Have a good night, all.
22:01 pdurbin left #dataverse
22:02 poikilotherm2 Good night pdurbin
