
IRC log for #dataverse, 2020-11-19

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times are shown in UTC.

Time Nick Message
06:40 Virgile joined #dataverse
07:04 Virgile joined #dataverse
14:22 pdurbin joined #dataverse
15:02 pdurbin poikilotherm: thoughts on the metadata meeting?
15:37 juancorr joined #dataverse
15:38 donsizemore joined #dataverse
16:23 pameyer joined #dataverse
16:40 poikilotherm Hi pdurbin.
16:40 poikilotherm I'll try to put it in a nutshell
16:44 poikilotherm No one actually looked at the elephant in the room: how does Dataverse need to change to combine all of these ideas and efforts, moving from a past where it was centered on and brilliant at data publication towards what we envision for the future in terms of data, software, and environment publications, combined with more sophisticated provenance tracking? If we extend the scope of Dataverse beyond data publication, we need to talk about how we provide more benefits for researchers and help them get the job done while still following best practices and requirements like IP laws etc.
16:46 poikilotherm People are talking a lot about this or that metadata block, but I think we need to step back and talk about the scope of Dataverse first. So setting up use cases might be a good idea.
16:48 poikilotherm Had a very interesting discussion with Ana after the metadata WG call. Learned a lot, and I think I can say that we aren't as far apart as it might seem
16:58 pameyer poikilotherm - do you know about pdb_redo?
17:02 poikilotherm Nope, I didn't. Took a quick glimpse, but not sure what it is yet. https://pdb-redo.eu/
17:03 poikilotherm ok I understand it is a database, but how is this connected to dataverse?
17:04 pameyer In the context of stepping back and taking a look at what a data repository could do to provide more benefits to researchers
17:05 pameyer thinking of pdb_redo as a database is likely not the most useful in this context
17:06 pameyer data repository (PDB / protein data bank) + periodic re-analysis of all published datasets (including improved software and algorithms)
17:06 pameyer + HPC to run it, but that's an implementation detail
17:07 poikilotherm Folks, off to the construction site now. Getting late here. Will read you on my mobile.
17:07 pameyer np - it'll keep :)
17:07 pameyer I'll dump a bit more though
17:08 pdurbin poikilotherm: yeah, I agree that talking about scope might help. I'm glad you had a follow up discussion with Ana.
17:08 pameyer anyhow, if you're thinking about the future of data, software, execution environments, and potentially provenance-type things, then not considering that type of use case misses a lot (at least in my opinion)
17:09 pameyer there's another classic crystallography paper talking about similar ideas; don't have the cite handy though - I can track it down if there's interest
17:34 poikilotherm pdurbin: we both had lots of input. And she was amazed to have someone with strong opinions. It seems I tend to be the opinionated German guy... :-D
17:34 poikilotherm (Which doesn't mean I'm not open to discussion)
17:48 poikilotherm BTW pdurbin what is holding us back from reviewing https://github.com/IQSS/dataverse/pull/7416 ? Do you feel like there is more discussion needed on this, or is it just one of those pesky availability problems? I don't want to be pushy, just to stay informed about what's going on :-D
18:13 dataverse-user joined #dataverse
18:39 donsizemore joined #dataverse
18:39 pdurbin poikilotherm: I took a quick glance at it, but I don't know much about timers. Is that one the next logical one to move now that the Payara upgrade has been merged (thanks)?
18:43 poikilotherm Yes it is. :-) thx for taking a look!
18:50 pdurbin poikilotherm: oh, on a related note, we're starting to see some timer trouble and I don't know why. Recent changes? Let me find the issues. One sec.
18:51 pdurbin https://github.com/IQSS/dataverse/issues/7398 and https://github.com/IQSS/dataverse.harvard.edu/issues/92
18:52 pdurbin So I'm a little nervous if timer-related stuff is already breaking (due to recent changes?) and now another change is being proposed.
18:52 pdurbin Perhaps your pull request will fix all timer problems. That would be nice. :)
18:53 pdurbin Or maybe we should switch to cron.
18:53 * pdurbin ducks
19:04 poikilotherm pdurbin I see the timestamps on the log messages in 7398. When was Harvard Dataverse updated to the latest release? Did that happen around those timestamps?
19:05 poikilotherm It says it couldn't execute because the container was stopped, which sounds like an undeployed app
19:10 pdurbin looks like it was upgraded to 5.2 on Nov 12
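
(On the cron quip: the nightly metadata export that the in-app timer drives can also be kicked off by an external scheduler through the admin API. Below is a minimal Python sketch, assuming a standard installation where the admin API is only reachable on localhost and /api/admin/metadata/exportAll is available and not blocked on your version; it is an illustration, not the project's recommended setup.)

    #!/usr/bin/env python3
    # Sketch: trigger Dataverse's metadata "export all" from cron instead of the
    # built-in timer. Assumes the admin API is reachable on localhost only (the
    # usual lockdown) and that the endpoint is not in :BlockedApiEndpoints.
    import sys

    import requests

    BASE_URL = "http://localhost:8080"  # admin API, typically localhost-only


    def export_all():
        # Ask Dataverse to (re)export metadata for published datasets whose
        # cached exports are out of date.
        resp = requests.get(f"{BASE_URL}/api/admin/metadata/exportAll", timeout=600)
        resp.raise_for_status()
        print(resp.json())


    if __name__ == "__main__":
        try:
            export_all()
        except requests.RequestException as err:
            print(f"export failed: {err}", file=sys.stderr)
            sys.exit(1)

(A crontab entry such as "0 2 * * * /usr/local/bin/dataverse_export_all.py" would then stand in for the nightly export timer; harvesting schedules would need similar treatment.)
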
19:32 nightowl313 joined #dataverse
19:37 nightowl313 hey folks, I haven't bothered you in a few days ... we are hoping to move some datasets from Harvard to our Dataverse (working with tech support on this) ... but our researchers want to make sure the DOI/URL from Harvard will still point to the dataset in the new location ... is this possible? I know there is an alternate URL field, and/or can we modify the Harvard DataCite record to point to the new location?
19:38 pdurbin When you say working with tech support on this do you mean you emailed support@dataverse.org?
19:39 pameyer pdurbin: cron usually works
19:39 pameyer nightowl313: I don't think it's directly supported in Dataverse, but you can definitely retarget DOIs
19:40 pameyer the DOI on the new landing page won't match though
19:41 nightowl313 was reading a recent community message about this ... and it seems that we would want to change the pointer on the DOI in DataCite, and then add the old DOI in the "Alternate URL" field
19:41 nightowl313 but just wondering if this works well ... or is it better to just copy the dataset and have it in both places?
19:42 pdurbin You could always deaccession the dataset in the old location and leave a note pointing to the new location.
19:44 pameyer retargeting and putting the old DOI in "Alternate URL" is what would make sense to me, but I'd go with what folks with more expertise recommend
19:45 pameyer putting on my researcher hat, having a dataset with both old and new locations as "source of truth" feels a little odd
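
("Retargeting" here means updating the URL the DOI resolves to at DataCite. A minimal Python sketch, assuming you hold the credentials of the DataCite repository account that owns the DOI; for a Harvard-minted DOI that account is Harvard's, so in practice this step goes through their support. The account, password, DOI, and URL below are placeholders.)

    # Sketch: repoint an existing DOI to a new landing page via the DataCite REST API.
    # Requires the credentials of the DataCite repository account that owns the DOI.
    import requests

    DATACITE_API = "https://api.datacite.org"  # use https://api.test.datacite.org to rehearse
    REPO_ID = "XYZ.EXAMPLE"                    # placeholder repository account
    REPO_PASSWORD = "changeme"                 # placeholder
    DOI = "10.7910/DVN/XXXXXX"                 # placeholder DOI to retarget
    NEW_URL = f"https://dataverse.example.edu/dataset.xhtml?persistentId=doi:{DOI}"

    payload = {"data": {"type": "dois", "attributes": {"url": NEW_URL}}}
    resp = requests.put(
        f"{DATACITE_API}/dois/{DOI}",
        json=payload,
        auth=(REPO_ID, REPO_PASSWORD),
        headers={"Content-Type": "application/vnd.api+json"},
    )
    resp.raise_for_status()
    print(resp.json()["data"]["attributes"]["url"])
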
19:46 nightowl313 reading through this: https://groups.google.com/g/dataverse-community/c/PfKIZFxFZhE/m/YmBOb36fAQAJ   this probably won't be as easy as we think, huh?
19:47 pdurbin It's not exactly drag and drop.
19:48 pdurbin But don't be discouraged. :)
19:50 nightowl313 haha no .. just trying to determine where to start ... we have a few folks already who want their datasets in ASU's dataverse, but not necessarily out of Harvard's. In that case, we would probably just want to harvest them, I guess.
19:51 nightowl313 but, if we want to move them it sounds like it will be some work, and then we need to deaccession the dataset in Harvard's DV and point the DOI to the new DOI (since if we move it out of Harvard we would have to deaccession it anyway, right)?
19:55 nightowl313 no matter how we move the dataset, are versions not retained at all?
19:59 nightowl313 sorry, I AM reading through all the things in that thread =)
20:05 pdurbin I guess you don't *have* to deaccession the old one, but to pameyer's point you don't want two citable things out there.
20:07 pdurbin Moving versions is probably tricky. Maybe you could import the first version (making it a 1.0) and add additional versions? Not sure.
20:13 pameyer versions go out in the native API, right?
20:13 pameyer maybe native GET -> put the new DV temporarily into FAKE -> native API create -> manually set the new DOI and reconfigure back to DataCite?
20:14 pameyer would not try that without testing it, and maybe more effort than it's worth
20:14 pdurbin yeah, without testing it, it's hard to know
20:15 pdurbin but yeah, you can pull the 1.0 version of the old dataset using the native API, a GET, like you said
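
(A rough sketch of that route, using the native :import endpoint, which accepts an existing PID, rather than the create-plus-FAKE-provider path pameyer mentioned. Hosts, collection alias, token, and DOI are placeholders, file transfer is not handled, and the JSON reshaping should be tested against a sandbox collection before trusting any of it.)

    # Sketch: copy one published dataset version from a source Dataverse installation
    # to a destination installation, keeping the original DOI. Placeholders throughout;
    # the destination token likely needs superuser privileges for :import.
    import requests

    SRC = "https://dataverse.harvard.edu"
    DST = "https://dataverse.example.edu"                # placeholder destination
    DST_TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"   # placeholder API token
    COLLECTION = "asu-pilot"                             # placeholder collection alias
    PID = "doi:10.7910/DVN/XXXXXX"                       # placeholder persistent ID

    # 1. Pull the published 1.0 version from the source via the native API.
    r = requests.get(f"{SRC}/api/datasets/:persistentId/versions/1.0",
                     params={"persistentId": PID})
    r.raise_for_status()
    version = r.json()["data"]

    # 2. Reshape into the dataset-create style JSON the import endpoint expects.
    dataset_json = {"datasetVersion": {"metadataBlocks": version["metadataBlocks"],
                                       "termsOfUse": version.get("termsOfUse"),
                                       "license": version.get("license")}}

    # 3. Import into the destination collection, reusing the existing PID.
    #    (Files are not carried over by this call; they need their own upload step.)
    r = requests.post(f"{DST}/api/dataverses/{COLLECTION}/datasets/:import",
                      params={"pid": PID, "release": "yes"},
                      headers={"X-Dataverse-key": DST_TOKEN},
                      json=dataset_json)
    r.raise_for_status()
    print(r.json())

(Later versions would presumably become ordinary edit-and-publish steps on the destination copy; that is the part that most needs testing.)
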
20:17 nightowl313 oh right, each version has its own identifier ... I guess I have a bunch of testing to do! thanks!
20:18 pdurbin nightowl313: after you figure it out, please write up a blog post or something! Or at least a post on the mailing list. :)
20:19 nightowl313 yes, I will do that .. if I can get it to work! =) would love to help this community in some way!
20:22 pdurbin Can you link us to the old dataset?
20:22 pdurbin er, the dataset in the old/current location?
20:27 nightowl313 we have about 10 that we have identified that we would like to eventually move ... we are asking the owners for permission and then will work on moving or copying them one-by-one ... here is a link to the first taker: https://dataverse.harvard.edu/dataverse/jamesstrickland
20:27 nightowl313 it is only 2 datasets (we will recreate the dataverse)
20:28 pdurbin one of them has 3 versions
20:28 nightowl313 looks like there are no versions...so yay
20:28 nightowl313 oh it looked like the versions had been deaccessioned
20:29 pdurbin ah, you're right
20:29 nightowl313 oh but maybe that was just user notes
20:29 nightowl313 either way i think it is a good one to start with .. not too complicated! lol
20:30 pdurbin Not a lot of metadata. You could almost do it manually.
20:32 nightowl313 I actually may do that ... it seems pretty straightforward ... not too many files ... but I also want to learn how to migrate using one of the tools, so it might be a good, not-too-complicated test of that
20:33 pameyer even with things that could be done manually, automating them usually helps me keep the number of typos I make at a more manageable level
20:34 nightowl313 haha yes! I think I will definitely use this to learn the process ... I'm actually relieved that it is simple
20:35 pameyer always good to start with simple things :)
20:37 nightowl313 we'll see how it goes! thank you all so much for your help!
20:40 pdurbin good luck
20:55 nightowl313 can we keep the same DOI on the dataset (i.e. the Harvard DOI) and just point that DOI to the new location in DataCite?
20:57 pdurbin The problem with that is if the author wants to make a change to the dataset once it's hosted at ASU. New metadata should be pushed to DataCite, and you won't be able to update the Harvard-owned DOI metadata.
20:58 nightowl313 ahhh ... okay, so it would need a new DOI and the "Other ID" would be the old one ... but the old DOI would still need to be pointed to the new one, right?
20:59 pdurbin Either that (repoint it) or deaccession the dataset, I would think.
21:00 pameyer pdurbin: does deaccession change the DOI target?
21:00 pdurbin Well, I think some metadata is sent to DataCite at least.
21:01 pdurbin It might make the record not searchable or something. I really don't know.
21:01 pameyer me either
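
(If the "Other ID" route is taken, the old Harvard DOI can be recorded on the new record through the native editMetadata call. A minimal sketch; the compound-field JSON is the fragile part and should be checked against the citation block of the target installation, and every identifier below is a placeholder.)

    # Sketch: add the old Harvard DOI to the new dataset as an "Other ID"
    # (citation block compound field) via the native API.
    import requests

    DST = "https://dataverse.example.edu"               # placeholder destination
    TOKEN = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"      # placeholder API token
    NEW_PID = "doi:10.XXXXX/EXAMPLE"                    # placeholder new DOI
    OLD_DOI = "doi:10.7910/DVN/XXXXXX"                  # placeholder old Harvard DOI

    fields = {"fields": [{
        "typeName": "otherId",
        "typeClass": "compound",
        "multiple": True,
        "value": [{
            "otherIdAgency": {"typeName": "otherIdAgency", "typeClass": "primitive",
                              "multiple": False, "value": "Harvard Dataverse"},
            "otherIdValue": {"typeName": "otherIdValue", "typeClass": "primitive",
                             "multiple": False, "value": OLD_DOI},
        }],
    }]}

    # Without replace=true this appends to multi-valued fields such as otherId.
    r = requests.put(f"{DST}/api/datasets/:persistentId/editMetadata/",
                     params={"persistentId": NEW_PID},
                     headers={"X-Dataverse-key": TOKEN},
                     json=fields)
    r.raise_for_status()
    print(r.json()["status"])
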
21:02 pdurbin nightowl313: another option, since we're spitballin', is to leave the dataset at Harvard and harvest it into your installation so it's searchable.
21:03 pdurbin But I can understand if you'd rather move it.
21:16 nightowl313 yea, we are leaning that way for most of them ... we thought it might be nice to have ASU-authored datasets in our DV, mostly to get something in there! But, I guess if we harvest them they still are identified with ASU
21:18 pdurbin That's what UVa does for their older datasets.
21:24 nightowl313 oh good to know ... they have some on Harvard DV too ... Harvard has everything! We were noticing all of the journal and publication dataverses!
21:28 pdurbin :)
21:54 pdurbin left #dataverse
