Time
S
Nick
Message
06:40
Virgile joined #dataverse
07:04
Virgile joined #dataverse
14:22
pdurbin joined #dataverse
15:02
pdurbin
poikilotherm: thoughts on the metadata meeting?
15:37
juancorr joined #dataverse
15:38
donsizemore joined #dataverse
16:23
pameyer joined #dataverse
16:40
poikilotherm
Hi pdurbin.
16:40
poikilotherm
I'll try to do it in a nutshell
16:44
poikilotherm
No one actually looked at the elephant in the room, asking the question of how Dataverse needs to change to combine all of these ideas and efforts, moving away from a past centered on (and brilliant at) data publications towards what we envision for the future in terms of data, software and environment publications, combined with more sophisticated provenance tracking. If we go for extending the scope of Dataverse from
16:44
poikilotherm
data pubs to more, we need to talk about how we provide more benefits for researchers and help them get the job done while still following best practices and requirements like IP laws etc.
16:46
poikilotherm
People are talking a lot about this and that metadata block, but I think we need to step back and talk about the scope of Dataverse here first. So setting up use cases might be a good idea.
16:48
poikilotherm
Had a very interesting discussion with Ana after the metadata wg call. Learned a lot and I think I can say that we aren't as far apart as it might seem
16:58
pameyer
poikilotherm - do you know about pdb_redo?
17:02
poikilotherm
Nope, I didn't. Took a quick glimpse, but not sure what it is yet. https://pdb-redo.eu/
17:03
poikilotherm
ok I understand it is a database, but how is this connected to dataverse?
17:04
pameyer
In the context of stepping back, and taking a look at what a data repository could do to provide more benefits to researchers
17:05
pameyer
thinking of pdb_redo as a database is likely not the most useful in this context
17:06
pameyer
data repository (PDB / protein data bank) + periodic re-analysis of all published datasets (including improved software and algorithms)
17:06
pameyer
+ HPC to run it, but that's an implementation detail
17:07
poikilotherm
Folks, off to the construction site now. Getting late here. Read you on my mobile.
17:07
pameyer
np - it'll keep :)
17:07
pameyer
I'll dump a bit more though
17:08
pdurbin
poikilotherm: yeah, I agree that talking about scope might help. I'm glad you had a follow up discussion with Ana.
17:08
pameyer
anyhow, if you're thinking about the future of data, software, execution environment, potentially provenance type things - not thinking about how to consider that type of use case is missing a lot (at least in my opinion)
17:09
pameyer
there's another classic crystallography paper talking about similar ideas; don't have the cite handy though - I can track it down if there's interest
17:34
poikilotherm
Pdurbin: we both had lots of input. And she was amazed by having someone with strong opinions. It seems I tend to be the opinionated German guy... :-D
17:34
poikilotherm
(Which doesn't mean I'm not open for discussion)
17:48
poikilotherm
BTW pdurbin what is holding us back from reviewing https://github.com/IQSS/dataverse/pull/7416 ? Do you feel like there is more discussion needed on this or is it just one of these pesky availability problems? Don't want to be pushy, just informed on what's going on :-D
18:13
dataverse-user joined #dataverse
18:39
donsizemore joined #dataverse
18:39
pdurbin
poikilotherm: I took a quick glance at it but I don't know much about timers. Is that one the next logical one to move now that the Payara upgrade has been merged (thanks)?
18:43
poikilotherm
Yes it is. :-) thx for taking a look!
18:50
pdurbin
poikilotherm: oh, on a related note, we're starting to see some timer trouble and I don't know why. Recent changes? Let me find the issues. One sec.
18:51
pdurbin
https://github.com/IQSS/dataverse/issues/7398 and https://github.com/IQSS/dataverse.harvard.edu/issues/92
18:52
pdurbin
So I'm a little nervous if timer-related stuff is already breaking (due to recent changes?) and now another change is being proposed.
18:52
pdurbin
Perhaps your pull request will fix all timer problems. That would be nice. :)
18:53
pdurbin
Or maybe we should switch to cron.
18:53
* pdurbin
ducks
19:04
poikilotherm
Pdurbin I see the timestamps on the log messages in 7398. When was Harvard Dataverse updated to the last releases? Did that happen during those timestamps?
19:05
poikilotherm
It says it couldn't execute because the container was stopped, which sounds like an undeployed app
19:10
pdurbin
looks like nov 12 it was upgraded to 5.2
19:32
nightowl313 joined #dataverse
19:37
nightowl313
hey folks I haven't bothered you in a few days ... we are hoping to move some datasets from harvard to our dataverse (working with tech support on this)... but our researchers want to make sure the DOI/URL from harvard will still point to the dataset in the new location ... is this possible? I know there is an alternate URL field, and/or can we modify the harvard datacite record to point to the new location?
19:38
pdurbin
When you say "working with tech support on this" do you mean you emailed support@dataverse.org?
19:39
pameyer
pdurbin: cron usually works
19:39
pameyer
nightowl313: I don't think it's directly supported in dataverse, but you can definitely retarget DOIs
19:40
pameyer
the DOI on the new landing page won't match though
19:41
nightowl313
was reading a recent community message about this ... and it seems that we would want to change the pointer on the doi in datacite, and then add the old DOI in the "alternate URL" field
19:41
nightowl313
but just wondering if this works well ... or is it better to just copy the dataset and have it in both places?
19:42
pdurbin
You could always deaccession the dataset in the old location and leave a note pointing to the new location.
19:44
pameyer
retargeting and old DOI in "alternate URL" is what would make sense to me, but I'd go with what folks with more expertise recommend
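[Editor's sketch] To make "retargeting" concrete: in the DataCite REST API, a DOI's landing page is the `url` attribute of the DOI record, updated with a PUT to `/dois/{doi}`. The snippet below only builds that request and sends nothing. The DOI and target URL are made-up placeholders on the 10.5072 test prefix, `build_retarget_request` is a hypothetical helper name, and a real update would need the credentials of the repository account that owns the DOI (here, Harvard's), which ties in with the ownership concern raised later in this log.

```python
import json

# Public DataCite REST API base URL.
DATACITE_API = "https://api.datacite.org"


def build_retarget_request(doi: str, new_url: str) -> dict:
    """Describe the PUT request that would repoint `doi` at `new_url`.

    Nothing is sent here; actually sending it requires authenticating as
    the DataCite repository account that owns the DOI.
    """
    # JSON:API document that changes only the landing-page URL.
    payload = {"data": {"type": "dois", "attributes": {"url": new_url}}}
    return {
        "method": "PUT",
        "url": f"{DATACITE_API}/dois/{doi}",
        "headers": {"Content-Type": "application/vnd.api+json"},
        "body": json.dumps(payload),
    }


if __name__ == "__main__":
    # Placeholder DOI and landing page (assumptions, not real records).
    request = build_retarget_request(
        "10.5072/FK2/EXAMPLE",
        "https://dataverse.example.edu/dataset.xhtml?persistentId=doi:10.5072/FK2/NEWONE",
    )
    print(request["method"], request["url"])
    print(request["body"])
```

The old DOI would then be recorded by hand in the new dataset's metadata, as discussed above.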
19:45
pameyer
putting on my researcher hat, having a dataset with both old and new locations as "source of truth" feels a little odd
19:46
nightowl313
reading through this: https://groups.google.com/g/dataverse-community/c/PfKIZFxFZhE/m/YmBOb36fAQAJ this probably won't be as easy as we think, huh?
19:47
pdurbin
It's not exactly drag and drop.
19:48
pdurbin
But don't be discouraged. :)
19:50
nightowl313
haha no .. just trying to determine where to start ... we have a few folks already who want their datasets in ASU's dataverse, but not necessarily out of harvard's. In that case, we would probably just want to harvest them, I guess.
19:51
nightowl313
but, if we want to move them it sounds like it will be some work, and then we need to deaccession the dataset in harvard's DV and point the DOI to the new DOI (since if we move it out of harvard we would have to deaccession it anyway, right)?
19:55
nightowl313
no matter how we move the dataset, are versions not retained at all?
19:59
nightowl313
sorry, I AM reading through all the things in that thread =)
20:05
pdurbin
I guess you don't *have* to deaccession the old one but to pameyer's point you don't want two citable things out there.
20:07
pdurbin
Moving versions is probably tricky. Maybe you could import the first version (making it a 1.0) and add additional versions? Not sure.
20:13
pameyer
versions go out in the native API, right?
20:13
pameyer
maybe native GET -> put new DV temporarily into FAKE -> native API create -> manually set new DOI and reconfigure back to datacite?
20:14
pameyer
would not try that without testing it, and maybe more effort than it's worth
20:14
pdurbin
yeah, without testing it, it's hard to know
20:15
pdurbin
but yeah, you can pull the 1.0 version of the old dataset using the native API, a GET, like you said
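[Editor's sketch] The GET mentioned here can be made concrete: the Dataverse native API exposes a single dataset version at `/api/datasets/:persistentId/versions/{version}` with the DOI passed as the `persistentId` query parameter. The helper below only builds that URL. `version_url` is a hypothetical name and the DOI is a placeholder on the 10.5072 test prefix, not one of the datasets discussed in this log.

```python
from urllib.parse import urlencode


def version_url(base_url: str, persistent_id: str, version: str = "1.0") -> str:
    """Build the native API URL for one version of a dataset."""
    # urlencode percent-encodes the ':' and '/' characters inside the DOI.
    query = urlencode({"persistentId": persistent_id})
    return (
        f"{base_url.rstrip('/')}/api/datasets/:persistentId"
        f"/versions/{version}?{query}"
    )


if __name__ == "__main__":
    # Placeholder DOI (assumption); substitute the real persistent ID.
    print(version_url("https://dataverse.harvard.edu", "doi:10.5072/FK2/EXAMPLE"))
```

Fetching that URL returns JSON for the requested version. Whether that JSON can be fed straight into a dataset-create call on another installation is exactly the part that would need testing, as noted above.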
20:17
nightowl313
oh right each version has its own identifier ... i guess i have a bunch of testing to do! thanks!
20:18
pdurbin
nightowl313: after you figure it out, please write up a blog post or something! Or at least a post on the mailing list. :)
20:19
nightowl313
yes, I will do that .. if I can get it to work! =) would love to help this community in some way!
20:22
pdurbin
Can you link us to the old dataset?
20:22
pdurbin
er, the dataset in the old/current location?
20:27
nightowl313
we have about 10 that we have identified that we would like to eventually move ... we are asking the owners for permission and then will work on moving or copying them one-by-one ... here is a link to the first taker: https://dataverse.harvard.edu/dataverse/jamesstrickland
20:27
nightowl313
it is only 2 datasets (we will recreate the dataverse)
20:28
pdurbin
one of them has 3 versions
20:28
nightowl313
looks like there are no versions...so yay
20:28
nightowl313
oh it looked like the versions had been deaccessioned
20:29
pdurbin
ah, you're right
20:29
nightowl313
oh but maybe that was just user notes
20:29
nightowl313
either way i think it is a good one to start with .. not too complicated! lol
20:30
pdurbin
Not a lot of metadata. You could almost do it manually.
20:32
nightowl313
I actually may do that ... it seems pretty straightforward ... not too many files ... but I also want to learn how to migrate using one of the tools, so it might be a good test of that without being too complicated
20:33
pameyer
even with things that could be done manually, automating them usually helps me keep the number of typos I make to a more manageable level
20:34
nightowl313
haha yes! I think I will definitely use this to learn the process ... I'm actually relieved that it is simple
20:35
pameyer
always good to start with simple things :)
20:37
nightowl313
we'll see how it goes! thank you all so much for your help!
20:40
pdurbin
good luck
20:55
nightowl313
can we keep the same DOI on the dataset (ie: the harvard DOI) and just point that DOI to the new location in datacite?
20:57
pdurbin
The problem with that is if the author wants to make a change to the dataset once it's hosted at ASU. New metadata should be pushed to DataCite and you won't be able to update the Harvard-owned DOI metadata.
20:58
nightowl313
ahhh ... okay, so it would need a new DOI and the "Other ID" would be the old one ... but would still need to be pointed to the new DOI, right?
20:59
pdurbin
Either that (repoint it) or deaccession the dataset, I would think.
21:00
pameyer
pdurbin: does deaccession change the DOI target?
21:00
pdurbin
Well, I think some metadata is sent to DataCite at least.
21:01
pdurbin
It might make the record not searchable or something. I really don't know.
21:01
pameyer
me either
21:02
pdurbin
nightowl313: another option, since we're spitballin', is to leave the dataset and Harvard and harvest it into your installation so it's searchable.
21:02
pdurbin
sorry "leave the dataset at Harvard" I meant
21:03
pdurbin
But I can understand if you'd rather move it.
21:16
nightowl313
yea, we are leaning that way for most of them ... we thought it might be nice to have ASU-authored datasets in our DV, mostly to get something in there! But, I guess if we harvest them they still are identified with ASU
21:18
pdurbin
That's what UVa does for their older datasets.
21:24
nightowl313
oh good to know ... they have some on harvard DV too ... harvard has everything! We were noticing all of the journals and publicatin dataverses!
21:24
nightowl313
publication
21:28
pdurbin
:)
21:54
pdurbin left #dataverse