Time
S
Nick
Message
13:56
pdurbin joined #dataverse
15:04
pameyer joined #dataverse
15:08
nightowl313 joined #dataverse
15:13
nightowl313
hi all .. lurking at yesterday's notes ... per the conversation about OS ... i'm very curious what others are switching to ... we are currently on Centos 8 but have tested stream
15:13
nightowl313
i'm thinking we may want to switch back to centos 7 ... things were fine there
15:14
nightowl313
we also are considering ubuntu
15:14
nightowl313
as all of our other things run on ubuntu
15:15
nightowl313
but kind of waiting to see what others do =)
15:16
pameyer
nightowl313: we've had good luck with ubuntu LTS for some of our (non-dataverse) servers, but we're probably not going to move everything to it
15:17
pameyer
we're also waiting to see what others do :) but part of that is for research workstations / HPC stuff. switching back to centos 7 (or "downgrading" things that have been updated to cent8) gives more time for things to shake out
15:18
pameyer
I'm a little concerned about ubuntu's move to snap packages, but so far it hasn't impacted us for server stuff
15:19
nightowl313
yea i think for us it would be better to use what is "recommended" or at least tested and others in the community are using ... i think i will try a downgrade to centos 7 on a test instance
15:20
pameyer
I'm not sure if it matters for your site, but shibboleth is something that might need double-checking for an ubuntu switch. java/apache/postgres I'd expect to be less of an issue
15:21
pameyer
and sticking with the "recommended" platform is usually a good idea. sometimes there are reasons not to, but I usually try to stick with the recommendations
15:23
nightowl313
agreed! i guess this will be a good test of our disaster recovery process! will have to install centos 7 fresh and recreate from db dump and copied files, right? does that preserve the history/versions?
15:29
nightowl313
or no ... i guess i would just have to point to the s3 and rds with new install .. sorry thinking this all through =) have only done new installs up until now, other than testing DR
15:30
nightowl313
on another note ... does anyone know if the unc number in the citation changes when you make metadata changes?
15:30
nightowl313
i'm all over the place! =D
15:30
pdurbin
I assume you mean the UNF and no, it only changes when contents of tabular files change.
15:31
nightowl313
yes unf
15:31
nightowl313
thanks
15:31
pdurbin
UNF is a checksum of data, not metadata. :)
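A minimal sketch of the point made above: a fingerprint computed from the data values alone is unaffected by metadata edits. SHA-256 is used here purely as a stand-in; it is not the UNF algorithm (UNF normalizes and rounds values before hashing), and the `dataset` structure is hypothetical.

```python
import hashlib


def data_fingerprint(rows):
    """Hash only the tabular values. SHA-256 is a stand-in, NOT the UNF algorithm."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update("\t".join(str(value) for value in row).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


# Hypothetical dataset: metadata and tabular data are stored separately.
dataset = {"metadata": {"title": "Survey 2020"}, "rows": [[1, "a"], [2, "b"]]}
before = data_fingerprint(dataset["rows"])

# A metadata edit does not touch the rows, so the fingerprint is unchanged.
dataset["metadata"]["title"] = "Survey 2020 (corrected title)"
after_metadata_edit = data_fingerprint(dataset["rows"])

# A data edit changes the rows, so the fingerprint changes.
dataset["rows"].append([3, "c"])
after_data_edit = data_fingerprint(dataset["rows"])
```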
15:31
pameyer
testing backups is always a good thing - I think most people have had the experience at some point where "we have backups" turns into - nope, we never tested the restore and it didn't work
15:32
pameyer
good to check it before you need it
15:33
nightowl313
i did do some testing of restoring backups for our DR plan (using the magical dataverse-ansible as well!) .. it was a lot of work!
15:36
pdurbin
That's why so many shops don't do it. :)
15:36
nightowl313
but i did actually get it all to work ... brand new instance with db restored from backup and backups of s3 data copied to new s3 buckets .. i hope we never have to do that though
15:37
nightowl313
thanks for the info on the unf number ... i really had no idea what that was =)
15:39
pdurbin
Sure. For what it's worth, some folks in the community aren't big fans of UNF: https://github.com/IQSS/dataverse/issues/7328
15:40
nightowl313
oh that is interesting!
15:41
nightowl313
we've had a few requests from folks to be able to remove the version number from the citation .. or at least have more flexibility over whether a new version gets created
15:41
nightowl313
or shows on the citation .. but I know that kind of defeats the purpose of version history! =D
15:42
nightowl313
they publish something with the citation info and they don't want it to not match what they've published ... we are just trying to work with folks and make sure they don't publish until they are really ready
15:43
pdurbin
Do you know about CuratePublishedDatasetVersionCommand?
15:44
nightowl313
no ... looking it up
15:44
pdurbin
That's what it's called in code. Here are the docs: https://guides.dataverse.org/en/5.3/admin/dataverses-datasets.html#make-metadata-updates-without-changing-dataset-version
15:45
pdurbin
Good for fixing metadata typos, etc.
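For readers who prefer the API to the UI, here is a hedged sketch of the request the linked guide section describes: a POST to the publish endpoint with `type=updatecurrent`, sent with a superuser API token. The server URL, DOI, and token below are placeholders, and the request is only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_current_request(server_url, persistent_id, api_token):
    """Build (but do not send) a POST that republishes the current version in place."""
    query = urlencode({"persistentId": persistent_id, "type": "updatecurrent"})
    url = f"{server_url.rstrip('/')}/api/datasets/:persistentId/actions/:publish?{query}"
    # The token must belong to a superuser, per the discussion in this log.
    return Request(url, method="POST", headers={"X-Dataverse-key": api_token})


# Placeholder values for illustration only.
request = build_update_current_request(
    "https://dataverse.example.edu/",
    "doi:10.5072/FK2/EXAMPLE",
    "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
)
# To actually send it: urllib.request.urlopen(request)
```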
15:46
nightowl313
omg how did i miss that? i think i need to just read straight through the docs again from start to finish ... i do search for these things!
15:46
pdurbin
Nah, just ask. It's easier. :)
15:48
nightowl313
well, i did know that the option to republish current version would appear if it is just a minor version change ... is this a different option
15:49
nightowl313
i didn't realize that was only for superuser, though
15:49
pameyer
there could be a bit of a circular situation - dataset citation goes into a manuscript, but can't add the manuscript citation to the dataset until it gets published
15:57
pameyer
... which appears to be completely avoided (at least for superusers) by the documentation pdurbin linked
15:58
pdurbin
hopefully
16:02
nightowl313
okay, we have used this before for just that reason (although didn't realize it was just superuser function) ... but will probably not advertise it widely =)
16:05
pdurbin
good to have in your back pocket
16:09
nightowl313
for sure! ;D
16:33
nightowl313
argh .. i tried to publish the draft and keep the current version, and i'm getting an error " Command edu.harvard.iq.dataverse.engine.command.impl.CuratePublishedDatasetVersionCommand 53aa3b failed: Cannot merge an Entity that has been removed: edu.harvard.iq.dvn.core.study."
16:38
nightowl313
and ... "Response has already been committed, and further write operations are not permitted. This may result in an IllegalStateException being triggered by the underlying application. To avoid this situation, consider adding a Rule `.when(Direction.isInbound().and(Response.isCommitted())).perform(Lifecycle.abort())`, or figure out where the response is being incorrectly committed and correct the bug in the offending code.|#]"
16:40
nightowl313
yikes .. have not made changes since update to 5.3
16:59
nightowl313
tried deleting the draft and recreating it .. same thing .. anyone know what that error might indicate? have not had any issues with publishing until now ... should i put in a support ticket?
17:17
pdurbin
nightowl313: yes, a support ticket please: support@dataverse.org
17:19
nightowl313
okay i submitted one (although i used dataverse_support@help.hmdc.harvard.edu, is that an old email?) ... this does not happen on our test dataverse instance, so may be related to the dataset itself
17:20
nightowl313
i also sent a community email ... lol ... covered all bases! just worried it is system-wide ... since it is a prod dv instance I can't do a lot of testing adding new datasets
17:31
pdurbin
it looks like it came through ok: https://help.hmdc.harvard.edu/Ticket/Display.html?id=301437
17:35
nightowl313
+1 ... thanks for the link!
20:09
nightowl313 joined #dataverse
20:11
nightowl313
well, another friday question, which may or may not be related to my last question ... any idea why the metrics that appear on the dataset pages would not be updating? (ie: views, downloads, citations)?
20:11
nightowl313
If I download a file from our test instance, and refresh the page, the “# Downloads” metric increases by 1. On our prod instance, they all show 0 even after downloading a file.
20:12
nightowl313
i'm having problems today
20:13
pameyer
do both have the same MDC config?
20:13
pdurbin
For citations you have to have all the Make Data Count stuff installed.
20:15
nightowl313
they should ... i usually test everything on test and then do the same on prod, but it is possible ... i will check that .. i think i thought make data count was separate from the site stats
20:17
pameyer
I'm not sure - it seems at least remotely plausible that if one had MDC on, the other didn't, and MDC configuration made some changes to the metrics display
20:17
pameyer
but this is low confidence speculation on my part, so if there are better ideas probably worth investigating those first
20:20
nightowl313
i had make data count installed and was getting metrics, but something stopped along the way ... the main site "Downloads" is reporting okay, just not the datasets .. re-checking everything now
20:23
pdurbin
If you're using MDC, the views and downloads don't show up until you run Counter Processor on the logs. I think in the guides we suggest doing this nightly.
20:23
pdurbin
Here's the crazy diagram I made: https://guides.dataverse.org/en/5.3/_images/make-data-count.png
20:32
nightowl313
i was just looking at that .. i admit i need to spend some time understanding it better ... i had set up a cron job, but looks like it isn't running
20:35
pdurbin
Ah, that would do it.
20:35
pdurbin
Lots more moving parts with MDC than the out of the box metrics that Dataverse does.
20:36
pameyer
cron jobs not running when they should trip things up
20:37
nightowl313
it is writing daily logs
20:37
nightowl313
so that one is running!
20:38
pdurbin
Well, the logs should be written by Dataverse.
20:38
pameyer
I've been bitten a few times by cron jobs that run, have an error, and try to email it somewhere. if the system cron email isn't set right, or it gets lost in a filter, it can complicate troubleshooting
20:38
pdurbin
Then CP crunches the logs.
20:38
pdurbin
The Dataverse slurps up the result of that crunching (a SUSHI file in JSON format).
20:39
pdurbin
Then*
20:39
pameyer
I've also tripped over shell/path differences in cron jobs a time or two
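One way to sidestep both pitfalls mentioned here (lost cron email and shell/path differences) is to have cron call a small wrapper that sets an explicit PATH and appends the job's output to a log file. This is a rough sketch under assumptions: the function name, the default PATH, and the example paths in the comment are all hypothetical, not part of Counter Processor or Dataverse.

```python
import os
import subprocess
from datetime import datetime


def run_logged(command, log_path, env_path="/usr/local/bin:/usr/bin:/bin"):
    """Run `command` (a list of arguments) with an explicit PATH and append
    its combined stdout/stderr and exit code to `log_path`."""
    env = dict(os.environ, PATH=env_path)
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # keep errors in the same log as normal output
        text=True,
    )
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(f"{datetime.now().isoformat()} exit={result.returncode} cmd={command}\n")
        log.write(result.stdout)
    return result.returncode


# Hypothetical use from a script that cron invokes nightly:
# run_logged(["/usr/bin/python3", "/path/to/main.py"], "/var/log/nightly-job.log")
```

Running the same wrapper by hand, outside cron, then gives directly comparable output when troubleshooting.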
20:40
nightowl313
it is the counter logs ... i can see those for every day since i set it up in october ... but i understand about the various places for adding cron jobs! =)
20:41
pdurbin
If you have the logs, you have the data. :)
20:43
nightowl313
i stopped there though and didn't go through the rest of the steps to send to datacite .. but the site should be updating shouldn't it? we have metrics on the main page, but none of the datasets show downloads or views
20:43
nightowl313
granted it doesn't get very much action yet! ;-D
20:46
nightowl313
it is the main.py that is not running nightly
20:51
pameyer
can you tell where it's failing?
20:55
pdurbin
Yes, it should work.
20:55
pdurbin
But I'm heading out. Happy to chat more next week.
20:55
nightowl313
sorry i'm not making any sense ... it looks like it should work, but if i upload a file to a published dataset, the "Downloads" value on the page doesn't increment ... they are all 0
20:56
pameyer
if you manually run the cron job (outside cron), does that update the counts?
20:57
nightowl313
will try that now ... was just going through everything to make sure it was still set up .. i do get counter logs everyday
20:58
pdurbin left #dataverse
21:33
nightowl313
i think it is a permission error ... i apparently originally installed it as root
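A quick way to confirm a suspicion like this is to list everything under the install directory that is not owned by the account the job runs as. The sketch below assumes a POSIX system; the function name and the example directory and account in the comment are hypothetical.

```python
import os


def paths_not_owned_by(root, expected_uid):
    """Return a sorted list of files and directories under `root` whose owner
    is not `expected_uid` (for example, files left behind by a root install)."""
    mismatches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are checked themselves, not their targets.
            if os.lstat(path).st_uid != expected_uid:
                mismatches.append(path)
    return sorted(mismatches)


# Hypothetical use, looking up the uid of the account the cron job runs as:
# import pwd
# uid = pwd.getpwnam("counter").pw_uid
# print(paths_not_owned_by("/path/to/counter-processor", uid))
```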
21:35
nightowl313
going to start over :)
22:32
pameyer
good luck - I think I'm going to disconnect for the weekend too
22:35
nightowl313
happy weekend to all! thanks for the help!