IRC log for #dataverse, 2019-01-03

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
05:32 andrewSC joined #dataverse
07:51 poikilotherm joined #dataverse
08:29 pmauduit pdurbin: ok thanks for the info ! I'll dig in both directories pointed by pameyer yesterday (conf/docker and conf/docker-aio), as my current plan is to get the big picture
08:29 pmauduit so, leaving wildfly aside a little bit for now
08:52 poikilotherm Hi pmauduit :-)
08:53 poikilotherm I read in the logs that you are trying things with Docker
08:54 poikilotherm Maybe https://github.com/IQSS/dataverse/issues/5373 and https://github.com/IQSS/dataverse/issues/5292 are interesting for you :-)
08:55 poikilotherm If you would like to join forces, you're welcome... ;-)
09:10 pmauduit sure ;-)
09:11 pmauduit I shall confess that I'm very new to J2EE / wildfly / glassfish
09:12 pmauduit but interesting to see how the whole stack has been built
09:32 poikilotherm Experience with docker?
09:33 poikilotherm I am targeting dev first, then prod is easy... ;-)
09:36 pmauduit yes I do
09:41 pmauduit almost ... :)
09:41 pmauduit dataverse_1  | Oops, haven't been able to connect to the database dvndb,
09:41 pmauduit dataverse_1  | running on dataverse-postgresql-0.dataverse-postgresql-service, as user dvnapp.
09:41 pmauduit trying to boot my compo using iqss/glassfish-dataverse docker image instead of tweaking my own one
09:49 poikilotherm Beware, dataverse-glassfish image is intended to be used on OpenShift/Kubernetes
09:49 poikilotherm It uses init-containers
09:50 poikilotherm I am trying to get the pieces together in 5292 to have simpler and easier containers, which are usable with docker-compose, too
09:50 pmauduit ok
09:51 poikilotherm Any feedback from the community helps :-) Currently, all production instances run on the "classic installation mode", so receiving feedback that change is needed is very valuable to raise attention for this at IQSS ;-)
09:51 pmauduit payara server is another J2EE server ?
09:51 poikilotherm Yes
09:52 poikilotherm Payara is more or less a fork of Glassfish
09:52 poikilotherm And the codebase is currently very Glassfish specific
09:52 poikilotherm It will not be easy to change this to Wildfly
09:52 pmauduit I thought that wildfly was also a fork of glassfish
09:53 poikilotherm And as the overall goal is to keep both installation modes usable as long as the majority of installations is not based on containers, it is very likely to stay this way.
09:53 poikilotherm Nope
09:54 poikilotherm Wildfly is coming from JBoss, which is an independent implementation by RedHat
09:54 pmauduit ok
09:54 poikilotherm Glassfish has been owned by Oracle and is the Java EE app server _reference_ implementation
09:54 poikilotherm It is nowadays property of Eclipse Foundation
09:55 poikilotherm (Just as Java EE is moving to Eclipse as Jakarta EE)
09:55 pmauduit when I looked at glassfish docker images, it seemed a bit messy: I could find some blog posts from an official oracle blog, mentioning a docker-hub repo which now requires authentication
09:55 poikilotherm Messy is a very polite word for the current situation.
09:55 pmauduit :)
09:56 poikilotherm This mess is due to the transition from Oracle to Eclipse.
09:56 poikilotherm Will take time to settle
09:56 pmauduit are a lot of people in the community running their dataverse instance on openshift, by the way ?
09:56 poikilotherm You could try to use Payara 4 images. But this is not officially tested or supported
09:57 poikilotherm Nope
09:57 poikilotherm At least - not yet.
09:57 poikilotherm There is interest, but none of the officially known 35 production installations run on Kubernetes/OpenShift
09:57 pmauduit I've seen the issues about it, and given the links between redhat, openshift and wildfly, I wanted to have a look at this J2EE server first
09:58 poikilotherm RedHat (or IBM nowadays) is active in a large number of places
09:58 poikilotherm IIRC some RedHat engineers had an internship at IQSS to make Dataverse Openshift ready, and conf/docker is the result from that.
09:59 poikilotherm pdurbin knows more about this
09:59 poikilotherm He pointed me to some issues about this some time ago, let me see if I can find them
10:00 pmauduit I also see that you're working in your fork with a docker maven plugin (from fabric8.io). We do use a similar plugin (the old one from spotify) on other projects
10:00 poikilotherm https://github.com/IQSS/dataverse/issues/4040
10:01 pmauduit yes I stumbled upon this one yesterday
10:03 poikilotherm Yeah, I want to build the docker images as a part of the dev process
10:03 poikilotherm This might lead someday into using CD
10:04 poikilotherm But that might be some AU away.
10:11 poikilotherm Oh BTW pmauduit: I am using Payara 5 for this. There is some ongoing discussion about using Payara or Eclipse Glassfish and the current setup guide and images use Glassfish 4, which is really EOL and patched manually. You might try with Payara 4, which is at least more maintained (but EOL IIRC as of 01/2019 if you have no subscription), but there seemed to be some issues. (when looking at the GH issues, just search for Payara... ;-)
10:16 pmauduit yes, I saw that you were taking some images on your account @ quay.io here: https://github.com/poikilotherm/dataverse/blob/5292-small-container/conf/docker/app/Dockerfile#L1
10:17 poikilotherm :-)
10:17 poikilotherm Those have their own issues... :-/
10:17 poikilotherm https://github.com/payara/Payara/issues/3506
10:18 poikilotherm Will be fixed in Payara 5.191
10:18 poikilotherm (v5, Q1 2019)
11:23 poikilotherm Hey pmauduit, are you https://github.com/pmauduit ?
12:15 poikilotherm Morning pdurbin :-)
12:16 pmauduit yes i am
12:29 poikilotherm pmauduit: Shall I add you to the list of people in #5373?
12:30 pmauduit sure
12:30 pdurbin good morning, glad to see you two chatting :)
12:32 pmauduit yes, I'm pretty sure I read this here earlier
12:32 pmauduit grmbl sorry for the last message
12:32 pmauduit in fact I was backlogging ... :-P
12:45 pdurbin It's not quite that Red Hat interns were at IQSS. They were remote. I had a weekly call with them. I went over to the Red Hat office in Boston once to meet with them. And I went to their final presentation with their professors.
12:46 poikilotherm pdurbin: ok, thanks for the clarification. My memories where wrong ;-)
12:46 poikilotherm -h
12:48 pdurbin It would have been great to have them at standup every day.
12:49 pdurbin They did the best they could with a very challenging real world situation.
13:02 pmauduit poikilotherm: I was giving a try of the image generated from your branch (mvn clean package docker:build -Pcontainer), is there a way to configure the database access properties ?
13:06 pmauduit hmm I can override the init_2_configure.sh
13:07 poikilotherm For now you need to hack on it... I was planning for ENV variable usage, but for my tests, this wasn't needed so far.
13:07 poikilotherm Please keep in mind: this is a WIP!
13:08 poikilotherm It will NOT work as expected
13:09 poikilotherm Currently this is blocked by #5361 + #5344 and things like #5345, etc (pretty much every substory of #5292...)
13:09 pdurbin Are you *sure* you don't want your own git repo? :)
13:10 poikilotherm That would be a fork
13:10 poikilotherm I already have one ;-)
13:11 poikilotherm https://github.com/poikilotherm/dataverse/
13:11 pdurbin How can we unblock one of these substories?
13:12 poikilotherm Well #5345 is about the timers
13:12 poikilotherm That's in code review now thx to you
13:12 pdurbin yeah
13:12 pdurbin can we get any other substory moving?
13:12 poikilotherm 5344 needs work. I am looking into this, but need to look at Payara currently
13:13 pdurbin ok, so you are not blocked. you're working on Payara stuff
13:13 poikilotherm Yeah, but that is just a small side project being a base for my images... ;-)
13:13 poikilotherm Eager to be on 1.8u191 because of the container backports from 10+
13:14 pdurbin oh, it's small ok. what's next? what's after that?
13:15 pdurbin it's small, ok
13:16 poikilotherm https://github.com/payara/docker-payaraserver-full/issues/70
13:16 poikilotherm Just for the sake of completion
13:17 poikilotherm I think I should look into #5344
13:17 poikilotherm Made some housekeeping in #5373 earlier
13:17 poikilotherm And I need to fill in the gaps of my workplan for Danny ;-)
13:18 pdurbin I'd love to see your workplan.
13:20 poikilotherm Well I hope Danny will discuss this with you guys and improve it :-)
13:22 pdurbin me too
13:22 pdurbin this is my work plan: https://trello.com/b/uggFGv2H/work-http-wwwiqharvardedu-people-philip-durbin
13:23 poikilotherm Oh while seeing Netbeans 10 on it: did you notice that Java EE support was removed?
13:39 pdurbin Yeah. I'm confused about Netbeans 10. I don't think I can use it. I'm still on Netbeans 8.2. :/
13:40 pdurbin Did you see a blog post or something that explains the removal?
13:42 poikilotherm https://dzone.com/articles/notes-on-java-eejakarta-ee-support-for-netbeans-9
13:43 poikilotherm https://lists.apache.org/thread.html/6ba78ed2f1f761214db4f31022d953fc635dfe3fd6aa7933a728f969@%3Cdev.netbeans.apache.org%3E
13:52 pdurbin Thanks! Wow. Complicated. I think I'll stick with Netbeans 8.2 until Java EE (Jakarta EE) support "just works". I wonder if we should add a note to the dev guide about this.
13:54 poikilotherm You could use other editors... And there are people out there that installed the 8.2 plugin in 9.
13:54 poikilotherm Maybe this works for 10, too
13:56 poikilotherm You could use IDEA... JetBrains offers free licenses to OSS projects. And Dataverse might be eligible for those licenses.
13:56 poikilotherm Ah damn... No chance
13:56 poikilotherm Your OS project may not offer paid sponsorship, or receive funding from commercial companies or organizations (NGO, education, research, or governmental).
13:58 poikilotherm But actually - what is Netbeans doing for you as a Java EE dev? Writing the code should still be possible. Running it from inside the IDE is nice, but for many stuff you need other approaches when external services are needed
14:00 pdurbin Matthew got Java EE stuff working in Netbeans 9. Sounds like a lot of work from what I'm seeing on that blog post. I like NetBeans. It's hard to list what it's doing for me. I like this plugin, for example: http://guides.dataverse.org/en/4.10/developers/tips.html#netbeans-connector-chrome-extension
14:05 poikilotherm Ah Spring Boot has support for LiveReload baked in out of the box...
14:05 poikilotherm I would really which for this in Java EE
14:05 poikilotherm wish
14:31 poikilotherm Maybe http://hotswapagent.org is there for the rescue
14:31 poikilotherm (JRebel is quite expensive...)
14:32 donsizemore joined #dataverse
14:41 pdurbin Have you tried the Netbeans Chrome extension? It gives you hot reloading.
14:42 poikilotherm Not using Netbeans nor Chrome... ;-)
14:42 poikilotherm Used Spring Boot with LiveReload and Browser Extensions before... ;-)
14:42 poikilotherm Independent from IDE
14:43 pdurbin Ok. I'm just saying Java EE already has this.
14:43 pmauduit using spring framework in a J2EE context is not possible ?
14:43 poikilotherm Nope
14:44 pdurbin Why not?
14:44 poikilotherm Different technology
14:44 poikilotherm There are some overlaps like CDI
14:44 pmauduit i thought that J2EE was a kind of superset above regular servlet webapps
14:44 poikilotherm Err... Short answer: kind of.
14:45 pdurbin I was just listening to airhacks 22 and they said it's possible to combine Java EE and Spring: http://airhacks.fm
14:45 poikilotherm Spring only works on the smaller set ;-)
14:45 pmauduit ok :)
14:45 poikilotherm Oh that's news to me :-
14:45 pdurbin me too
14:46 * poikilotherm is listening to https://s3.eu-central-1.amazonaws.com/airhacks.fm/airhacksfm_22.mp3
14:49 pdurbin poikilotherm: you can fast forward to 1:00:34
14:49 poikilotherm Well you could of course incorporate the Java EE JARs in Spring. But this seems like a really hacky way...
14:49 poikilotherm Thx
14:50 poikilotherm https://blog.eisele.net/2016/04/integration-architecture-with-java-ee-and-spring.html
14:50 poikilotherm Like I said - incorporate EE in the Fat JAR and go use it.
14:50 pdurbin In the podcast they talk about the opposite. Start with Java EE standards. Add Spring as needed.
14:52 poikilotherm Ok... So I'll update my answer above: you can combine it, but it depends on what to achieve. Using Spring Boot with Java EE seems no good idea nowadays with Microprofile and mini app servers and other UberJAR approaches around.
14:53 pdurbin sure, I agree with that
14:54 pdurbin poikilotherm: did you already send your work plan to Danny? I have my one on one with him in 5 minutes.
14:54 poikilotherm Nope
14:54 pdurbin ok
14:54 pdurbin are you coming?
14:54 poikilotherm I'll hurry :-D
14:54 poikilotherm Depends on the work plan :-D
14:54 pdurbin ok :)
14:54 poikilotherm Need that first to show my people if it makes sense
14:55 pdurbin we don't want to waste your time
14:56 poikilotherm Hmm my edit rights from the agenda are gone. Just requested them again, but maybe you can give Danny a hint
14:56 poikilotherm ,too
15:00 poikilotherm Ok, he already fixed it :-D
15:14 Jim79 joined #dataverse
15:15 Jim79 Anyone here looking at the publishing issue (in emails today)?
15:18 pameyer joined #dataverse
15:21 pameyer Jim79 : as of last night, folks were definitely looking at it
15:24 Jim79 me too - on a branch with minor qdr updates from develop, I see one issue is a call to DataCite to check a null identifier which returns 'true' for the identifier existing, leading to a loop with no log message until things timeout...
15:25 pameyer "null identifier"?
15:26 pameyer is that DataCite returning that a non-existent identifier exists?
15:26 Jim79 the alreadyexists() method gets called for a null/zero length string for some file(s)...
15:27 Jim79 Dataverse sends .../doi/<nothing> to DataCite which returns a 200 response as though nothing were some 10.5072/FK2... value and it was found...
15:28 pameyer that's good info that I don't think has been in the debugging discussions before
15:28 Jim79 I don't know why we're sending it, or if the change is something like DataCite switching from sending an error when there's no doi in the URL to sending a 200 with a list
15:28 pameyer would lead me to believe that this would only be affecting installations w/ file DOIs enabled
15:28 Jim79 could be
15:29 pameyer I don't know why it would be sent either
15:29 Jim79 I'll keep digging but if others catch the root cause...
15:30 pameyer don't remember what the MDS docs specify for error codes though
15:30 pameyer have you seen the problem being intermittent?
15:32 Jim79 not sure about intermittent
15:33 donsizemore @pameyer Jon and I share your suspicion. UNC disabled file DOIs and we aren't aware of any problems
15:33 Jim79 Back in december there were definitely days where mds was not working - giving 502 timeouts, etc. which could have overlapped with this issue
15:35 pameyer @donsizemore good to hear you're *not* having problems
15:35 pameyer @Jim79 do you remember if those were outside the outage window datacite had in december?
15:37 Jim79 outside their planned one - I saw issues into the next week
15:38 Jim79 definitely intermittent for those - I'd see one or two files get IDs and then the rest fail.
15:38 pameyer consistent with there being multiple problems at the same time
15:39 pameyer do you know if qdr is using dependent identifiers for file dois?
15:39 pameyer wondering about a failure mode where the dataset doi fails/glitches and tries to generate a dependent file doi, but fails because there's nothing to depend on
15:42 Jim79 we do use dependent dois. But, since the dataset DOI is created early, I don't think it not existing is the answer for us.
15:42 pameyer great - rules that out.
15:43 pdurbin Jim79: thanks for looking into this. Is either production installation you work on affected?
15:43 Jim79 too early to rule anything out
15:43 Jim79 I saw some issues on our v4.9.4 test machine - not the same since I see some log messages there (versus the infinite loop discussed above)
15:44 Jim79 I don't know that anyone has published on prod since we saw the DataCite maintenance/follow on trouble
15:45 pameyer I was talking to landrev last night, and some of the things he mentioned seeing sounded like there might be database locking issues
15:46 Jim79 look at the 4.9.4 log, I was still seeing some 502 errors from DataCite on 1-3-2019...
15:47 pdurbin Ok. Until this morning I thought only Harvard Dataverse was having trouble publishing in production. Now we know Thanh Thanh is having trouble in production too: https://groups.google.com/d/msg/dataverse-community/WJ6sTgKfNI4/qpzd0KIeDwAJ
15:48 pameyer https://status.datacite.org/ doesn't think they had downtime 1-3-2019
15:49 pameyer not disagreeing with the logs; just additional info.
15:49 Jim79 happily sending 502 responses :-)
15:50 Jim79 yeah - I'm confused still - perhaps it sends 502 incorrectly if it gets bad input or ...
15:52 pameyer I don't have a great idea where the failure(s) are either...
15:54 pdurbin My take on the "Update on DataCite Service Status" email from December 19th is that DOI registration should have been unaffected by DataCite's recent upgrade/outage: https://listserv.ucop.edu/cgi-bin/wa.exe?A3=ind1812C&L=EZID-L&E=base64&P=5644&B=--_000_CY4PR06MB352549BCCE4DFD3E3505095B90BE0CY4PR06MB3525namp_&T=text%2Fhtml;%20charset=utf-8&XSS=3&header=1
15:55 pdurbin "Some users reported issues with DOI registration. These are unrelated to the Solr outage. Please reach out to DataCite support via support@datacite.org so that we can resolve these issues."
15:55 pmauduit oh another question (more out of curiosity): since we do JPA in dataverse, are there any strong adherence to postgresql or we could use whatever existing DBMS ? (postgres is definitely fine for me, it's just curiosity)
15:56 pameyer @pmauduit I don't know the details, but I think there's some code that assumes postgres
15:56 pmauduit ok
15:58 pameyer @Jim79 in your troubleshooting, have you tried vacuuming any of the tables?
16:01 jri joined #dataverse
16:01 Jim79 no
16:02 pameyer thanks
16:03 pdurbin pmauduit: I'm pretty sure the sequence at http://guides.dataverse.org/en/4.10/installation/config.html#identifiergenerationstyle is postgres-specific and I believe other parts of the code assume postgres.
16:05 pdurbin every now and then we'll try to remove postgres-specific stuff (e.g. https://github.com/IQSS/dataverse/commit/b4bf254 ) but right now we require postgres
16:07 Jim79 FWIW - a hack to just not call DataCite to check if a null/empty DOI exists allows me to publish on a ~develop branch. Looking into what changed...
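The hack Jim79 describes, skipping the remote existence check when there is no identifier to check, might look roughly like the sketch below. It is not the actual Dataverse code: `GuardedExistenceCheck` and the `remoteLookup` parameter are hypothetical names, and the remote call is passed in as a plain predicate so the guard can be shown on its own.

```java
import java.util.function.Predicate;

// Hypothetical helper, not part of Dataverse: wraps a remote
// "does this identifier exist?" lookup with a guard for missing values.
final class GuardedExistenceCheck {

    private GuardedExistenceCheck() {
    }

    /**
     * Returns false without contacting the registry when no identifier
     * has been assigned yet; otherwise delegates to the supplied lookup.
     */
    static boolean alreadyExists(String identifier, Predicate<String> remoteLookup) {
        if (identifier == null || identifier.trim().isEmpty()) {
            // Nothing to look up, so the registry is never asked.
            return false;
        }
        return remoteLookup.test(identifier);
    }
}
```

With a lookup that answers "yes" to everything (as the registry did for the empty request), `GuardedExistenceCheck.alreadyExists(null, id -> true)` still returns false, while a real identifier is passed through to the lookup unchanged.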
16:09 pdurbin Jim79: thanks. I'm confused. Did the Dataverse code change? Are older versions of Dataverse (where file PIDs are supported)? Or did something change on DataCite's end after their upgrade/outage?
16:09 Jim79 Looks like that issue only affects files w/o a DOI, i.e. republishing a Dataset with updated metadata but no new files doesn't cause the trouble.
16:09 pdurbin whoops, I meant to say "are older versions of Dataverse affected?"
16:10 Jim79 Not sure - I think there is still some problem with v4.9.4 - not sure if it is the same or not
16:10 pdurbin ok
16:10 pdurbin I'm just wondering what changed: Dataverse or DataCite
16:10 pmauduit ok thanks for the info
16:10 pameyer @pdurbin: I'm leaning towards both
16:11 pdurbin heh
16:11 pameyer well, Jim79 seeing 502 from datacite when they don't show downtime suggests pretty strongly something going on there
16:12 pameyer but the problem seeming to be limited to installations using file dois suggests something in the dataverse code too
16:12 pdurbin As a workaround, what about setting :FilePIDsEnabled to false for a while?
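For the workaround pdurbin suggests, database settings such as :FilePIDsEnabled are normally changed through the admin settings API. The sketch below only builds the request and does not send it. It assumes an installation listening on http://localhost:8080 with the admin API reachable from there, and `FilePidsSetting` is a hypothetical class name. It uses java.net.http, so it needs Java 11 or later.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) the request that
// would set the :FilePIDsEnabled database setting to false.
final class FilePidsSetting {

    private FilePidsSetting() {
    }

    static HttpRequest disableRequest(String baseUrl) {
        // The setting name, including its leading colon, is the last path segment.
        URI uri = URI.create(baseUrl + "/api/admin/settings/:FilePIDsEnabled");
        return HttpRequest.newBuilder(uri)
                .PUT(HttpRequest.BodyPublishers.ofString("false"))
                .build();
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. Setting the value back to true would re-enable file PIDs once a fixed version is deployed.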
16:12 Jim79 not sure - if DataCite changed what it sent in response to a call to find no DOI (i.e. it now sends a list) and v.4.9.4 and 4.10 both sent that but DV changed to keep retrying, it could explain things. Definitely can't say that's true yet
16:13 pdurbin Jim79: do you feel confident enough in the symptoms to open an issue?
16:13 Jim79 sounds like it might work...
16:18 Jim79 um - at this point we could have an issue that there's a publication problem, perhaps isolated to the case where file DOIs need to be assigned (file DOIs on and new files that need DOIs assigned)...
16:29 Jim79 it looks like https://github.com/IQSS/dataverse/blame/develop/src/main/java/edu/harvard/iq/dataverse/DataFileServiceBean.java#L1570, which hasn't been changed in a year, would have always been sending a null identifier for new files into the alreadyExists methods
16:35 Jim79 which seems like a DV bug (not really checking the new identifier) that has been exposed by a change at DataCite?
16:47 Jim79 OK - I think this is clear enough I'll create an issue and propose a fix (in a PR) to refactor the GlobalIdService to check alreadyExists(GlobalId) rather than alreadyExists(DVObject) so one can cleanly test new identifiers
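The refactoring proposed here can be pictured as follows. This is only a sketch of the idea under assumed names (`PidSketch`, `PidRegistrySketch` and `InMemoryRegistry` are all hypothetical and are not the classes in the eventual PR): the existence check takes the identifier itself, so a freshly generated candidate can be tested before it is attached to any file or dataset.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical value type: a persistent identifier split into its parts.
final class PidSketch {
    final String protocol;   // e.g. "doi"
    final String authority;  // e.g. "10.5072"
    final String identifier; // e.g. "FK2/ABC123"

    PidSketch(String protocol, String authority, String identifier) {
        this.protocol = protocol;
        this.authority = authority;
        this.identifier = identifier;
    }

    String asString() {
        return protocol + ":" + authority + "/" + identifier;
    }
}

// Hypothetical service interface: the check is keyed on the identifier,
// not on the object that may or may not carry one yet.
interface PidRegistrySketch {
    boolean alreadyExists(PidSketch candidate);
}

// In-memory stand-in for a remote registry, for illustration only.
final class InMemoryRegistry implements PidRegistrySketch {
    private final Set<String> registered = new HashSet<>();

    void register(PidSketch pid) {
        registered.add(pid.asString());
    }

    @Override
    public boolean alreadyExists(PidSketch candidate) {
        if (candidate == null) {
            // A missing identifier is a caller error, not a lookup.
            throw new IllegalArgumentException("candidate identifier is required");
        }
        return registered.contains(candidate.asString());
    }
}
```

Because the candidate is passed explicitly, there is no code path in which the check silently runs against an object whose identifier is still unset.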
16:48 Jim79 If that's going to duplicate other work, let me know...
17:01 Jim79 https://github.com/IQSS/dataverse/issues/5427
17:03 pameyer Jim79 that looks plausible to me
17:03 pdurbin Jim79: thanks! I just walked down the hall to talk to Leonid and it sounds like he's going to pop in here soon to coordinate with you.
17:03 Jim79 Thanks...
17:09 pdurbin Jim79: https://github.com/IQSS/dataverse/issues/4602 is unrelated, right?
17:11 Jim79 related I think - null is sent during the check of newly minted IDs when it shouldn't
17:12 Jim79 looks like handles also have been able to ignore the first check (DV gets to assume that the id is unique) as DataCite was...
17:13 Jim79 the fact that there are two times it's being checked is a ~separable issue, but I think really checking the new id when you mint/first assign should be fixed/is the one to keep.
17:14 pdurbin Ok. I left a comment indicating that it might be related. Thanks.
17:28 leonid joined #dataverse
17:29 leonid Hi everybody;
17:31 * pdurbin waves at leonid
17:31 leonid I spent some (very frustrating) time yesterday investigating this stuff too; I also realized we were doing something improper with looking up file level DOIs, and possibly sending nulls.
17:31 leonid specifically, our production logs had a bunch of exception stack traces like this:
17:32 leonid java.lang.RuntimeException: IOException when get metadata at edu.harvard.iq.dataverse.DataCiteRESTfullClient.testDOIExists(DataCiteRESTfullClient.java:162) at edu.harvard.iq.dataverse.DOIDataCiteRegisterService.testDOIExists(DOIDataCiteRegisterService.java:226)
17:33 leonid the ioexception is from a fairly straightforward point in the code where it makes a call to https://mds.datacite.org/metadata/... for the id in question
17:33 Jim79 yeah - I think the exception occurs when DataCite gives a 502 response
17:34 leonid (and it works on the command line just fine; both for existing ids, and for ones not yet registered - with a 404).
17:35 Jim79 (and guessing that either getting lots of pings from the ~infinite loop, or just responding on 16M DOIs instead of checking for 1 causes a 502)
17:35 leonid also, we are running 4.9.4 still - so that's before Jim's latest changes to DataCiteRESTfullClient.java, etc. were incorporated
17:36 Jim79 yeah - the code sending a null hasn't changed for 6/12 months, so there has to have been something at DataCite recently
17:36 leonid so for a moment i assumed it was simply a matter of accumulating, not-properly-closed http client connections that were causing the client to start barfing (a false lead, as i know by now)
17:37 leonid what further frustrated me was when i realized that there was no 1:1 correlation between these stack traces and the actual inability to publish a dataset
17:37 Jim79 ?? - some where it just hung?
17:38 Jim79 or exceptions when it still succeeded?
17:38 leonid i had a list of datasets that users specifically complained about not being able to publish; and when trying to publish those, there was NOTHING in the logs, no stack traces, and the process just hung...
17:39 leonid at that point i was too tired and i left, planning to actually send it through a debugger in the morning
17:39 leonid and this is what you must have done - and that's how you discovered the infinite loop, right?
17:39 Jim79 yeah - when datacite kept up and didn't 502, the loop just keeps going
17:39 Jim79 yeah - I added some log messages
17:40 pdurbin that's been my experience... with a file in the dataset, publishing just hangs with nothing in server.log. this is on dev boxes close to develop configured with datacite
17:40 leonid so yeah, this part is still an unknown - it looks like there's an exception in some cases, and an infinite loop in others...
17:40 Jim79 I think it makes sense - it's just whether DataCite keeps up with the barrage or not
17:41 madunlap joined #dataverse
17:41 pameyer I'm not sure if this explains the deploy failures or not
17:41 leonid oh, so it's not an infinite loop literally - it really is a timeout on the dataset part?
17:41 Jim79 In any case - I'm doing a quick test on my branch and will then push a PR on develop
17:42 leonid this would make some sense; when you send their /metadata api a null it looks like it tries to look up EVERYTHING in their database, lol.
17:42 leonid ok, great, thanks a lot!
17:42 Jim79 well, DV keeps generating new IDs since DataCite keeps saying they exist (by now responding with a 200 to /metadata/null) unless/until it 502s.
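The loop Jim79 describes has no exit when the existence check answers "yes" to everything. One generic way to make that failure visible is to cap the number of attempts, as sketched below. This is an illustration, not the fix that was submitted: `BoundedPidGenerator`, its parameters, and the attempt limit are hypothetical.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical helper: keeps drawing candidate identifiers until one is
// unused, but gives up loudly instead of looping forever.
final class BoundedPidGenerator {

    private BoundedPidGenerator() {
    }

    static String generateUnique(Supplier<String> candidateSource,
                                 Predicate<String> alreadyExists,
                                 int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            String candidate = candidateSource.get();
            if (!alreadyExists.test(candidate)) {
                return candidate;
            }
        }
        // Every candidate was reported as taken; with random identifiers that
        // is far more likely to mean a broken check than real collisions.
        throw new IllegalStateException("No unused identifier after " + maxAttempts
                + " attempts; the existence check may be misbehaving");
    }
}
```

With a cap in place, a registry that reports every identifier as existing produces an exception with a message in the log rather than a publish that hangs silently.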
17:42 Jim79 I think something else times out/breaks after 5-10 minutes if DataCite keeps up and that stops the loop as well
17:43 leonid i basically came here to acknowledge that we've been researching this too and to thank you for the incoming pr
17:43 leonid oh, yes, of course... sorry to be slow.
17:43 Jim79 glad to help and appreciate being able to compare notes
17:44 leonid i didn't get the part about interpreting the 200 as "already exists" right away
17:44 Jim79 always nasty when two issues collide...
17:44 Jim79 we should thank DataCite because I think they had to have changed the response to 200 in the recent update
17:45 pameyer and let datacite know about the null query -> return everything behavior.  pretty sure that's something they'd want turned off
17:47 pdurbin pameyer: would we report it at https://github.com/datacite/poodle ?
17:47 Jim79 not sure what could/should be done about prior DV versions - this affects all file DOIs (so 4.9.x) and unless DataCite reverts, I think it requires a code update to fix (could do the suggestion to turn off file DOIs until upgrading to 4.10.x...)
17:49 pameyer pdurbin: that looks like the repo; but not sure if it should go there or to their support email
17:49 leonid pameyer: the deploy failure is an entirely different thing yes; (last night I wasn't able to deploy on one of the 2 production nodes, because the app on the first node kept some kind of a lock on the dvobject table in the db??? - one of the guesses is that it was possibly a result of an attempt to publish gone wrong on the first node... but who knows)
17:49 leonid Yes, I agree, that it is prudent to advise everybody to disable file level DOIs until the next version.
17:50 leonid (this is what i have just done in our production)
17:50 Jim79 - I had some deploy failures (different mechanism here) and suspected that the infinite loop was still going...
17:50 pameyer @leonid ok.  This fix makes sense to me; but there may be other problems lurking
17:52 leonid for the record, i never liked the idea of file level dois in the first place. not as the default behavior, definitely. (and when it was first added, there was no way to even opt out of it at all).
17:54 leonid @pameyer the "fix" - you mean, the workaround of disabling the registration for file-level DOIs? that would be a temporary thing; and we would provide a script for re-registering them later, for those who want them.
17:54 leonid or are you talking about @Jim79's fix?
17:54 donsizemore joined #dataverse
17:54 pameyer @leonid - I was talking about @Jim79's fix (not sending null to datacite)
17:56 leonid well, we shouldn't be sending that null to datacite... so it is a good fix. :)
17:56 pameyer and thinking about it some more, it does seem plausible to me that it was the cause of the deploy problems
17:56 leonid and to be clear, we are sending a string "null"; not a real null or a zero byte, or anything like that.
17:57 pameyer infinite loop of alter/update/insert/select in a different transaction from a transaction trying to do create table / create index
17:57 leonid one would think they should just return a 404; (and that's presumably what they were doing before)
17:57 pameyer ah - I'd been thinking that sending "null" resulted in "metadata on all datacite dois", which would probably be suboptimal from their perspective
17:58 leonid yes, sure, there are ways to be more careful there. we'll open a dev ticket for further improvements and take it from there
17:58 leonid we are literally sending this: https://mds.datacite.org/metadata/null
17:59 leonid and the response to it is now identical to https://mds.datacite.org/metadata/xxx
17:59 leonid and/or to just https://mds.datacite.org/metadata/
18:01 leonid ... and, if nothing else, we can do more than just check if the return code is 200. we can (should) probably read the actual response and validate it...
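What leonid suggests could be sketched like this: treat a lookup as a hit only when the status is 200 and the body actually mentions the identifier that was asked for. The body check is an assumption made for illustration (a metadata record for a DOI is expected to contain that DOI somewhere); `ExistenceResponseCheck` is a hypothetical name and the sketch does not perform the HTTP call itself.

```java
import java.util.Locale;

// Hypothetical helper: decides whether a registry response really
// confirms that the requested DOI exists.
final class ExistenceResponseCheck {

    private ExistenceResponseCheck() {
    }

    static boolean confirmsExistence(int status, String body, String requestedDoi) {
        if (status != 200 || body == null
                || requestedDoi == null || requestedDoi.isEmpty()) {
            return false;
        }
        // DOIs are compared case-insensitively, so normalise both sides first.
        String haystack = body.toLowerCase(Locale.ROOT);
        String needle = requestedDoi.toLowerCase(Locale.ROOT);
        return haystack.contains(needle);
    }
}
```

Under this check a 200 response carrying an unrelated listing would no longer count as "already exists", because the requested identifier does not appear in it.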
18:04 pdurbin Well, we'll use the incoming pull request from Jim79 as a starting point, right? And decide how much technical debt we want to pay down?
18:07 leonid I feel we should accept Jim's pr as is - since it does fix a pretty serious issue. then open another issue for further improvements/"technical debt"; and make sure it doesn't fall through the cracks.
18:17 Jim79 OK - https://github.com/IQSS/dataverse/pull/5428 exists. It's editable by maintainers, so feel free to change as needed...
18:19 Jim79 (fyi - I haven't tested on develop directly, just the modified develop at QDR...)
18:22 Jim15 joined #dataverse
18:23 Jim97 joined #dataverse
18:24 Jim97 I agree that this PR doesn't really fix things in that, while it now sends the identifier in question, the fact that DataCite will give a 200 response for anything means duplicates won't be detected...
18:27 Jim97 just noticed that, despite the application/xml header I send in curl, DataCite responds with json when it can't find the id. It's only xml when the id is found...
18:28 leonid wait, i don't think DataCite really gives 200 in response to anything... if I give it something like 10.7910/DVN/MADEUPNONEXISTINGID i get back an honest 404
18:28 leonid it's only when i give it a string without slashes (like "null" or "xxx") that I get that "everything" response
18:29 leonid yeah, I noticed that too - that it's all json, despite the "accept: xml" header
18:31 leonid so as long as we are not sending it total junk, but actual ids expected in your dataverse installation - with a sensible name space, etc. - i think it should be working as it should; properly detecting duplicates.
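Following leonid's observation that only slash-less strings such as "null" or "xxx" trigger the "everything" response, a cheap shape check before the lookup would keep such values from being sent at all. The pattern below is an assumption made for this sketch (a "10." prefix with 4 to 9 digits, a slash, and a non-empty suffix without whitespace), not the rule Dataverse or DataCite applies, and `DoiShape` is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical helper: rejects strings that cannot be a DOI before
// they are ever used in a registry lookup.
final class DoiShape {

    // Assumed shape: 10.<4-9 digits>/<non-empty suffix without whitespace>
    private static final Pattern DOI = Pattern.compile("^10\\.\\d{4,9}/\\S+$");

    private DoiShape() {
    }

    static boolean looksLikeDoi(String value) {
        return value != null && DOI.matcher(value).matches();
    }
}
```

`DoiShape.looksLikeDoi("null")` and `DoiShape.looksLikeDoi("xxx")` are rejected by this pattern, while an identifier in the installation's namespace such as 10.7910/DVN/MADEUPNONEXISTINGID passes and can be looked up normally.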
18:33 Jim97 Ahh - good.
18:35 pdurbin Jim97: did you see the compilation failure with FakePidProviderServiceBean?
18:36 Jim97 Nope... one sec...
18:39 Jim97 try that - should the new method also return true like the alreadyExists(dvObject) call?
18:43 pdurbin Jim97: yep. Thanks! Now it compiles. Green instead of red at https://travis-ci.org/IQSS/dataverse/pull_requests
18:44 Jim97 my qdr dev branch is just older than when you added the Fake Pid provider...
18:44 pdurbin yeah, that thing is brand new
18:48 Jim92 joined #dataverse
18:48 Jim_ joined #dataverse
18:49 Jim_ How do I reference leonid in github - @<what>?
18:50 leonid @pdurbin do you know/remember where the DataCite code came from? Gustavo mentioned it was written externally, then passed to us... Most lines in there are checked in by steve, but the classes list @luopc as the author... who is that?
18:50 leonid @landreev
18:51 Jim_ thanks
18:51 isullivan joined #dataverse
18:59 pdurbin leonid: Pengcheng Luo from Peking. Did you meet him at last year's community meeting? I had lunch with him. He gave a talk called "Support University Students’ Data Driven Research in a National Contest with PKU Open Research Data Platform": https://dataversecommunitymeeting20.sched.com/speaker/pengchengluo
18:59 leonid ok, i know who he is, yes. did not immediately connect the github name to the person - thanks.
19:00 pdurbin really nice guy
19:00 leonid yep
19:05 Jim_ Should we note something for the community list or wait until the fix is available?
19:07 pdurbin Good question. It's a free country. Please feel free to send something. :)
19:08 Jim_ (responding to earlier discussion) - the /api/admin/registerDataFileAll method should be all that's needed to catch up if people turn off file pids for now, so the workaround shouldn't cause much extra work.
19:10 Jim_ Yeah, but one authoritative note that there's a 4.10.x fix and a 4.9.x workaround might be better (and I'm not authoritative...)
19:11 pdurbin am I? :)
19:12 Jim_ to me you are :-)
19:12 * pdurbin blushes
19:13 jri joined #dataverse
19:14 pdurbin Danny's most recent post on the thread says "I'll update this as we uncover more information." So I guess he's planning on sending something.
19:14 Jim_ OK - let me send some text to Danny and let him be authoritative...
19:14 pdurbin sounds good
19:15 pdurbin I would probably just link to the IRC log, which isn't very useful. Lots of noise. :)
20:59 leonid fyi, I spoke to Danny about notifying other installations; I wrote a blurb and he's going to send it out.
20:59 pdurbin awesome
21:04 pdurbin I see Jim_ sent a nice note to the google group. Thanks!
21:06 Jim_ I emailed DataCite support as well - partly to note that this might have been causing 502s and suggesting they respect accept headers
21:09 pdurbin cool, please keep us posted on what they say
21:11 pdurbin I wonder if the 502 is from a timeout: https://serverfault.com/questions/475866/sporatic-502-errors-with-ruby-on-rails-site-running-on-nginx
21:18 Guest43965 definitely - with enough load, DataCite doesn't reply to their gateway in time. FWIW - this is what happens with Dataverse and large files - if it takes too long for Dataverse to store (and perhaps unzip/ingest), the web app gets a 502 from apache
21:18 pdurbin ok, interesting
21:21 pameyer I'm not too familiar with rails, but some other frameworks are configured to email folks whenever there's an exception in the app
21:21 pameyer hopefully wasn't the case here
21:30 leonid Thanks for contacting DataCite; I was going to ask if that was already done.
21:31 leonid i'm not sure if their ignoring of the "accept: xml" header is an issue of its own; or just a side effect of their defaulting to "show all" when a junk query is received...
21:32 leonid i feel like the latter is the main bug; and as long as they fix it, the other may be irrelevant... but who knows - definitely a good idea to pass everything we know to them.
21:32 pameyer that was on the metadata endpoint, right?
21:33 leonid correct.
21:34 leonid .../metadata/10.7910/DVN/ZZZZZZZ returns 404, as it should (assuming ZZZZZZZ does not exist)
21:36 leonid but, for some reason, .../metadata/ZZZZZZZ is now resulting in what seems like a [paginated, thank god] list of everything they have; always in json.
21:45 pameyer there's a lot of magic lurking there; but it does appear that the xml only gets rendered on a 200 status
21:45 pameyer but somewhat oddly, it looks to me like it should be doing text/plain as a default
21:46 pameyer appears to be downstream of the code in the poodle repo pdurbin mentioned
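Since the response format was observed to differ from the requested "accept" header (json when the id is not found, xml when it is), a client may want to choose its parser from the response's own Content-Type rather than from what it asked for. The following is a sketch under that assumption; the function name and the list of media types are hypothetical, not something confirmed by DataCite.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(content_type, body):
    """Pick a parser from the response Content-Type, not the request header.

    Returns a (kind, data) pair where kind is "json", "xml" or "text".
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in ("application/json", "application/vnd.api+json"):
        return "json", json.loads(body)
    if media_type in ("application/xml", "text/xml"):
        return "xml", ET.fromstring(body)
    # Unknown or missing type: hand back the raw text unchanged.
    return "text", body
```

A caller would pass in the Content-Type header and body of whatever response came back, then branch on the returned kind.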
