
IRC log for #dataverse, 2019-03-29

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time Nick Message
13:21 donsizemore joined #dataverse
13:24 donsizemore @pdurbin knock knock?
14:00 donsizemore @pdurbin well, when you're around https://dataversemetrics.odum.unc.edu/dataverse-metrics/ is up with everybody except data.inra.fr who is throwing a 503
14:00 donsizemore @pdurbin how often would you like numbers updated? (daily, or more often?)
14:14 pdurbin donsizemore: thanks but I'm getting "Firefox can’t establish a connection to the server at dataversemetrics.odum.unc.edu"
14:15 pdurbin daily is plenty often enough, thanks!
14:26 pdurbin I'm pretty sure the problem is that port 443 isn't open.
14:27 pdurbin curl: (7) Failed to connect to dataversemetrics.odum.unc.edu port 443: Connection refused
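A quick way to confirm the port-443 theory from any outside machine is a generic reachability check along these lines:

    # does anything answer on port 443 at all?
    nc -vz dataversemetrics.odum.unc.edu 443
    # if so, what status line does HTTPS return?
    curl -sSI https://dataversemetrics.odum.unc.edu/ | head -n 1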
14:34 pameyer joined #dataverse
14:37 donsizemore joined #dataverse
14:39 donsizemore @pdurbin whoops! https was open to odum only. try it now?
14:41 pdurbin donsizemore: looking good! Thanks! I'll go make the iframe url switch I mentioned over at https://github.com/IQSS/dataverse-metrics/issues/7#issuecomment-477753036
14:41 donsizemore @pdurbin i'm not worried about branding (odum is in the URL) but jon may want a logo
14:43 pdurbin as you wish
15:01 pdurbin done. looks and acts a little weird on mobile
15:26 sivoais joined #dataverse
15:29 donsizemore joined #dataverse
15:30 donsizemore @pdurbin data.inra.fr is back in the config file; they were just breaking the initial data collection
15:34 donsizemore @pdurbin speaking of, if a given dataverse is broken, will that leave the metrics page empty or does the cache remain in place until another run completes successfully?
15:34 pdurbin donsizemore: cool. Danny, Derek, and I talked about it after standup. The "after" (please see screenshots in the issue) is a little "squishy" so I'm going to muck around in OpenScholar and see if I can force it to be a single column instead of two columns.
15:36 donsizemore @pdurbin they look fine for me in firefox, are you using internet explorer?
15:36 pdurbin I'm not exactly sure what happens if a given installation is down. metrics.py is just a wrapper to a "download" and "aggregate" script. I wouldn't be surprised if the download script blows up if any installation is down, but I'm not sure. The aggregate script operates on JSON files that have already been downloaded.
15:36 donsizemore @pdurbin metrics.py dies and reports the 503
15:36 donsizemore @pdurbin or rather download.py
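If download.py is to survive a single broken installation, the fetch step needs to treat each installation independently, roughly like the sketch below (illustrative only, not the actual download.py logic; the metrics endpoint and output paths are assumptions):

    #!/bin/sh
    # pull one Metrics API endpoint per installation and skip any that error out
    for host in dataverse.harvard.edu data.inra.fr; do
      curl --fail --silent "https://$host/api/info/metrics/datasets" \
        -o "cache/$host-datasets.json" \
        || echo "skipping $host: request failed" >&2
    done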
15:36 pdurbin Nope. Firefox on desktop. Chrome on Android. Do you see one column or two?
15:37 donsizemore i was joking about IE ;) I see two columns and each chart is nearly square
15:38 donsizemore the daily cronjob is in place BTW, just let me know when you want me to update the code
15:38 pdurbin awesome on the cron job. can you please throw a screenshot in the issue?
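For reference, the daily refresh described here can be a single crontab entry; the paths below are placeholders, not the actual Odum setup:

    # refresh the aggregated metrics once a day at 02:15
    15 2 * * * cd /opt/dataverse-metrics && python metrics.py >> /var/log/dataverse-metrics.log 2>&1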
15:40 donsizemore @pdurbin i get the squishy in safari on ios, in desktop browsers everything looks great
15:41 pdurbin ok, thanks
15:54 donsizemore @pdurbin is there anything i can do to allow you to use us as an iframe?
15:55 pdurbin I don't know yet. I'm going to ask how whitelisting works. But thanks. :)
16:00 pdurbin Danny just emailed the powers that be to ask if unc.edu can be whitelisted.
16:02 pdurbin it doesn't like dataverse.org either :(
16:11 pdurbin ok, unc.edu has been whitelisted
16:11 pdurbin that was fast
16:15 bricas_ 'ello all.
16:15 pdurbin what's up bricas_ ... been a while :)
16:16 bricas_ indeed. i apologize as it seems i'm only ever here when i need something :)
16:17 bricas_ we're having an issue moving a DRAFT to published. we get the blue spinner on the screen but that's it.
16:18 bricas_ the version adds about 5G of data over 4 files -- i wonder if that's part of the problem
16:18 bricas_ unfortunately i don't see any errors
16:20 pdurbin no errors in server.log?
16:20 bricas_ let me try again and tail that file
16:21 pdurbin ok
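The file in question is Glassfish's server.log; on a stock Dataverse 4.x install it normally sits under the domain directory, though the path can vary:

    # follow the application log while retrying the publish
    tail -F /usr/local/glassfish4/glassfish/domains/domain1/logs/server.log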
16:24 bricas_ nuffin.
16:26 pdurbin What are you using for this setting? http://guides.dataverse.org/en/4.11/installation/config.html#filepidsenabled
16:28 bricas_ oof. that blew up my dataverse, i guess! Service Unavailable
16:28 bricas_ restarting glassfish....
16:29 pdurbin sadness T-T
16:30 pameyer joined #dataverse
16:31 pdurbin Is this your "MNKR4E" dataset?
16:34 bricas_ ya
16:34 pdurbin I'm afraid it's in a bad way.
16:35 bricas_ oh good.
16:35 bricas_ :P
16:35 pdurbin Those four files have been indexed into Solr but when I click I'm prompted to log in.
16:35 pdurbin So you might want to try fixing this by reindexing that dataset.
16:36 pdurbin like this: https://github.com/IQSS/dataverse/blob/2180ac1d847a9110cbbafd076b19761787f90bed/doc/sphinx-guides/source/admin/solr-search-index.rst#L78
16:36 pdurbin not that this will help with your publishing problem
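The command in that guide amounts to hitting the admin index API for just the one dataset, run on the server itself since the admin API is normally only reachable from localhost (check the linked guide for the exact path on your version); it takes the dataset's database id, which is what comes up next:

    # reindex a single dataset by its database id (found below)
    curl http://localhost:8080/api/admin/index/datasets/$DATABASE_ID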
16:38 bricas_ amazingly, curl is now broken on this machine.
16:38 bricas_ suffering from this since we upgraded to ubuntu 18.04: https://bugs.launchpad.net/ubuntu/+source/xmltooling/+bug/1776489
16:39 pdurbin hit it with elinks instead :)
16:39 pdurbin or wget
16:39 pdurbin I'm just impressed you got Dataverse to work on Ubuntu. :)
16:40 bricas_ where is that id from at the end of the datasets url?
16:41 pdurbin looks like it's 211 based on https://dataverse.lib.unb.ca/api/datasets/export?exporter=dataverse_json&persistentId=doi%3A10.25545/MNKR4E
16:42 pdurbin which is a link (JSON) from "Export Metadata"
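Another way to get that database id, without reading through the export, is the native API's JSON view of the dataset, which carries it near the top of the response as "id":

    # the "id" value in the response data is the database id the admin APIs expect
    curl "https://dataverse.lib.unb.ca/api/datasets/:persistentId/?persistentId=doi:10.25545/MNKR4E"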
16:43 pdurbin you have a nice footer. lots going on in there
16:43 bricas_ http://localhost:8080/api/admin/settings/:FilePIDsEnabled
16:43 bricas_ ^-- 404
16:44 bricas_ we'
16:44 pdurbin ok, so unset
16:44 bricas_ err
16:44 pdurbin probably this: {"status":"ERROR","message":"Setting :FilePIDsEnabled not found"}
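For anyone following along, the setting is read, set, and cleared through that same admin endpoint; the 404 above just means it has never been set, so the 4.9+ default of minting file PIDs applies:

    # read the current value ("not found" means unset, i.e. the default)
    curl http://localhost:8080/api/admin/settings/:FilePIDsEnabled
    # turn file PIDs off
    curl -X PUT -d false http://localhost:8080/api/admin/settings/:FilePIDsEnabled
    # remove the setting and fall back to the default
    curl -X DELETE http://localhost:8080/api/admin/settings/:FilePIDsEnabled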
16:45 bricas_ we're liking the new theming options -- we have a github that autodeploys changes to our header & footer when updated
16:45 bricas_ *github repo
16:45 pdurbin nice
16:46 pdurbin your most recently published file doesn't have a doi... from june 2018
16:49 pdurbin file PIDs were introduced in Dataverse 4.9
16:49 pdurbin so I'm sort of wondering if you have published any files since you upgraded to 4.9
16:51 bricas_ probably not.
16:51 pdurbin Do you want your files to have PIDs?
16:53 bricas_ if it fixes my issue. yes. :)
16:53 pdurbin well
16:54 pdurbin what I'm saying is that after Dataverse 4.9, the default is for files to get PIDs on publish
16:55 bricas_ so perhaps we need to put that to false?
16:55 pdurbin well, it's up to you
16:56 pdurbin I'm just saying that it's something that changed since you last published a file. From what I can tell.
16:57 pameyer hmm.... one of the things that a bunch of folks might've changed in that time window would be doi provider
16:57 pdurbin ah, sure
16:57 pdurbin since EZID is shutting down
16:57 pameyer but that would probably show up in server.log
16:57 pdurbin or has shut down
16:57 pameyer limited to ca
16:57 pdurbin right, right, ca
16:57 bricas_ CA!
16:58 bricas_ i've set it to false. will try publishing again.
16:58 pdurbin oh, the other ca... california
16:58 bricas_ oh. dang.
16:59 pdurbin pameyer: that reminds me, are you going to talk about the dcm at the dvcm? ;)
17:00 pameyer pdurbin: probably not - I've talked about it at the past 2 dvcm's ;)
17:00 pdurbin "we know, we know"
17:00 pdurbin "get the hook"
17:00 pdurbin "bang the gong"
17:01 bricas_ no diff.
17:02 pdurbin Bummer. Did attempting to publish that dataset kill Glassfish again?
17:04 bricas_ yes
17:04 pdurbin double sadness T-T
17:06 pdurbin bricas_: please email support@dataverse.org to open a ticket.
17:07 bricas_ okay. it's back and i've re-indexed the dataset again
17:08 pdurbin if you could let us know the ticket number that would be helpful
17:08 pdurbin I mean, to me it's a legit github issue too. "Attempting to publish a dataset crashes Glassfish"
17:09 pdurbin maybe you could try to replicate it on https://demo.dataverse.org
17:09 pdurbin see if you can crash glassfish there
17:10 pdurbin oh but we may have a smaller upload limit than you do
17:10 pdurbin but once the file has been uploaded, it shouldn't matter how big it is when you publish
17:11 pdurbin I'm running out of ideas.
17:13 pameyer I don't think anything would try to generate exports for a data file at publish time
17:13 bricas_ ticket 274118
17:13 pameyer file size shouldn't matter for publish; but if glassfish is dying because of memory issues it might be worth mocking those files, publishing, and putting them back
17:14 pdurbin bricas_: great writeup. Thanks.
17:14 bricas_ it's a start. i can provide more info as needed.
17:15 pdurbin pameyer: bleh but sure, I guess. What eats up memory at publish? bricas_ already turned off file PIDs and there are only 4 files anyway.
17:15 pdurbin Having file PIDs on with thousands of files in a dataset is usually not a good time when you go to publish.
17:16 bricas_ so, FYI, the files are .7z files
17:16 pdurbin .7z is ok. probably just treated like a binary
17:16 pameyer thousands of files anyway wouldn't be great - but 4 files shouldn't be an issue
17:17 pameyer for that matter, glassfish sometimes dies without memory issues; so I wouldn't get set on that
17:18 pameyer I'm inferring that this is a prod box w/ a single glassfish server - but if this is wrong, the simplest approach might just be to hook up the debugger to glassfish, walk through publish, and see where it dies
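On a stock Glassfish 4.x domain, the simplest way to get a debugger attached is to restart with the debug flag, which turns on the standard JDWP agent (default port 9009); details may vary with your setup:

    # restart the domain with remote debugging enabled, then attach an IDE to port 9009
    ./asadmin stop-domain
    ./asadmin start-domain --debug
    # step through the publish request in the debugger and see where it dies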
17:23 bricas_ i'm happy to try whatever :)
18:13 bricas_ i'm going to try to publish a dummy dataset and see what happens
18:19 pameyer bricas_ were there any signs of high memory usage or cpu spikes during publishing?
18:19 pdurbin how is the dummy dataset different?
18:20 pdurbin I assume you know you can destroy it afterward.
18:28 bricas_ there is a slight cpu spike
18:28 bricas_ i thought i would test with a small file -- just to rule out the large file issue
18:31 pdurbin and it worked? published?
18:32 bricas_ (sec)
18:34 bricas_ oh hrmph. bad news all around:
18:34 bricas_ as an aside i was adding someone as an admin to another dataverse -- when i click save changes, it just spins
18:34 bricas_ i also just get the blue spinner when trying to just create a new dataset
18:35 bricas_ oh, it did actually create it though
18:35 bricas_ except, "there are no files in this dataset"
18:36 bricas_ i just uploaded a 5-byte text file
18:36 pameyer just to rule it out - there's free space left on the storage, right?
18:37 bricas_ 37TB
18:38 pameyer that does rule it out ;)
18:38 pdurbin gremlins
18:38 pdurbin you're running stock 4.11, right? not a fork
18:38 bricas_ also, even though i got the blue spinner when adding an admin, it did actually work.
18:39 bricas_ yes, pure 4.11
18:39 pdurbin pure as a mountain stream
18:39 pdurbin buggy as camping by a mountain
18:40 pdurbin can you replicate any of this oddness on https://demo.dataverse.org ?
18:44 donsizemore joined #dataverse
18:44 bricas_ pdurbin: no. it worked fine.
18:44 pameyer pdurbin: slightly tangential, but do you know if IQSS/dataverse-client-python ever worked w/ tokens?
18:45 pameyer bricas_ another thing you've probably checked, but does the api behave the same way for your installation?
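Checking whether the API behaves the same way is a single call to the native publish endpoint; substitute a real API token, and note that it really does publish the dataset if it works:

    # try publishing the same dataset via the native API instead of the UI
    curl -H "X-Dataverse-key: $API_TOKEN" -X POST \
      "https://dataverse.lib.unb.ca/api/datasets/:persistentId/actions/:publish?persistentId=doi:10.25545/MNKR4E&type=major"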
18:47 pdurbin pameyer: because dataverse-client-python was originally written when the only API available (SWORD) required an API token, it wouldn't surprise me if it's required. But the real answer is that I don't know.
18:48 pameyer pdurbin: thanks.  from a quick look at the code, it looked like it might be trying to use the api token as a password for http auth; rather than an x-dataverse-token header
18:48 pdurbin I don't know where to take a sick glassfish. There's #glassfish but they'll probably tell you it's a #dataverse problem. And it probably is.
18:49 pdurbin That's how SWORD works. HTTP Auth. It's kinda weird, I guess, our implementation.
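Concretely, the two auth styles look like this; SWORD passes the token as the HTTP Basic username with an empty password, while the native API wants it in a header (demo.dataverse.org used purely as an example host):

    # SWORD: API token as the basic-auth username, empty password
    curl -u "$API_TOKEN:" https://demo.dataverse.org/dvn/api/data-deposit/v1.1/swordv2/service-document
    # native API: token in the X-Dataverse-key header
    curl -H "X-Dataverse-key: $API_TOKEN" https://demo.dataverse.org/api/dataverses/root/contents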
18:49 pameyer sick glassfish is ubuntu, right?
18:50 donsizemore sick glassfish is bluefish
18:50 pameyer ;)
18:51 pdurbin donsizemore: help us!
18:53 donsizemore finishing up sexual harassment training (no joke), gimme one minute?
18:56 pdurbin no rush, I just hope you have some ideas for bricas_ 'cause I'm fresh out
18:56 donsizemore okay. eek, curl is broken? you're on ubuntu?
18:56 pdurbin it's worse than curl being broken
18:57 pdurbin when bricas_ tries to publish a certain dataset it kills glassfish
18:57 donsizemore @bricas_ i'm sure you've been asked above, but there's a lot of text. is anything written to server.log (or to /var/log/syslog)?
18:57 bricas_ OMG
18:58 bricas_ is this a MAIL issue
18:58 pdurbin !
18:58 donsizemore dpkg-reconfigure exim4? ;)
18:59 bricas_ we use postfix for relaying to our campus smtp and i think the recent upgrade went ... poorly.
18:59 pameyer I got bit by something similar a week or so back
19:00 bricas_ BINGO
19:00 bricas_ Success! – This dataset has been published.
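For the record, the quick checks when a local postfix relay goes sideways after an OS upgrade are the stock postfix ones; the relay hostname below is a placeholder:

    # is postfix running, and is anything stuck in its queue?
    systemctl status postfix
    postqueue -p
    # can the box still reach the upstream relay on the SMTP port?
    nc -vz smtp.example.edu 25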
19:01 donsizemore #noemil
19:01 donsizemore ^#noemail
19:01 pameyer it's nice when things work ;)
19:02 pdurbin Great but it's still a bug in Dataverse, right? What's the expected behavior?
19:05 pameyer that's probably worth discussing; but I'd vote for "if sending email is a required part of $action, and it can't email - then fail $action and report the error was due to no email"
19:08 pdurbin Ok, so bubble the error up to the user and say something like, "Sorry, you can't publish right now because an email server is down."
19:09 donsizemore @pdurbin or how about "we published your dataset, but were unable to send a follow-up message"
19:09 pdurbin sure, I'll let you two arm wrestle over it
19:10 donsizemore #noemail
19:13 pameyer my comment-sign-fu is weak ;)
19:14 bricas_ and on that note, i'm off. :)
19:14 pameyer holding up publishing because no email does seem like it might be counter-intuitive to users
19:14 bricas_ cheers and thanks for the help!
19:14 pameyer bricas_ glad you got it working again
19:14 pdurbin have a good weekend, bricas_
19:20 donsizemore @pdurbin should I open an issue over bricas' e-mail experience?
19:21 pdurbin you're welcome to
19:21 pdurbin I'm trying to get out of the business of opening issues. I make exceptions for issues I actually think we'll fix in a timely manner.
19:21 donsizemore @pdurbin i'm invested in the issue because UNC's campus relay isn't always the most reliable
19:22 pdurbin How hard would it be for you to reproduce the scenario?
19:22 donsizemore gimme a sec
19:27 donsizemore @pdurbin good call. stopped postfix on a test 4.9.4 box and can still publish
19:29 donsizemore FinalizeDatasetPublicationCommand failed but the publish part succeeded.
19:30 pameyer how hard would it be to test with postfix working but the upstream relay refusing connections?
19:31 pameyer for the slow shib logins, postfix was fine - but the relay server was refusing connections
19:33 donsizemore all i've ever done with glassfish is either a local postfix mta or pointed glassfish straight at relay.unc.edu
19:34 pameyer ... feeling like I might be mixed up.  let me check some configs
19:34 pdurbin yeah, publish is split into multiple commands these days
19:35 pdurbin Dataverse 4.12 spotted in the wild: http://phoenix.dataverse.org
19:35 pameyer was mixed up. slow shib was with a mail-resource to our local mail relay; but that relay was configured to refuse connections from the dv host
19:36 donsizemore @pdurbin i've been watching the github traffic =)
19:37 pameyer wondering if it might fail harder when it thinks it can relay, but can't. but also maybe not worth too much time intentionally breaking stuff to figure out how it fails
19:37 pameyer 83-origin/develop-d87c21d ...
19:41 pdurbin hmm? 83?
19:42 pameyer copy/paste typo - 383-origin
19:43 pdurbin what about it? I'm confused
19:44 pameyer curl http://phoenix.dataverse.org/api/info/version
19:44 pameyer {"status":"OK","data":{"version":"4.12"​,"build":"383-origin/develop-d87c21d"}}
19:44 pameyer the release binaries usually have a different looking build tag
19:45 donsizemore @pdurbin @pameyer i am confident that when i return from happy hour my changes to the 37_sampledata branch will succeed. in the meantime... you gentlemen have a great weekend!
19:45 pameyer @donsizemore enjoy happy hour (and the weekend)
19:45 pdurbin oh, oh
19:46 donsizemore @pdurbin i've been doing steady testing, just not so much committing
19:46 pdurbin you too, thanks
20:26 pdurbin I'm heading out too. Have a good weekend, everyone.
20:37 pdurbin left #dataverse
22:55 pameyer a good weekend to all
