IRC log for #dataverse, 2018-11-13

Connect via to discuss Dataverse (, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:15 jri joined #dataverse
07:46 juancorr joined #dataverse
07:56 juancorr joined #dataverse
08:01 jri joined #dataverse
08:12 poikilotherm joined #dataverse
11:21 pdurbin joined #dataverse
11:27 pdurbin poikilotherm: hi! I had to run the other day so I'm not sure we came to any conclusion on your "reach out to all devs" and "direct channel" points:
13:08 isullivan joined #dataverse
13:30 donsizemore joined #dataverse
13:30 donsizemore @pdurbin beep bop!
13:44 pdurbin talk to me donsizemore
13:56 donsizemore @pdurbin i'm curious who's running harvard dataverse these days (neither mandy nor i could log in yesterday)
13:58 pdurbin donsizemore: is this a trick question? :) Harvard is. Or do you want names? :)
13:59 donsizemore @pdurbin i mean harvard libraries vs. iqss now that you're in amazon
14:00 pdurbin You know what we need? An about page for Harvard Dataverse.
14:00 pdurbin donsizemore: hey, you haven't configured an "About" link at
14:01 donsizemore @pdurbin well, it was throwing 504s yesterday (and again this morning)
14:01 donsizemore @pdurbin and i'm wondering how much i have the access (if not the authority) to kick if y'all are closed for say veteran's day
14:02 pdurbin Are you saying you want root on those EC2 instances? :)
14:04 donsizemore @pdurbin well, i do have an IQSS AWS key... but i didn't know if you all had taken over in EC2 or if your library group was still managing services
14:05 pdurbin It's complicated. Or at least I'm not close enough to it to know who exactly does what.
14:06 donsizemore eh, just offering. one of your nodes is wonky now, sometimes you rhyme slow, sometimes you rhyme quick
14:07 pdurbin is a 404 now
14:08 pdurbin ah, it's still at
14:08 donsizemore @pdurbin but what a beautiful 404 page
14:10 pdurbin I just found this but it doesn't say anything about operations:
14:14 pdurbin donsizemore: the IQSS AWS access we gave you is for our dev sandbox, not production, if that helps.
14:15 donsizemore @pdurbin i figured. so... if dataverse is down, how may i report it? support@dataverse?
14:16 pdurbin If you're reporting an outage for Harvard Dataverse, you should use
14:17 pdurbin If you're reporting that the project website is down, you should use
14:18 pdurbin donsizemore: I hope that helps. Also, this page mentions LTS, HUIT, and IQSS and was recently updated after the move to AWS:
14:24 poikilotherm Hi @pdurbin and WB... :-) No, I don't think we came to a conclusion...
14:25 poikilotherm But that does not hurt...
14:25 poikilotherm It'll work out somehow ;-)
14:26 pdurbin poikilotherm: ok, I just wanted to let you know that I read what you wrote and I'm happy to respond to specific points if you want.
14:26 poikilotherm Oh sure, carry on :-D
14:27 pdurbin I should probably make sure donsizemore is all set first.
14:30 donsizemore @pdurbin i think somebody kicked it. i can log in now and i'm trying to reproduce a problem for mandy before possibly opening an issue
14:30 pdurbin donsizemore: ok. Thanks. There's also a script that kicks it sometimes.
14:30 poikilotherm Alright. :-)
14:30 pdurbin The script sent an email 3 minutes ago.
14:31 donsizemore @pdurbin i have most certainly setup a cronjob to preemptively kick mysql
14:31 poikilotherm Just cross referenced my work on #5292 in whole-tale#49 and dataverse-docker#8
14:32 pdurbin poikilotherm: ah, I'm glad you noticed the Whole Tale stuff
14:34 pdurbin poikilotherm: as I was biking through a nor'easter this morning I was listening to and I have more to say about the microprofile config stuff.
14:38 poikilotherm I am all yours :-) Enlighten me
14:39 pdurbin poikilotherm: well, first, it's two Germans talking so maybe you'd feel right at home. :)
14:39 poikilotherm LOL
14:40 pdurbin Also, near the end they talk about combining Java EE (Jakarta EE) and MicroProfile.
14:41 poikilotherm I see it from the "about" section - sounds promising (scheduling for later)
14:41 pdurbin Basically, Adam was like "WTF with MicroProfile. I don't want some special weird build of an app server just for this." Then he realized that with Payara "full" you can have all the Java EE APIs and use the MicroProfile APIs too. The new info is that OpenLiberty supports this too.
14:42 pdurbin I'm not sure how much sense that made.
14:43 pdurbin What I'm trying to say is that I support the idea of app servers supporting all of Java EE and also as much of MicroProfile as possible. The two that I'm aware of that do this are Payara and OpenLiberty.
14:43 pdurbin And those two folks have a nice chat about this on that podcast.
14:46 pdurbin poikilotherm: the two MicroProfile APIs they seem to like the best are the "config" one you found as well as "metrics".
14:48 poikilotherm Yeah, reading through issues on is pretty enlightening :-D
14:48 poikilotherm E.g.
14:48 poikilotherm Seems like the main devs of all big players of app servers are starting to support this stuff
14:49 poikilotherm I know that metrics exists, but have not yet looked into this.
14:49 poikilotherm Seems like you are in the game for giving this a try?
14:49 poikilotherm (add config api usage)
14:50 pdurbin I'm game for new and modern stuff but you're going to get in touch with Gustavo and Matthew, right?
14:51 poikilotherm I will try my best. Shall I put you in CC?
14:52 pdurbin I hate personal email for this stuff.
14:53 pdurbin Please leave me off CC for now.
14:53 poikilotherm Yes. I will just use email to get in touch and ask them if they are interested in a talk about that here on IRC or in Gitter
14:54 poikilotherm I refuse to use email for this
14:54 pdurbin Where on Gitter?
14:54 poikilotherm Dunno. Will tell them to ask you if they want Gitter :-D
14:54 poikilotherm Oh and of course I will offer sticking with GH issue discussions
14:55 pdurbin heh, my plan was only to switch to Gitter if this freenode channel gets overrun with spam
14:55 poikilotherm Alright, then I'll leave out the Gitter option
14:55 pdurbin thanks
14:56 poikilotherm Are they more on the formal side of getting addressed in an email?
14:56 pdurbin nope
14:56 poikilotherm Alright, thx.
14:56 pdurbin and I think they were both in the room when you called in
14:58 pdurbin oh, maybe it was just Gustavo:
14:58 pdurbin unless you called in a second time and I'm forgetting
15:00 pdurbin ah, you did call in a second time but you didn't have as much to say then: :)
15:05 poikilotherm Jarp...
15:06 poikilotherm The quality of the connection is not that good - often it's hard to understand what somebody is saying and I don't want to make people upset because I couldn't understand something
15:07 poikilotherm I just emailed Gustavo and Matthew. Let's see how this goes :-)
15:11 pdurbin poikilotherm: emailing about the connection quality would be good. I added a note to mention this during my next one on one with Danny.
15:12 poikilotherm I am not sure if this is a connection problem with my phone, my landline or maybe the phones/microphones of individuals who joined the conference.
15:14 pdurbin You aren't the first to mention it.
15:19 poikilotherm Reported to support.
15:21 pdurbin poikilotherm: thanks. Also, I just got a reply at
15:23 * poikilotherm read that :-)
15:23 poikilotherm Sounds promising
15:23 poikilotherm Thx for sharing :-)
15:24 pdurbin Sure. This is why Gitter is better than Slack. :)
15:27 cwillis joined #dataverse
15:29 pdurbin A presence I've not felt since...
15:30 pdurbin oh hey, cwillis
15:30 cwillis You felt a disturbence?
15:30 cwillis *ance
15:31 pdurbin I'm still the pupil.
15:32 cwillis Just stopping by because of your comment on  I had wanted to test out the compose process, but admittedly forgot that the images hadn't been pushed.
15:32 cwillis Man I'm off -- that should've been
15:33 cwillis Part of me wants to get Dataverse running using the newer Docker approach, but in the end we'll just want to test our external tools integration.
15:35 pdurbin cwillis: I'm happy to add external tools to as needed.
15:35 pdurbin cwillis: the main thing I'm thinking is that you shouldn't trouble yourself with running Dataverse unless you really want to.
15:38 cwillis Thanks, pdurbin. Part of the idea of and related work was to make it easy to spin up Dataverse instances for this type of development work.  That said, we may take you up on adding to dev1 if needed.
15:39 pdurbin cwillis: sure, and if you want to talk Docker and Kubernetes you should meet poikilotherm whose head is in this space right now. I should get back to fixing bugs. :)
15:41 cwillis Thanks again and enjoy your bugs.
15:41 pdurbin cwillis: oh, one more thing
15:42 pdurbin cwillis: in your docker/kub setup of Dataverse you were/are using a SQL file to populate the database schema. We're working in this area now:
15:44 pdurbin As opposed to requiring Glassfish to be up. Requiring a deployment of the war file to create the database tables.
15:45 cwillis pdurbin: Thanks for pointing this out -- I did see a reference to this work at one point.  That will be much handier than my hackish approach.
15:46 pdurbin cwillis: well, my understanding is that we're using the same approach. There's a DDL file or something? The idea is to capture it.
15:47 pameyer joined #dataverse
15:49 pdurbin Anyway, hopefully those SQL files will prove to be a useful new resource for people deploying Dataverse. We'll see. :)
15:50 cwillis pdurbin:  Having this as part of the official Dataverse distribution will be very helpful for sure.
15:50 pdurbin cool, I'm glad you approve :)
15:50 poikilotherm Cu all tomorrow :-) Gotta go pick up kids.
17:03 cwillis pdurbin: To demonstrate our integration, I'd like to find a few exemplar datasets.  Ideally at least one that's data only, but maybe famous/heavily re-used and another that has both data+code. Any pointers?
17:08 andrewSC joined #dataverse
17:11 pdurbin cwillis: you're reminding me that I promised some of these to Bryce
17:12 pdurbin cwillis: here's me asking the Dataverse community for exemplar datasets:
17:14 pameyer I haven't looked, but I'd assume that something in the metrics api would let you get at highly viewed/downloaded datasets
17:14 pameyer but could be completely off base there
17:15 cwillis Thank you pdurbin and pameyer
17:15 pdurbin pameyer: it would be nice if the metrics API told us this. It only gives total downloads over time.
17:15 pameyer :( is there a guestbook api?
17:15 pdurbin cwillis: I didn't get very many replies, unfortunately, and I didn't specify that the dataset should have code :/
17:16 pameyer about 1.5 years ago, searching for datasets w\ R files in them seemed to turn up datasets w\ "code"
17:17 pameyer trying to automate a clean mapping between inputs / reported outputs / etc didn't get anywhere
17:18 pdurbin cwillis: did this news story hit your radar? . I'm asking because the data is in Harvard Dataverse ( ) but there's no code. Just MP3s.
17:19 pdurbin cwillis: in general, stuff under should have code (AJPS was in my Whole Tale talk) because the code is used to replicate the results.
17:25 pdurbin cwillis: here's the dataset I chose at random from AJPS that I put in my talk at the Whole Tale workshop: . It has Stata files but if you'd rather have R files I'm sure we can find you one.
17:26 cwillis Thank you thank you.  I'll update the issue on my end with these links for grins.
17:27 cwillis pameyer: I did a search for R and python files and found some examples, but it wasn't initially clear which were better examples. But I can do more digging based on this exchange.
17:28 pdurbin pameyer: and what was the automation effort about inputs and outputs? Was this in a GitHub issue?
17:28 pameyer cwillis: wasn't clear to me either - but pdurbin's suggestion to start w\ ajps might give you a better start
17:29 cwillis Indeed
17:30 pameyer pdurbin: this was back when I was assuming "download dataset, re-run processing, compare results with expected" was built in
17:31 pameyer no github issue; would need too much plumbing - and most of that would've been unhelpful for the use case I was trying to focus on
17:33 pdurbin pameyer: ah. Ok. Yeah, not built in but I believe you're up to speed with how we're embarking on integration with stuff like Code Ocean for reproducibility.
17:33 pameyer only tangentially
17:34 pdurbin heh, ok. well, please see for now :)
17:35 jri joined #dataverse
17:35 donsizemore joined #dataverse
17:40 pdurbin cwillis: another thought is that Code Ocean picked a dataset to reproduce from Harvard Dataverse back when they gave us a demo. I could try to dig it up.
17:40 pameyer it's in that issue
17:41 pameyer 5028
17:41 cwillis Perfect
17:41 cwillis I see it
17:41 cwillis Sorry to be such a disrupter today.  Thank you both for the pointers!
17:42 pdurbin well, no that was an example of a good readme
17:44 pdurbin This is the one I'm thinking of
17:44 pdurbin Three input files and an R file.
17:44 pameyer ah - gotcha
17:52 cwillis Thanks again for your time today! Added notes here:
17:52 pameyer cwillis: you're welcome, and good luck!
17:55 pdurbin cwillis: looks good.
17:55 pdurbin That one dataset might have more than a good readme. I do see code. :)
17:56 pdurbin but the good readme is what was emphasized by April during her workshop
18:01 cwillis I'll look at both in more detail and update the ticket.  Since it's AJPS, I expect it's also good.
18:07 pdurbin Yep.
18:54 donsizemore joined #dataverse
19:57 poiki-at-home joined #dataverse
20:21 xarthisius joined #dataverse
22:02 pameyer pdurbin: despite prep_it having broken at some point when I wasn't watching, 5260-account-conversion-ade04c4c3 has everything in passing
22:02 pameyer thanks!
22:08 xarthisius joined #dataverse
22:08 xarthisius joined #dataverse
22:11 jri joined #dataverse
22:33 pdurbin pameyer: thanks for testing that branch
22:51 pameyer pdurbin: thanks for fixing it
