IRC log for #dataverse, 2018-10-02

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:29 donsizemore joined #dataverse
00:30 donsizemore @pdurbin i pronounce build/commit working, with 2 compilers per core. kick the tires?
00:32 pdurbin I'm already kicking them :)
01:38 jri joined #dataverse
05:39 jri joined #dataverse
06:41 poikilotherm joined #dataverse
07:00 jri joined #dataverse
11:47 poikilotherm Hey pdurbin - do you even sleep at nighttime? Thx for moving things in #5122
11:49 pdurbin Thanks for making that pull request! I hope it works. I'll try it when I get to work.
11:51 poikilotherm Sure :-)
11:53 poikilotherm By the way - I had a chance to get in touch with Claus-Peter Klas at GESIS. They are at a conference right now, but he will get in touch with me shortly after. Keep fingers crossed we can sort that thing out...
11:58 pdurbin Nice, thanks for the update.
11:59 donsizemore joined #dataverse
12:14 poikilotherm pdurbin - just updated #4259 (Java 11 issue)... The integration testing thing is biting us again...
12:16 andrewSC joined #dataverse
12:29 donsizemore @pdurbin and i dutifully requested my DataCite test accounts
12:30 pdurbin poikilotherm: we can pull donsizemore into the Java 8 EOL discussion as well :)
12:31 donsizemore @pdurbin i read @poikilotherm's note about the 11 announcement with a glad heart
12:32 poikilotherm Just mentioned him in #4259 ;-)
12:33 jri joined #dataverse
13:05 poikilotherm pdurbin did I get the right impression that you are in charge of the Jenkins CI server?
13:09 jri_ joined #dataverse
13:12 jri_ joined #dataverse
13:12 pdurbin poikilotherm: I don't even have root on our Jenkins server but I do have the ability to create jobs on it. As I mentioned on the last community call, almost three years ago I chained together some Jenkins jobs to deploy code to a server I set up called "phoenix": https://groups.google.com/d/msg/dataverse-community/X8OrRWbPimA/wcmQx5VZCQAJ
13:27 poikilotherm But you were the one who set up the current integration testing stuff?
13:29 pdurbin poikilotherm: yes, that's what that long thread on the mailing list is about
13:29 poikilotherm Oh sry, didn't read that yet
13:29 poikilotherm Will do so first before asking more questions... :-D
13:33 pdurbin poikilotherm: no, it's fine. Ask away. I guess I just like giving a sense of history. :)
13:36 poikilotherm Ok, read. :-)
13:39 pdurbin poikilotherm: when you have a moment, let's talk about "connects to" and Waffle :)
13:39 poikilotherm Sure :-)
13:40 pdurbin You're aware of how these days GitHub shows you history on issue comments. History of edits?
13:41 poikilotherm Yes, now I am :-)
13:41 poikilotherm Hadn't used this yet...
13:42 pdurbin Heh. Good. So I'm going to edit your pull request description. One sec.
13:42 poikilotherm Which one?
13:42 pdurbin done: https://github.com/IQSS/dataverse/pull/5127
13:42 poikilotherm 5127?
13:42 poikilotherm Thx
13:42 poikilotherm Ok, I see your change.
13:43 poikilotherm Is this the Waffle way to express "related"?
13:43 poikilotherm Or do you want to skip the connection for only slightly related issues?
13:43 pdurbin yeah, we mention "connects to" at http://guides.dataverse.org/en/4.9.3/developers/version-control.html#make-a-pull-request
13:44 pdurbin "The “connects to #3728” syntax is important because it’s used at https://waffle.io/IQSS/dataverse to associate pull requests with issues."
13:44 poikilotherm Jepp
13:44 poikilotherm That's why I used it...?
13:44 pdurbin There are a few more assumptions built in that we haven't communicated.
13:45 pdurbin - Pull requests should always have an issue associated with them. Ideally only a single issue.
13:45 poikilotherm Ah!
13:45 poikilotherm :-D
13:45 poikilotherm Maybe this should be added to the PR template ;-)
13:47 poikilotherm Writes 100 times to the chalk board: "Will not connect more than one issue in pull request evermore."
13:47 poikilotherm -to +on
13:48 pdurbin Yeah, it seems like no matter what we write, people are a little confused. We last talked about this at https://github.com/IQSS/dataverse/issues/3729 I believe.
13:48 pdurbin Heh, it's all good. I'm going to upload a couple screenshots to the pull request to illustrate where the confusion happens.
13:49 poikilotherm Alright.
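
The "connects to" convention discussed above (one pull request, ideally one issue) could be checked mechanically. The following is only a sketch under stated assumptions: the regular expression and the helper names `connected_issues` and `has_single_issue` are hypothetical, and how Waffle itself parses pull request descriptions is not stated in this log.

```python
import re

# Hypothetical pattern for phrases like "connects to #3728" (case-insensitive).
# This is an illustration, not Waffle's actual parsing rule.
CONNECTS_TO = re.compile(r"connects\s+to\s+#(\d+)", re.IGNORECASE)


def connected_issues(description):
    """Return the distinct issue numbers named by "connects to #N", in order."""
    seen = []
    for match in CONNECTS_TO.finditer(description):
        number = int(match.group(1))
        if number not in seen:
            seen.append(number)
    return seen


def has_single_issue(description):
    """True when the description connects to exactly one issue."""
    return len(connected_issues(description)) == 1


if __name__ == "__main__":
    text = "Adds a build fix. Connects to #3728."
    print(connected_issues(text), has_single_issue(text))
```

A description that names two issues, or none, would make `has_single_issue` return False, which is the case the exchange above is trying to avoid.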
13:49 poikilotherm Now let me share some questions about your Jenkins CI job.
13:49 poikilotherm What would you like to see for the future?
13:50 poikilotherm Should this still be the future reference?
13:50 poikilotherm Do you think this is the "source of truth", too?
13:50 poikilotherm (as pameyer does)
13:51 pdurbin sorry, let's wrap up the "connects to" thing first. I just added a screenshot to https://github.com/IQSS/dataverse/pull/5127#issuecomment-426281222
13:51 poikilotherm While I can contribute to a Travis CI job, I cannot to a private Jenkins. Any thoughts on this?
13:51 poikilotherm Ok
13:52 pdurbin Every developer's goal is to get their code merged. I try not to confuse QA along the way. :)
13:53 poikilotherm Yes, you are absolutely right. Again: sorry for the inconveniences and the noise generated *kowtows*
13:53 pdurbin poikilotherm: ok, and related to all this issue housekeeping is that people find it confusing at standup to see that you're still assigned to issues that are in QA when you're done with the branch. So we'll be unassigning you like we do everyone else.
13:54 pdurbin heh, it's fine
13:54 poikilotherm Noted. :-) That's ok for me - I don't want to be a burden. ;-)
13:54 pdurbin I'm just trying to keep things moving across the board as effortlessly as possible.
13:55 poikilotherm And you are doing a very wonderful job!
13:55 pdurbin My coping mechanism for finding issues I worked on is to start with the list of pull requests and then find the issue linked from it.
13:55 poikilotherm It is a pleasure to have a dedicated community partner that is responsive and very active.
13:56 pdurbin I guess I'm sort of the self appointed community guy. Dunno if you catch that I send a monthly newsletter to the google group.
13:56 poikilotherm I didn't yet register with the mailing list *blushes*
13:56 pdurbin But I don't do all the community stuff. I don't run the biweekly calls. I don't organize the annual community meeting. I help where I like helping. :)
13:57 pdurbin well, the archives are public. here's the news I sent yesterday: https://groups.google.com/d/msg/dataverse-community/urEFcgCtt2s/ylLHp5gdAQAJ
13:59 pdurbin poikilotherm: anyway, you were asking about the source of truth for testing. I'd say that Travis is more of a source of truth for unit tests. I'm quick to link to a failing build on Travis as evidence that there's something wrong with a pull request, for example.
14:00 poikilotherm Would it make sense to you to invest more resources into using Travis also for other stuff than "just" the unit tests?
14:00 poikilotherm Or would you prefer to stick with the Jenkins approach?
14:01 poikilotherm What do you think others might prefer in your dev group?
14:01 poikilotherm Maybe there is some integration or other stuff I am simply not aware of and thus makes using Travis a bad idea
14:03 pdurbin I'd prefer Travis over Jenkins. I like the idea of all the config being in the .travis.yml file rather than hidden away in some Jenkins config that I don't have an easy way of sharing.
14:10 poikilotherm Do you think others at IQSS core team would follow that approach?
14:11 pdurbin Buh. I have no idea.
14:12 pdurbin Which approach would you prefer?
14:16 poikilotherm Actually Travis. Exactly because of the sharing.
14:17 poikilotherm I can contribute to a shared manifest, but not to a Jenkins job.
14:17 poikilotherm Everybody can see the shared config and reuse or play with it.
14:17 donsizemore joined #dataverse
14:18 pdurbin yeah, better transparency
14:18 pdurbin poikilotherm: you might like my article about transparency: https://opensource.com/open-organization/17/11/transparency-dataverse-project
14:19 poikilotherm I think that is one of the key factors why Travis and GitLab CI are so evolving and widely used.
14:20 pdurbin yeah, GitLab even published their employee handbook: https://about.gitlab.com/handbook/
14:21 poikilotherm Oh yes - that article definitely is pointing out the core of the whole story. A change in culture is badly needed, to get away from the habit "that's all mine" and "OMG what if I did something wrong"
14:22 poikilotherm Thx for sharing :-)
14:22 pdurbin sure, I had fun writing it
14:24 poikilotherm Alright... Then I will try to think in that way. Maybe with a second container first... Arquillian seems to scare the crap out of people :-D
14:26 poikilotherm But first things first... Need to tackle down the failing unit tests first.
14:34 pdurbin what failing unit tests?
14:37 poikilotherm https://github.com/IQSS/dataverse/issues/4259#issuecomment-425534178
14:39 pdurbin Oh, moving from one LTS Java to another. We should change the title of that issue from "Java 9 Upgrade" to anything more reasonable. Any suggestions?
14:40 poikilotherm Maybe just switch from 9 to 11?
14:40 poikilotherm "Java 11 Upgrade" sounds reasonable
14:40 pdurbin Sure. But should we add "Java 8 EOL January 2019"?
14:41 pdurbin donsizemore: any thoughts?
14:41 poikilotherm As you prefer - sounds good, too
14:41 pdurbin Done.
14:42 pdurbin But won't Red Hat keep patching Java 8? Keep releasing RPMs with security fixes?
14:42 donsizemore @pdurbin i'm late to the game but i'm all for LTS
14:43 pameyer joined #dataverse
14:44 pdurbin Would people here be nervous about running Java 7 RPMs from Red Hat/CentOS?
14:45 poikilotherm pdurbin: yes they will do so.
14:45 pameyer I'd assume things would break horribly if I tried to run dataverse on java7
14:46 poikilotherm I am not sure if the OpenJDK will receive more bugfixes in the 8 train. For 11 Oracle and other will keep up the work. But as Oracle will not release free updates for their JDK anymore, I don't know for how long the OpenJDK 8 will last...
14:48 poikilotherm Oh pdurbin - I just realized that the title is misleading.
14:48 pdurbin In February 2019 would you rather be running Dataverse on Java 8 or Java 11? Does it matter to you?
14:48 pameyer just catching up on the chat log
14:48 poikilotherm Java 8 is not EOL, only Java 11 as next LTS is now available and free support for Oracle JDK is not provided after January
14:49 poikilotherm Actually running on 8 should be just fine, as long as you do not use the Oracle JDK or have a license for that.
14:49 poikilotherm RedHat will provide support for OpenJDK till June 2023
14:49 poikilotherm But Redhat is AFAIK no upstream Java developing house...
14:50 pameyer I'd be ok with openjdk 8 - old glassfish feels like more of a problem than old jdk for a public facing service
14:51 poikilotherm Sry guys - gotta go, pick up kids. Will read up the logs later / tomorrow.
14:51 pdurbin o/
14:51 poikilotherm Cu
14:51 pdurbin I changed the title to "Java 11 Upgrade (no free security fixes for Oracle Java 8 after January 2019)"
14:56 pdurbin pameyer: yeah, old glassfish is not great
15:01 pameyer somewhat belated opinion on the travis vs jenkins vs gitlab ci question; gitlab and jenkins can both be self hosted - don't think travis ci can
15:02 pameyer and you can definitely do CI pipeline as a file checked into the repo with jenkins; even though the current setup isn't doing that
15:03 pdurbin pameyer: Travis CI is open source and can be installed on prem ( https://docs.travis-ci.com/user/enterprise/installation ) but installing Jenkins is way, way more common, of course.
15:03 pameyer huh - I'd missed that about travis
15:05 pdurbin pameyer: now that we have these new EC2 scripts does that change anything with regard to spinning up Jenkins?
15:06 pameyer I don't know if the app server hosting makes a difference for the ci server hosting; but maybe I'm confused
15:06 pdurbin might be a good tech hours topic at 3
15:07 pdurbin could talk about Java 8 security after Jan 2019 too
15:11 pameyer pdurbin: those instructions have "get a license" in the before you start list
15:12 pdurbin wow, sure enough
15:14 donsizemore @pdurbin seeing your branding comment. can't you just point dataverse.harvard.edu at AWS?
15:14 pdurbin Harvard Dataverse is already on AWS.
15:14 pdurbin standup time
15:15 pdurbin donsizemore: let's chat about https://github.com/IQSS/dataverse-ansible/issues/29 soonish :)
15:16 donsizemore @pdurbin oh, i already requested test accounts
15:43 pameyer I don't know if ansible can pull secrets from environmental variables or not; I've only used it with a separate secrets file
15:45 pdurbin yeah, I don't know either
15:52 pdurbin pameyer: "zipfile corrupt" https://github.com/IQSS/dataverse/issues/5092#issuecomment-426324332
15:57 pameyer pdurbin: that makes it sound like we've got some more testing to do
16:20 jri joined #dataverse
16:50 donsizemore joined #dataverse
16:57 jri joined #dataverse
17:11 dzho joined #dataverse
17:53 pdurbin pameyer: yeah. Same number of bytes at least. I just left another comment.
17:56 pameyer pdurbin: I saw; ~16m out for me to see how the round trip works
17:57 pameyer interrupted the transfer ~60% through and restarted
18:17 pameyer pdurbin: did you start your wget with `-c`?
18:18 pdurbin not initially
18:19 pameyer I have vague impressions that may matter
18:19 pameyer zipinfo was happy after the round-trip for me; waiting on checksums before commenting on the issue
18:19 pdurbin huh, weird. I was assuming the file on S3 is corrupt
18:20 pdurbin since the number of bytes match
18:20 pameyer if I remember right; md5 calculation for that file takes a bit - so I won't know for sure until that's done
18:24 pameyer zipinfo -h /scratch/meyer/tmp8/rt/d511.zip
18:24 pameyer Archive:  /scratch/meyer/tmp8/rt/d511.zip
18:24 pameyer Zip file size: 275292856244 bytes, number of entries: 3057
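
Matching byte counts, as noted above, do not rule out corruption, which is why the checksum comparison is the deciding test. The following is a minimal Python sketch of that check, assuming two local copies of the file are available. The function names `file_md5` and `same_content` are made up for illustration and are not part of Dataverse or of any tool mentioned in this log.

```python
import hashlib
import os


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks.

    Chunked reads keep memory use flat, which matters for very large files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True only if both the byte count and the MD5 digest match."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # cheap check first; a size mismatch is already conclusive
    return file_md5(path_a) == file_md5(path_b)
```

Two files of identical size but different bytes pass the size check and fail the digest comparison, which is the situation suspected in the "zipfile corrupt" report.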
18:33 pameyer joined #dataverse
18:35 pdurbin donsizemore: 4.9.3 deployed to Harvard Dataverse
19:05 poikilotherm-at- joined #dataverse
19:05 poikilotherm-at- Good evening guys :-)
19:07 poikilotherm-at- I read up the logs...
19:07 poikilotherm-at- pameyer: about the decision between Jenkins, Travis and Gitlab...
19:07 poikilotherm-at- I have been using all three of them.
19:07 poikilotherm-at- All have their pros and cons
19:08 poikilotherm-at- Actually, I don't care that much. But I suggest that you guys do more coding than other stuff and Travis CI is the only option I am aware of that is well integrated into Github out of the box and comes for free for OSS
19:10 poikilotherm-at- I really love the integration between Gitlab and Gitlab CI. I really like the plugins you can get for Jenkins. But Travis is a quick and easy way to get people in the boat, almost hassle free, without resources being spent on another service to maintain.
19:11 poikilotherm-at- If you guys decide you want to stick with Jenkins, that's alright for me - there is no use in trying to get people to use a tool they hate or find too complicated. (That's not what CI is about...)
19:12 poikilotherm-at- Just let me know what you guys think and what you would all like to see being done. I will stick with that :-)
19:15 pdurbin poikilotherm-at-: hi! We're in a tech hours meeting right now and I'm hoping to get a ruling on what tool to use.
19:18 poikilotherm-at- Sounds good :-D
19:18 poikilotherm-at- Sry for disturbing... ;-)
19:19 pdurbin no worries, you aren't :)
19:54 poikilotherm-at- Leaving for today. Tomorrow is a holiday in Germany, so read you on Thursday... Good night.
19:55 poikilotherm-at- left #dataverse
20:05 jk-asu joined #dataverse
20:37 pameyer poikilotherm: my preference is pretty much for which ever kind of CI that does what it needs to do ;) no strong preferences for platforms
20:40 pdurbin slight preference for not having to host it ourselves, I guess
20:40 pdurbin we get a lot of mileage out of "free for open source" from github, travis, coveralls, etc
20:40 * pdurbin waves at jk-asu and dzho
21:35 eggsterino joined #dataverse
21:40 donsizemore joined #dataverse
23:41 jk-asu hey all, i emailed this question to the harvard dataverse support address, but might as well ask here as well
23:42 jk-asu we (arizona state university) noticed there are already submissions in harvard's instance...curious what the limitations are, especially storage...e.g. are there per-user storage quotas
23:43 pdurbin jk-asu: there's a limit of 2.5 GB per file but no quotas
23:46 jk-asu at what point would a researcher uploading data trip a "omg, this person is uploading a TON of stuff" alarm =P
23:48 pdurbin 1 TB for sure, probably sooner
23:48 jk-asu roger that
23:48 pdurbin but you should consider any answer you get from Harvard Dataverse support authoritative
23:49 jk-asu we're getting serious about embarking on the data management journey including spinning up our own instance, but doing some research on what's out there
23:50 pdurbin oh! nice! if you need any help installing Dataverse, just let us know
23:50 jk-asu will do :) am curious about your infrastructure...internal? aws? mix?
23:50 pdurbin have you seen the comparative review at https://dataverse.org/blog/comparative-review-various-data-repositories ?
23:51 jk-asu ah, thanks for sharing...some may have, but i have not
23:51 pdurbin within the last year we moved from physical servers to AWS
23:52 pdurbin we developed an S3 driver for this purpose
23:52 jk-asu awesome
23:53 jk-asu gotta run, but am sure we'll be in touch..thanks!
23:53 pdurbin o/
