IRC log for #dataverse, 2019-01-31

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:49 pdurbin joined #dataverse
02:11 xarthisius pdurbin: no worries, we can specify dataverse installations manually. Miniverse downtime just exposed a flaw in the logic that I implemented
02:12 xarthisius but there's a fix for that https://github.com/whole-tale/girder_wholetale/pull/223
02:22 pdurbin xarthisius: ah, well at least you got a bug fix out of it. :)
04:58 jri joined #dataverse
05:41 jri joined #dataverse
09:35 poikilotherm joined #dataverse
12:22 pdurbin new blog post: https://dataverse.org/blog/dataverse-410-release-includes-new-features-internationalization-support-large
12:23 poikilotherm Good morning :-)
12:23 poikilotherm Cool :-
12:23 poikilotherm :-)
12:27 pdurbin Good morning. I can't take and credit for writing it but it seems accurate.
12:28 pdurbin any*
13:54 jri joined #dataverse
14:08 poikilotherm *sigh* seems like I need to patch all of those shell scripts here https://github.com/IQSS/dataverse/tree/develop/scripts/api
14:08 poikilotherm All using "localhost:8080" :-/
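The hardcoded "localhost:8080" problem described above is usually solved by reading the base URL from configuration and falling back to the old default. A minimal sketch of that idea follows. It is not taken from the scripts in scripts/api (those are shell scripts); the environment variable name DATAVERSE_URL and the helper api_url are hypothetical, and the path in the usage note is only an illustration.

```python
import os

# Fallback that matches what the existing scripts assume today.
DEFAULT_BASE_URL = "http://localhost:8080"


def api_url(path, env=None):
    """Build a full API URL from a configurable base instead of a hardcoded host.

    DATAVERSE_URL is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    base = env.get("DATAVERSE_URL", DEFAULT_BASE_URL).rstrip("/")
    return base + "/" + path.lstrip("/")
```

With no variable set, api_url("/api/info/version") still points at localhost:8080, so existing local setups keep working, while a container deployment only has to export one variable.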
14:08 donsizemore joined #dataverse
14:11 poikilotherm pdurbin: what about that repo? I chatted with my colleague about this and we will try to use existing OSS toolset instead of internal stuff.
14:11 poikilotherm So this might be helpful for others...
14:13 pdurbin poikilotherm: you want me to create that repo?
14:14 poikilotherm If you don't mind :-)
14:14 pdurbin poikilotherm: sure, here you go: https://github.com/IQSS/dataverse-kubernetes
14:14 poikilotherm Thx :-)
14:15 poikilotherm May I get some permissions?
14:15 pdurbin Sure, I'm adding you to https://github.com/orgs/IQSS/teams/dataverse-kubernetes-admin
14:15 poikilotherm Thx! :-)
14:16 poikilotherm You might be interested in adding @bronger, too
14:16 poikilotherm That's my colleague and he is all into Kubernetes ;-)
14:17 pdurbin sure. invited
14:17 poikilotherm Thx
14:18 pdurbin Looking forward to seeing what you come up with!
14:25 MrK joined #dataverse
14:37 MrK Hi, are you guys still interested in flyway? I have some time for it now :P.
14:39 pdurbin MrK: yes! I am, at least. :) poikilotherm too, I think. Over at https://github.com/IQSS/dataverse/issues/5344 I asked what the status is. Please leave a comment and I can drag it over to code review for Leonid and others to read it.
14:39 poikilotherm MrK: yeah, please go on :-D
14:40 MrK pdurbin: ah yeah that's true, sry I haven't responded in some time I had other tasks :P
14:41 pdurbin no worries, I'm just saying that a comment would be helpful and that I can drag it over for feedback at standup in 90 minutes or so
14:41 MrK ok
15:23 pameyer joined #dataverse
15:37 pameyer @donsizemore publication problems in develop doesn't sound good
15:39 donsizemore @pameyer it was a "me" problem
15:40 pameyer @donsizemore those happen to all of us
15:41 poikilotherm hey pdurbin do you mind if I add an iqss organization to quay.io?
15:41 poikilotherm I need a registry for this stuff and I think it would be cool to have it under iqss flags
15:41 poikilotherm don't want to reuse your dockerhub account, because that seems too official....
15:42 poikilotherm could use iqss-k8s instead if you want
15:42 poikilotherm ah damn, no "-" allowed in names
16:30 pdurbin whoops, missed him. I don't know what quay.io is.
16:30 * pdurbin looks
16:30 pdurbin "Quay [builds, analyzes, distributes] your container images" https://quay.io
16:31 pdurbin donsizemore: phew
16:34 pdurbin xarthisius: when you have a minute, can we please chat about the Whole Tale button?
16:50 pdurbin interesting list of issues with Dataverse: https://github.com/CeON/dataverse/issues
16:53 pdurbin and they have a kanban board for them: https://github.com/CeON/dataverse/projects/1
16:55 xarthisius pdurbin: hi! what's up?
16:56 pdurbin xarthisius: hi! Did Craig tell you we're working on adding a Whole Tale button to https://demo.dataverse.org ?
16:56 xarthisius yup, he mentioned that
16:58 xarthisius is there anything to be done on our end ?
17:03 pdurbin xarthisius: you don't need to do anything but I wanted to let you and Craig know that we have a better idea but it'll take a little more time to implement. We're going to launch a "beta" server for stuff like this.
17:07 xarthisius sounds good! we're rolling release of our own soon^{TM}. Hopefully, it's going to make us more user friendly for people coming from dataverse world
17:08 pdurbin Great! Can you please let Craig know? He's welcome to jump in here with any questions.
17:08 xarthisius among other things, users will be able to come with data from DV, launch an environment and add additional data that's already registered on the fly
17:09 xarthisius sure, will do!
17:10 pdurbin Thanks. And that sounds like a great feature. Oh, one other question about https://github.com/whole-tale/girder_wholetale/pull/223 ... can you please remind me why you like having a list of Dataverse installations available?
17:11 xarthisius so that when someone comes with a doi that points to DV and tries to register the data, we know to call dataverse API
17:12 pdurbin That makes sense but isn't the API path the same for all installations of Dataverse?
17:13 pameyer pdurbin: whole tale button is still file level, right?
17:13 xarthisius pdurbin: it is, but we need to know that it's dataverse and not something else
17:14 pdurbin I'm not arguing and I can think of other good uses for the list of installations (OSF, RSpace, etc. could use it). But I was asked why Whole Tale is using it and I didn't have a good immediate answer. /me reads. Ok, so all you have is the hostname and you don't know if it's Dataverse or not. Makes total sense. Thanks!
17:14 pdurbin pameyer: yes, file level only. All external tools are still file level only until someone works on https://github.com/IQSS/dataverse/issues/5028
17:15 pameyer pdurbin: thanks
17:17 xarthisius pdurbin: it also gives us a whitelist, which we prefer. In theory anyone can host DV and publish all sorts of naughty things there. We don't want people to create Tales out of that for sure!
17:17 xarthisius We count on you to do the vetting :P
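The whitelist idea discussed above (a resolved DOI gives you a hostname, and you only call the Dataverse API if that hostname is a known installation) can be sketched as follows. This is an illustration under assumptions, not the girder_wholetale implementation: the function name is hypothetical, and the two hostnames are simply the ones mentioned in this log, standing in for the real list of installations.

```python
from urllib.parse import urlparse

# Stand-in for the list of vetted installations; both hosts appear in this log.
KNOWN_INSTALLATIONS = {"dataverse.harvard.edu", "demo.dataverse.org"}


def is_known_dataverse(url, known_hosts=KNOWN_INSTALLATIONS):
    """Return True only if the URL's hostname is a known Dataverse installation."""
    host = urlparse(url).hostname  # lowercased by urlparse, None if absent
    return host is not None and host in known_hosts
```

Anything that does not match is treated as "not Dataverse", which is the conservative choice for a platform that goes on to execute the registered content.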
17:21 pameyer xarthisius: whole tale will be executing code provided by dataverse, right?
17:24 xarthisius pameyer: in a sense yeah, if you have a dataset with code/data on dataverse, you can use whole tale to "grab" it and run it interactively
17:25 xarthisius AJPS is a good example we like to use in this context
17:26 xarthisius e.g. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/R3GZZW , Odum Institute is working on enhancing those with computational environment description
17:26 pameyer your "vetting" comment got me thinking....
17:27 xarthisius we run what a SecOps person would call a "platform for a remote arbitrary code execution" ;)
17:28 pameyer yup - that ;)
17:28 pameyer are you thinking about blocking or limiting outgoing network connections?
17:29 xarthisius thinking: yes, having it implemented and fully secured: ...
17:29 pameyer .... is always a work in progress
17:29 pameyer glad it's on your mind though
17:30 xarthisius it's hard though, the system is by design meant to give a remote shell...
17:31 xarthisius things like bitcoin mining tend to happen, not on WT yet, but other projects designed similarly
17:31 pameyer right - but if it's all containerized, you should be able to set up the container hosts to route outgoing traffic through some kind of proxy
17:32 xarthisius or kill altogether, but then I have to explain to PIs why they can't wget things into containers...
17:32 pameyer yeah; that's the drawback
17:35 pdurbin xarthisius: good point about the whitelist.
17:36 pdurbin xarthisius: but one thing you might want to think about is that anyone can create an account on Harvard Dataverse (and I think two other installations) and publish a dataset without any curation or oversight. When we find spam has been published, we delete it.
17:38 poikilotherm joined #dataverse
17:39 poikilotherm Hey guys :-)
17:40 pdurbin hey
17:40 poikilotherm pdurbin - any thoughts on the registry stuff?
17:41 poikilotherm Dunno if you saw my first commits on the repo ;-)
17:42 poikilotherm https://github.com/IQSS/dataverse-kubernetes
17:42 pdurbin My first thought is that I created https://hub.docker.com/u/iqss/ and it's a mess and I'm sure it's so smart to repeat this with https://quay.io :)
17:42 pdurbin I did not. Thanks!
17:43 poikilotherm It is still in its very early stages... ;-)
17:43 pdurbin whoops! I meant to say I'm *not* so sure it's smart to do the same thing... to create another "iqss" org with a bunch of junk under it
17:44 poikilotherm But that container should almost be usable
17:45 poikilotherm Well, if you want I can add my junk to your junk :-D
17:45 poikilotherm Dunno if you feel comfortable sharing a password for that account
17:45 poikilotherm I could of course use a new org
17:46 poikilotherm If you guys feel like this should not be labeled with IQSS, just raise a finger ;-)
17:53 poikilotherm joined #dataverse
18:03 pdurbin poikilotherm: I'm happy to let you push to "iqss" on Docker Hub if you want. While I'm in there I should remove access from the Red Hat interns. :)
18:03 poikilotherm Ok
18:03 poikilotherm Sounds good
18:04 pdurbin What's your Docker Hub username?
18:06 pdurbin poikilotherm: ^^
18:06 poikilotherm Hmm I think I don't have one yet
18:06 pdurbin ok
18:07 pdurbin I'm doing five other things right now so please take your time. :)
18:08 poikilotherm Done
18:08 poikilotherm ;-)
18:08 pdurbin Want me to guess? :)
18:09 poikilotherm *G*
18:09 poikilotherm You have 3 chances
18:09 poikilotherm But I don't think you'll need all of em
18:11 pdurbin If it's not "poikilotherm" let me know and I'll change it. Added.
18:12 poikilotherm Thx
18:12 poikilotherm To quote Portal: "This was a triumph... HUGE success."
18:15 poikilotherm pdurbin: any chance you can grant permission to create repos for me?
18:25 pdurbin poikilotherm: you're asking me to make you an "owner". I guess I could. But when the intern wanted a new Docker Hub repository he just told me the name he wanted ("init-container") and I made it for him. Would that be ok?
18:25 poikilotherm As you prefer
18:26 poikilotherm Could you please create one "dataverse-k8s"?
18:26 poikilotherm I can fiddle around with that one, right?
18:27 pdurbin poikilotherm: sure, please fiddle away: https://hub.docker.com/r/iqss/dataverse-k8s
18:28 poikilotherm Thx!
18:28 poikilotherm Can you give me more rights on that repo?
18:28 poikilotherm I seem not to be able to link this with GitHub...
18:29 pdurbin just did, sorry, your team is "admin" now. Are you able to change the description?
18:29 pdurbin or add a full description?
18:29 poikilotherm This looks good
18:29 pdurbin cool
18:29 poikilotherm Thx :-)
18:31 pdurbin poikilotherm: are we going to demo all this at Open Science Days on Wednesday? :)
18:32 poikilotherm Oh oh you are putting pressure on me...
18:32 pdurbin heh
18:46 pdurbin Who's in a different country?
18:47 pdurbin poikilotherm: can you please download a couple files from http://ec2-54-175-102-39.compute-1.amazonaws.com:8080/ ?
18:47 pdurbin icarito[m] jri pmauduit: you too please, if you're around :)
18:48 poikilotherm pdurbin: done
18:48 poikilotherm Download both beautiful birds
18:48 poikilotherm *Downloaded*
18:50 pdurbin thanks!
19:11 poikilotherm pdurbin: did something arise from http://irclog.iq.harvard.edu/dataverse/2018-10-16#i_76113 ?
19:11 poikilotherm I'm seeing those errors now, too
19:13 pameyer I saw them earlier this week; but they didn't appear to be fatal
19:13 poikilotherm Nope, but they are really cluttering the logs
19:14 pameyer not clear on why junit stuff is getting into the distribution war file
19:14 * poikilotherm looks into this. Suspicious about some wrongly set scopes
19:15 pameyer my suspicion is maven hating me - sounds like you've got better ideas ;)
19:15 poikilotherm Once Maven finishes downloading, I will ask my dear friend dependency:tree
19:16 poikilotherm Wow - when was the springframework added as a dep?
19:16 pameyer it looked like all top-levels to me
19:16 pameyer the baggit stuff added spring ...
19:16 poikilotherm Meh
19:16 poikilotherm bloaty bloat
19:16 pdurbin It looks like all I wrote was "bleh". Do you want to create an issue poikilotherm ?
19:17 * poikilotherm hides beyond a wall, only a steep voice can be heard... "Maybe?"
19:18 pdurbin We're kind of blind to noise in logs too, unfortunately.
19:18 poikilotherm Does that BagIT stuff really need to live within the same codebase?
19:18 pameyer ... everybody gets a ViewExpiredException
19:18 poikilotherm Dude this stuff is HUGE
19:20 pdurbin poikilotherm: oh, but speaking of blind spots, we adjusted Slack so that we see comments on pull requests now. Not sure about reviews on pull requests.
19:20 pameyer the war file is still smaller than the cent7 base docker image
19:22 poikilotherm joined #dataverse
19:23 poikilotherm pdurbin: good to know :-)
19:23 poikilotherm pameyer: hopefully we can ditch that sooner than later....
19:24 pdurbin Ditch what? Slack?
19:24 poikilotherm My dataverse-k8s image has about 480 megs
19:25 poikilotherm pdurbin: that ditching was about the centos base image
19:25 pdurbin Oh. :)
19:26 pdurbin But I'm still thinking Gitter might be a good idea. And I hear there's a chat plugin for Discourse.
19:27 pameyer glassfish4 seems happy in alpine; but I haven't tried dataverse with that configuration
19:27 poikilotherm Err... pameyer: sry to take my words back...
19:27 poikilotherm Centos 7 base image ~80MB
19:27 poikilotherm Dataverse 4.10.1 War ~ 190M
19:28 pameyer `centos                        7                   1e1148e4cc2c        8 weeks ago          202MB` from `docker images` on my box
19:30 poikilotherm Hmm OK so this is the compressed size on Docker Hub
19:31 poikilotherm The WAR is compressed, too. Let me see what this is uncompressed
19:31 poikilotherm Ok 198MB
19:31 pameyer poikilotherm: was more a passing observation than anything else
19:31 poikilotherm ;-)
19:32 poikilotherm Largest dep jars:
19:32 poikilotherm 2,2M    guava-16.0.1.jar
19:32 poikilotherm 2,5M    pdfbox-2.0.11.jar
19:32 poikilotherm 2,5M    xmlbeans-3.0.1.jar
19:32 poikilotherm 2,6M    poi-4.0.0.jar
19:32 poikilotherm 2,7M    xalan-2.7.0.jar
19:32 poikilotherm 3,1M    icu4j-3.4.4.jar
19:32 poikilotherm 4,0M    bcprov-jdk15on-1.60.jar
19:32 poikilotherm 4,0M    cdm-4.5.5.jar
19:32 poikilotherm 4,1M    primefaces-6.2.jar
19:32 poikilotherm 4,9M    grib-4.5.5.jar
19:32 poikilotherm 6,2M    poi-ooxml-schemas-4.0.0.jar
19:32 poikilotherm 8,6M    ehcache-2.10.1.jar
19:32 poikilotherm 65M     aws-java-sdk-bundle-1.11.172.jar
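A size listing like the one above can be reproduced for any unpacked WAR by sorting the jars in its dependency directory by size. The sketch below is one way to do that, not the command poikilotherm actually ran; the function name and the directory you pass in are assumptions (in a standard WAR layout the dependency jars live under WEB-INF/lib).

```python
from pathlib import Path


def largest_jars(lib_dir, limit=10):
    """Return (size_in_bytes, file_name) pairs for the biggest jars, largest first."""
    jars = [(p.stat().st_size, p.name) for p in Path(lib_dir).glob("*.jar")]
    return sorted(jars, reverse=True)[:limit]
```

Calling largest_jars on the WEB-INF/lib directory of an extracted WAR gives the same kind of ranking, in bytes rather than rounded megabytes.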
19:33 poikilotherm Ok, my friend dependency:tree is telling me that junit stuff is included due to wrong scopes.
19:33 poikilotherm Let me see if I can fix that
19:43 poikilotherm https://github.com/IQSS/dataverse/issues/5501
19:43 poikilotherm pdurbin can you assign me?
19:43 poikilotherm The PR is coming in a few seconds...
19:44 pdurbin sure, done
19:46 poikilotherm Just created https://github.com/IQSS/dataverse/pull/5503
19:47 poikilotherm Could you drag this to code review?
19:49 pdurbin poikilotherm: sure. Done. And thanks for downloading those files. You added "de" to https://github.com/IQSS/dataverse/commit/36fa81a for us. :)
19:51 poikilotherm LOL
19:54 jri joined #dataverse
20:31 pdurbin poikilotherm: thanks for that pull request. I moved it to QA.
20:35 poikilotherm Sure :-)
20:38 poikilotherm pameyer: Did I understand #5443 correctly that I can just use Solr 7.3.1?
20:38 poikilotherm Couldn't find code changes besides the update of the dep
20:42 pameyer poikilotherm: there shouldn't have been any code changes; just a minor version bump
20:43 jri joined #dataverse
21:25 poikilotherm Hey pdurbin, could you plz create another dockerhub repo, "solr-k8s"?
21:31 pdurbin poikilotherm: sure, here you go: https://hub.docker.com/r/iqss/solr-k8s
21:31 poikilotherm THX
21:31 pdurbin poikilotherm: do you have an installation of Kubernetes I can try this on?
21:32 poikilotherm Nope, sry. You could try Minikube on your Mac
21:32 poikilotherm Beware - this is not in any way functional
21:32 poikilotherm I am getting the pieces together
21:33 pdurbin ok, I've never used minikube but my understanding is that minishift (which I've used) is derived from it
21:33 poikilotherm That's entirely possible
21:33 pameyer pretty similar
21:34 pameyer all the commands have different syntax, but it seems like there's a pretty close to 1:1 mapping
21:34 pdurbin yeah
21:35 pameyer as far as I understand it, you can use kubernetes configs with openshift commands; but I don't think the other way around works
21:35 poikilotherm I dunno if you can reuse the Kubernetes YAML files right away
21:35 pameyer I also haven't actually tried it
21:36 pdurbin When we created our "ec2 create" script ( http://guides.dataverse.org/en/4.10.1/developers/deployment.html ) we very briefly entertained the idea of using Docker. But we decided to just use plain old ec2 instances. Works great.
21:37 pdurbin poikilotherm: but I can give you access to our AWS stuff if you want to try to get your k8s stuff spun up on AWS. Just lemme know. :)
21:37 poikilotherm uuuuh that's tough work ;-)
21:37 poikilotherm Would need to install k8s on that first
21:37 pdurbin I think AWS supports Kubernetes these days.
21:38 poikilotherm I read somewhere that some ansible playbooks exist for this, but I haven't looked into this yet
21:38 poikilotherm Oh, does it?
21:38 pdurbin Amazon Elastic Container Service for Kubernetes (EKS): https://aws.amazon.com/kubernetes/
21:38 poikilotherm Yeah
21:39 poikilotherm Do you see any prices in the AWS console?
21:39 pdurbin Danny gets the bill.
21:40 poikilotherm Ah I see
21:40 poikilotherm https://aws.amazon.com/de/eks/pricing/
21:40 pdurbin but like I said, just let me know if you'd like an account on our AWS org or whatever
21:40 poikilotherm $0.20/h per cluster
21:40 poikilotherm Plus EC2 resources
21:41 poikilotherm Tempting...
21:42 poikilotherm Maybe we can have a look at this on Monday/Tuesday
21:42 poikilotherm ;-)
21:46 pdurbin sure
21:51 pdurbin poikilotherm: when your stuff is ready you can reply to Jurn and ask him to try it: https://groups.google.com/d/msg/dataverse-community/kERoDtt95PE/bfXbMPAxBwAJ :)
22:39 poikilotherm Wow Github is so slow sometimes..........
22:40 poikilotherm Downloading dvinstall.zip already taking 15 minutes
22:41 pameyer that does seem unusually slow
22:42 pameyer but I generally build dvinstall instead of downloading it
22:42 poikilotherm Yeah... building the docker image here from the release for now
23:25 poikilotherm Hey pdurbin still around?
23:30 poikilotherm Seems like everybody is out for today... Cu monday @pdurbin. Tomorrow I'll do some roofing ;-)
