IRC log for #dataverse, 2017-09-19

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
06:06 JonathanNeal joined #dataverse
11:38 donsizemore joined #dataverse
13:16 rebecabarros joined #dataverse
13:31 rebecabarros Hello, pdurbin and everyone else in the development team. I opened a ticket (#253307) describing my problem when trying tabular ingest with large files. The server log shows that the ingest process is successful and the .tab file is created, but after that I get the message "Tabular Data Ingest Failed" on the dataset page.
13:41 andrewSC joined #dataverse
14:24 pdurbin rebecabarros: thanks for reporting that bug and uploading the log file
14:49 Slava21 joined #dataverse
14:49 Slava21 left #dataverse
14:50 dataverse-user joined #dataverse
15:03 bjonnh that said it is really cool, there are systems to wait for the service to be ready
15:03 bjonnh etc
15:05 pdurbin yeah, it's neat
15:06 pdurbin I think I broke my Minishift installation though. Commands that were working yesterday stopped working.
15:23 bjonnh did you eval again?
15:23 bjonnh eval $(./minishift oc-env)
15:23 bjonnh ?
15:28 bjonnh pdurbin: I have to deploy a wiki today, I'm trying to do it using openshift so I can start to understand how it works
15:40 pdurbin bjonnh: I recommend deploying the "nodejs-ex" example by following https://docs.openshift.com/online/getting_started/basic_walkthrough.html . That's what I did and it worked. I was able to make a change to the index.html file and see the change.
15:56 bjonnh yes I did that
15:56 bjonnh But it is really not clear about how you push existing docker images
15:56 bjonnh etc
15:57 pdurbin Push existing docker images where? So far I've only used Docker images that are already published to DockerHub.
16:09 bjonnh I pushed my own version of mediawiki
16:09 bjonnh on the docker hub
16:09 bjonnh but then I had to push these into the minishift repository
16:09 pdurbin Ok. I haven't created a DockerHub account yet.
16:09 bjonnh (couldn't find a way to use a dockerhub one)
16:09 bjonnh so I pulled and then pushed
16:12 bjonnh but now I added a postgresql from a template
16:12 bjonnh and I don't see what I should do to create its image stream
16:13 pdurbin bjonnh: you're trying to get mediawiki working?
16:17 bjonnh yep
16:17 bjonnh I have it working as two docker containers
16:17 bjonnh for now
16:17 pdurbin nice
16:17 bjonnh And it seems I'm getting it working in openshift
16:17 bjonnh really close to it
16:18 pdurbin lemme know when you want to switch to my branch where I'm trying to get Dataverse working in Minishift/Openshift :)
16:20 bjonnh yep
16:20 bjonnh I have issues getting my apache to listen to 80
16:20 bjonnh 13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
16:28 bjonnh ok you can't listen to ports < 1024
16:28 bjonnh and it seems that you can't run containers as root
16:28 pdurbin There's a trick to run containers as root.
16:32 bjonnh yeah but that's just to create the pid file for apache
16:32 bjonnh not worth it
16:32 bjonnh also it is a bad idea
16:32 pdurbin yeah
16:42 bjonnh I can't manage to get that apache started as non-root
16:43 pdurbin Want to switch to my branch? :)
16:49 cwillis joined #dataverse
16:50 pdurbin cwillis: welcome!
16:51 cwillis Good day, pdurbin
16:55 pdurbin cwillis: do you know bjonnh? He knows more about Docker than I do.
16:56 pdurbin Which isn't saying much. :)
16:56 cwillis No, I don't believe so.
16:56 pdurbin cwillis: here's the list of who's who in this channel: https://docs.google.com/spreadsheets/d/16h3jv24usMGq18495C-JA-yNcQCKiKDa65MTraNDd7k/edit?usp=sharing
16:58 bjonnh o/
16:59 cwillis pdurbin: Handy, thank you.  I'm happy to see if there's some way I can help with the OpenShift deployment.  I haven't used OpenShift in quite a while, and never to deploy our (NDS Labs) Dataverse images. Anything you can share about your current progress outside of what's in Github?
17:00 pdurbin cwillis: at the moment I'm trying to figure out where to put $POSTGRES_PASSWORD in my conf/openshift/openshift.json file.
17:00 bjonnh I got apache running
17:00 bjonnh working on parsoid
17:01 bjonnh pdurbin: can't you use shared secrets?
17:01 bjonnh I created a postgres pod and then use its secrets to populate my envs
17:01 pdurbin bjonnh: I could try that. I'm not really sure what I'm doing.
17:02 cwillis pdurbin: Where can I find your openshift.json?
17:03 pdurbin cwillis: here you go, thanks: https://github.com/IQSS/dataverse/compare/4040-docker-openshift
17:04 pdurbin Please note that I've stubbed out a Dockerfile each for glassfish, solr, etc. but right now I'm using the NDS Labs images on DockerHub.
17:05 pdurbin I also added some stuff about OpenShift to our dev guide in that branch. If that helps you understand where my head is. :)
17:08 cwillis Great, thanks. I'll take a look.
17:09 pdurbin cwillis: sure. It's awesome to hear that you've played with OpenShift before.
17:16 pdurbin What's weird is that when I try to put $POSTGRES_PASSWORD into my openshift.json file, I start getting "Required service Postgres not running" so I feel like I'm taking a step backwards. :/
17:24 bjonnh well I can't get my pods to talk to each other
17:44 pdurbin I just left a comment about how I tried and failed to add environment variables: https://github.com/IQSS/dataverse/issues/4040#issuecomment-330617208
18:02 donsizemore joined #dataverse
18:16 bjonnh I'm slowly getting there
18:16 bjonnh I'm using the GUI though
18:16 bjonnh I hope I'll be able to dump the stuff out later
18:34 pdurbin bjonnh: I documented my use of `oc logs` here if it's helpful: https://github.com/IQSS/dataverse/commit/3ab921c
19:05 bjonnh pdurbin: FYI, localhost isn't usable in the pods
19:06 pdurbin no? ok
19:06 pdurbin oh hey, donsizemore. how's it going?
19:06 bjonnh I had to change my parsoid config to 127.0.0.1
19:08 pdurbin ah, ok
19:08 bjonnh I have to see if it is specific for that image though
19:19 bjonnh pdurbin: mediawiki up and running
19:19 bjonnh I'm afraid all of that will be way more complicated when using a real install of openshift
19:22 pdurbin bjonnh: nice. Want to hop on my branch and help me? It's called 4040-docker-openshift.
19:22 bjonnh yep
19:23 pdurbin bjonnh: cool. Are you on a Mac?
19:24 * bjonnh clones
19:24 bjonnh no
19:24 bjonnh linux
19:24 pdurbin hmm. ok
19:24 bjonnh does that matter?
19:24 pdurbin cwillis: how about you? Are you on a Mac?
19:24 pdurbin Well, I'm wondering what the best way to install Docker on a Mac is.
19:25 bjonnh oh
19:25 bjonnh no idea
19:26 bjonnh replace mac os by linux maybe? ;)
19:26 donsizemore joined #dataverse
19:26 bjonnh I remember checking that for someone
19:26 bjonnh it depends on which version of mac os you are using
19:26 bjonnh there is a VM approach for older macos
19:26 bjonnh and a native for new ones
19:27 pdurbin I'm on OS X 10.11 (El Capitan)
19:27 bjonnh https://www.docker.com/docker-mac ?
19:27 bjonnh also what's the current status for dataverse-openshift?
19:38 bjonnh ok I'm going to make some corrections on your doc
19:39 bjonnh and I don't understand why you do: $(oc get po -o json | jq '.items[] | select(.kind=="Pod").metadata.name' -r | grep -v dataverse-glassfish-1-deploy)
19:39 bjonnh In my case, the only one I have is dataverse-glassfish-1-deploy
19:39 pdurbin huh, that works? lemme try
19:41 pdurbin oh, oh, the one I'm trying to get logs for or rsh into has a crazy, variable name
19:41 pdurbin that's why I go look it up
19:41 bjonnh echo $(oc get po -o json | jq '.items[] | select(.kind=="Pod").metadata.name' -r )
19:41 pdurbin dataverse-glassfish-1-wgh87 or whatever
19:41 bjonnh dataverse-glassfish-1-deploy
19:41 bjonnh for me ^
19:41 bjonnh ohh you mean, while it is being deployed?
19:41 bjonnh it took a few seconds for me
19:42 pdurbin yeah, it takes a few seconds
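Note: the lookup discussed above (find the application pod with its generated name while skipping the short-lived deployer pod) can be sketched as a small filter. This is a minimal sketch, not the command from the dev guide; the function name `select_app_pods` is hypothetical, and it assumes deployer pods always end in `-deploy`, as in the names quoted in this log.

```shell
#!/bin/sh
# select_app_pods (hypothetical helper): read pod names on stdin, one per
# line, and drop deployer pods, assumed here to end in "-deploy".
select_app_pods() {
    grep -v -- '-deploy$'
}

# Assumed usage with the jq pipeline quoted in the log:
#   oc get po -o json | jq -r '.items[].metadata.name' | select_app_pods
```

The surviving name can then be passed to `oc logs` or `oc rsh`.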
19:42 bjonnh --> Waiting up to 2m0s for pods in rc dataverse-glassfish-1 to become ready
19:42 bjonnh error: update acceptor rejected dataverse-glassfish-1: pods for rc "dataverse-glassfish-1" took longer than 120 seconds to become ready
19:42 bjonnh hmmm
19:42 bjonnh or maybe it did just fail
19:42 pdurbin yeah, I get that a lot. I just keep deleting project1 and retrying
19:43 bjonnh hah
19:43 bjonnh ok
19:45 pdurbin anyway, I run `oc logs -c ndslabs-dataverse dataverse-glassfish-1-wgh87` or equivalent and see `Invalid resource : jdbc/VDCNetDS__pm.` which is because `create-jdbc-connection-pool` is failing because the environment variables are null ("missing property value: password=").
19:46 pdurbin Does that make sense? That's the current status. Broken.
19:49 bjonnh that makes sense
19:49 bjonnh I wonder why I can't see the project in the gui
19:49 pdurbin are you logged in as "system" or "developer"
19:53 bjonnh developer
19:53 pdurbin huh, if I log into the GUI with "developer" I can see the apps I create with `oc` which also uses the "developer" account
19:56 pdurbin This is what I see when I run `oc get all`: https://paste.fedoraproject.org/paste/sH2du0YDncJlgb4Tiz89JQ/raw
19:57 bjonnh oc get all works
19:57 pdurbin The GUI is nice. Otherwise you're sort of flying blind.
19:59 bjonnh oh wrong user
19:59 bjonnh damn
19:59 pdurbin You can see it now?
20:11 bjonnh yes
20:11 bjonnh I think the timeout should be > 120s
20:11 pdurbin ok, I'm not sure how to change it
20:12 bjonnh timeout line when doing edit yaml
20:12 bjonnh however I don't understand why I don't see all the pods already
20:12 bjonnh /entrypoint.sh: line 101: cd: //dvinstall: No such file or directory
20:13 bjonnh sounds like something is missing here
20:13 pdurbin bjonnh: ah, you get that dvinstall error when you don't run the container as root
20:15 bjonnh
20:16 bjonnh cd ~/dvinstall
20:16 bjonnh that's the culprit
20:16 bjonnh in entrypoint.sh
20:17 bjonnh https://github.com/nds-org/ndslabs-dataverse/blob/master/dockerfiles/dataverse/entrypoint.sh#L101
20:17 pdurbin yep
20:17 pdurbin my workaround for now is to let the container run as root
20:17 bjonnh there is no home in these images
20:17 bjonnh I'll remake a new container instead
20:18 pdurbin bjonnh: ah, that's what I'd like help with. How to remake the container.
20:19 bjonnh wait
20:19 bjonnh why should the setup ask for being root?
20:20 bjonnh cwillis: https://github.com/nds-org/ndslabs-dataverse/blob/master/dockerfiles/dataverse/dataverse-init#L72
20:20 bjonnh any idea why?
20:21 pdurbin bjonnh: the version of Dataverse is so old that the installer used to require root. It doesn't anymore.
20:21 bjonnh oh
20:21 bjonnh should we instead redo everything with the latest one?
20:22 bjonnh there is a lot of stuff in this init that doesn't make sense with kubernetes
20:22 bjonnh like testing if pgsql is up
20:22 cwillis Sorry, was sucked into meetings
20:22 pdurbin cwillis: no worries, we're just talking about what to do next
20:23 pdurbin bjonnh: in that branch I stubbed out a Dockerfile each for glassfish, solr, etc.
20:24 cwillis bjonnh: The ndslabs images were based on 4.2.3 and I hackishly separated out the installation and configuration steps.  The root user wasn't a concern for us. These images can all easily be updated -- but I'd prefer NDS Labs to use an official set of Dataverse images.
20:25 bjonnh cwillis: yes that's entirely the idea here
20:26 cwillis We're still on 1.5.x, but from my understanding Kubernetes still doesn't handle startup dependencies. So, for example, if you need Postgres to be running before Dataverse starts, we loop until we can connect to the PG instance.  Otherwise, the container fails and restarts, which is also an option.
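Note: the "loop until we can connect" approach cwillis describes can be sketched as a generic retry helper. This is a minimal sketch under stated assumptions, not the code in the NDS Labs images: the function name `wait_for` is hypothetical, and the host name in the usage comment is a placeholder.

```shell
#!/bin/sh
# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND until it succeeds or MAX_ATTEMPTS is
# reached. Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Pause between attempts, but not after the final failure.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Assumed usage: pg_isready ships with the PostgreSQL client tools; the
# host name below is a placeholder, not taken from openshift.json.
#   wait_for 30 2 pg_isready -h dataverse-postgresql-service -p 5432
```

If the helper gives up, the entrypoint can simply exit non-zero and let the platform restart the container, which is the other option mentioned above.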
20:28 bjonnh openshift does handle them no?
20:29 cwillis That I don't know.
20:30 bjonnh well making it restart is fine
20:30 bjonnh I think…
20:31 pdurbin For what it's worth, I just created https://hub.docker.com/u/iqss/ but it's empty. I don't really know how to create Docker images. Also, I'm interested in someday creating Docker images for each release of Dataverse but I'm also interested in creating Docker images of arbitrary branches that pull requests are based on. I would run tests against them.
20:32 bjonnh cwillis: https://docs.openshift.org/latest/dev_guide/templates.html#waiting-for-template-readiness
20:34 cwillis pdurbin: Creating the Docker image requires creating a Dockerfile (e.g., https://github.com/nds-org/ndslabs-dataverse/blob/master/dockerfiles/dataverse/Dockerfile) then doing a "docker build" and "docker push" to that repo.  You can also configure Dockerhub to automatically build a Github repo containing a Dockerfile.
20:34 cwillis For Dataverse, I think automatic builds would make sense in the long run.
20:35 bjonnh agreed
20:35 bjonnh but the integration of github and dockerhub isn't that easy
20:35 pdurbin Ok, does it make sense to do a "docker build" and "docker push" to DockerHub for each pull request so we can test them before merging?
20:35 bjonnh and I recommend that you use only publicly accessible repos
20:35 bjonnh and it doesn't work with "company" github, you have to run it on a simple account
20:36 bjonnh at least I didn't manage to get it working
20:36 cwillis pdurbin: You can automatically build tags and branches as well.  We usually do this with the Docker image tag the same as the Github tag.  We use this model extensively with NDS Labs.
20:36 bjonnh pdurbin: you can build a dataverse:test
20:36 bjonnh pdurbin: and push to that
20:36 bjonnh that way you don't annoy people using the dataverse:latest
20:37 dataverse-user joined #dataverse
20:38 pdurbin cwillis: can you please link me to a example of a tag or a branch under https://hub.docker.com/u/ndslabs/ ?
20:39 pdurbin nevermind, I think I found one: https://hub.docker.com/r/ndslabs/deploy-tools/tags/ ... latest, 1.0.12, NDS-343.82bece3ac972, etc
20:39 pdurbin so I could create a DockerHub tag based on the name of a branch which is great
20:39 bjonnh yes
20:40 pdurbin so my first DockerHub tag could be the branch I'm on: "4040-docker-openshift"
20:41 bjonnh you could do that yes
20:43 bjonnh « If this is a production environment, you may want to change it back to something more secure, such as “password” or “md5”, after the installation is complete. »
20:43 bjonnh hah
20:44 bjonnh at first, I was reading "you should set your password as…"
20:45 bjonnh pdurbin: https://blog.openshift.com/running-java-apps-in-the-cloud-with-glassfish-and-a-paas/
20:45 bjonnh (it is really old)
20:51 bjonnh pdurbin: do you want to keep glassfish?
20:51 bjonnh pdurbin: I was reading: https://github.com/IQSS/dataverse/issues/2628
20:52 pdurbin bjonnh: well, we have to deploy dataverse.war to some sort of app server. We've always used Glassfish.
20:52 bjonnh I see you had success with paraya?
20:52 pdurbin Don did.
20:52 bjonnh payara
20:53 pdurbin Don had success with Payara.
20:54 bjonnh https://hub.docker.com/r/payara/micro/
20:54 bjonnh this sounds… cool
20:54 bjonnh read: Build a new docker image to run your application
20:54 bjonnh FROM payara/micro
20:54 pdurbin I ran `docker build` and saw "Successfully built f13fa38ec3ee" but `docker ps` doesn't show anything. Hmm.
20:55 bjonnh COPY dataverse.war /opt/payara/deployments
20:55 bjonnh pdurbin: yes docker build just made the image
20:55 bjonnh you have to docker run
20:55 bjonnh docker run f13fa38ec3ee
20:55 bjonnh (note that you can also give them names)
20:55 bjonnh you can also do docker tag f13… pdurbin/mytest
20:55 bjonnh IIRC
20:55 pdurbin ok, how can I see "f13fa38ec3ee" if it scrolled past?
20:56 pdurbin ah, `docker image ls`
20:59 cwillis pdurbin: I've had some luck deploying in OpenShift locally -- JSON file and commands here: https://gist.github.com/craig-willis/d3a74df01d0cf773a1d04dc825fdad3c
20:59 cwillis The expose isn't right yet, but working on it.
20:59 cwillis The only change from your JSON file was to add the env section under the glassfish app with the value from the postgres section.
21:01 pdurbin cwillis: thanks! I thought I tried that but let me try your file (but I'll change the SMTP_HOST).
21:04 bjonnh any reasons you use json for the config?
21:04 bjonnh I've only seen yaml stuff for openshift
21:05 pdurbin The examples that danmcp keeps linking to in issue 4040 are in JSON.
21:05 cwillis We typically use yaml for Kubernetes, but it accepts both.  I assume OpenShift is the same.
21:06 pdurbin cwillis: wow, I think your JSON file worked. I'm not sure what I was doing differently.
21:11 pdurbin When I rsh into the glassfish container this works: curl localhost:8080
21:11 pdurbin which is great, but I can't get "expose" working either. I tried this: oc expose svc/dataverse-glassfish-service
21:24 cwillis pdurbin:  My initial mistake was to use the env var names from the postgres container directly (POSTGRESQL_PASSWORD instead of POSTGRES_PASSWORD expected by init-glassfish)
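Note: cwillis resolved the name mismatch in the JSON env section. An alternative, shown only as a sketch, is a fallback inside a startup script one controls; the function name `resolve_pg_password` is hypothetical and nothing like it is known to exist in init-glassfish.

```shell
#!/bin/sh
# resolve_pg_password (hypothetical helper): print POSTGRES_PASSWORD if it
# is set and non-empty; otherwise fall back to POSTGRESQL_PASSWORD, the
# name used by the postgres container. Prints nothing if both are empty.
resolve_pg_password() {
    printf '%s' "${POSTGRES_PASSWORD:-${POSTGRESQL_PASSWORD:-}}"
}

# Assumed usage near the top of a startup script:
#   POSTGRES_PASSWORD=$(resolve_pg_password)
#   export POSTGRES_PASSWORD
```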
21:26 bjonnh pdurbin: can you check on which ip it is listening ?
21:26 bjonnh pdurbin: you have to make your container listen to 0.0.0.0 or OPENSHIFT_DIY_IP
21:26 bjonnh or something like that
21:28 pdurbin cwillis: yeah, I noticed that difference. I'm still not sure why my version didn't work but I think I'll push this.
21:29 pdurbin bjonnh: I'm pretty sure I'm able to expose that nodejs-ex app I forked. I guess I could test that.
21:29 cwillis pdurbin: On the expose issue:  you need a selector on the service to tell it which container to use.
21:30 pdurbin cwillis: oh, did you get it working?
21:31 bjonnh also I guess you want to build with oracle java?
21:32 bjonnh or can we go with openjdk now?
21:32 pdurbin On Mac I use Oracle Java but on Linux I tend to use OpenJDK.
21:32 cwillis pdurbin: yes.  I've updated the gist with the changed JSON.  You can also add the selector using "oc edit svc/dataverse-glassfish-service" (also in gist as yaml)
21:35 bjonnh I'm wondering if we could use a building docker that make a waf in a volume
21:35 bjonnh and then have a simple docker to serve it
21:36 pdurbin cwillis: huh. Ok, I'm adding "selector" and re-running.
21:41 cwillis pdurbin: Our system also auto-generates an admin password using the ADMIN_PASSWORD environment variable, which will be blank by default with the current OS json.  I'm testing now, but you'll probably want to set ADMIN_PASSWORD with "admin" or other default.
21:42 pdurbin It works!!
21:42 pdurbin cwillis: but yes, I can't log in. Will try that. Thanks!
21:45 bjonnh pdurbin: https://hub.docker.com/_/maven/
21:46 bjonnh $ docker run -it --rm --name my-maven-project -v "$PWD":/usr/src/mymaven -w /usr/src/mymaven maven:3.2-jdk-7 mvn clean install
21:46 bjonnh running that (by replacing with jdk-8) seems to build dataverse in the current directory
21:46 cwillis pdurbin: that did it for me. I've updated the gist, but adding ADMIN_PASSWORD env var should work.
21:49 pdurbin cwillis: ok, trying it now. Does it make sense to delete "project1" each time like I'm doing? I mentioned this in issue 4040. If there's a better way to iterate on the JSON file I'm open to it.
21:49 pdurbin bjonnh: sorry, what about maven?
21:50 bjonnh I was thinking about having a building system
21:50 bjonnh that makes the WAF
21:51 bjonnh and another container that uses it
21:52 cwillis pdurbin: you can always edit individual components of the spec directly (e.g., oc edit svc/x).  In Kubernetes, you can run "kubectl delete -f <file>" but "oc delete -f conf/openshift/openshift.json" doesn't work for me.  I've been deleting the project...
21:52 pdurbin bjonnh: WAF or WAR?
21:52 pdurbin cwillis: same. ok, thanks
21:54 cwillis bjonnh: Have you considered building the war and attaching as part of a Github release, then using the release artifact in the Dockerfile?  We've used the "build container" model (e.g., using maven to build), but it's not particularly satisfying.
21:56 bjonnh well pdurbin would be the one to know if it has been considered
21:56 bjonnh I like the idea of being able to have the WAR on github directly as a release
21:56 bjonnh and have a "nightly" for example
21:57 bjonnh pdurbin: I'm trying to use that payara thing see if we can get rid of glassfish
22:00 pdurbin sorry, I was excitedly posting screenshots to all this stuff working at https://github.com/IQSS/dataverse/issues/4040
22:00 bjonnh :D
22:01 bjonnh I would make "latest" point to the latest stable
22:01 pdurbin Yes, I'd like the war file to be built based on the branches behind pull requests. We could also build a nightly war file from the "develop" branch, which is our integration branch.
22:01 bjonnh because that's usually what I expect from latest
22:01 pdurbin For us, the stable branch is "master" which we keep the same as the current release.
22:01 bjonnh ok
22:01 bjonnh I've only ever touched develop
22:02 pdurbin see also http://guides.dataverse.org/en/4.7.1/developers/version-control.html#branches
22:03 cwillis pdurbin: The "Required service Postgres not running" means that the Postgres container hasn't started completely by the time Dataverse starts.  In Labs Workbench, we have startup dependency ordering.  bjonnh noted that OpenShift has something similar (I've never used).  In the end, Kubernetes will restart containers if they fail -- in this case, my Dataverse/Glassfish container restarted once and Postgres was up.
22:04 cwillis So, you may continue to see this error sometimes during startup, but it may resolve with restarts.  It would be worth looking into OpenShift's startup ordering to see if you can hold off on starting the Dataverse/glassfish container until after Postgres is ready.
22:04 pdurbin cwillis: yeah, I'm realizing now that earlier I may have simply been impatient. My version may have worked if I had given it more time.
22:05 cwillis Sorry for the unclear error messages...
22:05 pdurbin yes, that's a good idea
22:05 pdurbin they were spot on when I hadn't even added a postgres container yet :)
22:06 bjonnh I'll have to ask donsizemore how he got payara running
22:08 bjonnh I get: Invalid resource : jdbc/VDCNetDS__pm
22:08 pdurbin bjonnh: if it helps, he put specific version numbers at https://github.com/payara/Payara/issues/532#issuecomment-315408770
22:08 bjonnh maybe I forgot a bit
22:08 bjonnh does the WAR file contain everything when I do mvn package?
22:09 pdurbin yep
22:09 pdurbin well, depends on what you mean by "everything" :)
22:10 cwillis pdurbin: I'm off -- mention me on github if I can be of further use.
22:10 pdurbin cwillis: thank you, sir!
22:10 bjonnh "Cannot find javadb client jar file"
22:10 bjonnh looks like I have a part missing
22:12 pdurbin hmm, not sure, I haven't really used payara
22:13 bjonnh I shouldn't have used the micro version maybe
22:17 bjonnh ok the install script is really doing a LOT of things I have no idea about
23:04 pdurbin yeah