IRC log for #dataverse, 2019-08-22

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
01:12 JonathanNeal joined #dataverse
01:12 bjonnh joined #dataverse
01:12 icarito[m] joined #dataverse
01:12 pmauduit joined #dataverse
01:12 yoh joined #dataverse
01:13 andrewSC joined #dataverse
01:14 larsks joined #dataverse
01:14 bricas_ joined #dataverse
01:14 JonathanNeal joined #dataverse
01:14 xarthisius joined #dataverse
01:15 pdurbin joined #dataverse
01:21 rigelk joined #dataverse
01:22 icarito[m] joined #dataverse
07:27 jri joined #dataverse
07:46 stefankasberger joined #dataverse
08:11 j-n-c joined #dataverse
10:33 pdurbin stefankasberger: hi! Do you have any more thoughts on Python examples of using Dataverse APIs? I think you might already have some examples in the docs for pyDataverse.
10:54 dataverse-user joined #dataverse
11:31 stefankasberger not really. i only use the basics in the following order: create dataverse, get dataverse, create dataset, get dataset, publish dataset, upload datafiles, delete datafiles, delete dataset, delete dataverse
11:49 pdurbin stefankasberger: ok. I document most of those in this new (unmerged) "getting started" section: http://guides.dataverse.org/en/6086-api-guide/api/getting-started.html
11:50 pdurbin I guess I'm wondering where we could put example Python scripts. I guess I could create a new dedicated repo for this. Maybe it could be called dataverse-python-scripts or something?
11:50 pdurbin And the community could contribute to it?
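Editor's sketch: the order of operations stefankasberger lists can be written down as a plan of HTTP calls. This is a minimal illustration using only the Python standard library, not pyDataverse itself. The paths follow the Dataverse native API as commonly documented and should be checked against the API guide for your version. `plan_requests`, `auth_headers`, and the example host are hypothetical names introduced here, and the "delete datafiles" step is left out because its endpoint is not given in the log.

```python
from urllib.parse import quote


def auth_headers(api_token):
    """Header used to pass an API token to the native API."""
    return {"X-Dataverse-key": api_token}


def plan_requests(base_url, parent_alias, alias, dataset_id):
    """Return (step, HTTP method, URL) tuples in the order given in the chat.

    Nothing is sent over the network; this only builds the plan.
    """
    api = base_url.rstrip("/") + "/api"
    parent = quote(parent_alias, safe="")
    dv = quote(alias, safe="")
    ds = quote(str(dataset_id), safe="")
    return [
        ("create dataverse", "POST", f"{api}/dataverses/{parent}"),
        ("get dataverse", "GET", f"{api}/dataverses/{dv}"),
        ("create dataset", "POST", f"{api}/dataverses/{dv}/datasets"),
        ("get dataset", "GET", f"{api}/datasets/{ds}"),
        ("publish dataset", "POST", f"{api}/datasets/{ds}/actions/:publish?type=major"),
        ("upload datafile", "POST", f"{api}/datasets/{ds}/add"),
        ("delete dataset", "DELETE", f"{api}/datasets/{ds}"),
        ("delete dataverse", "DELETE", f"{api}/dataverses/{dv}"),
    ]


if __name__ == "__main__":
    # Hypothetical host and identifiers, for illustration only.
    for step, method, url in plan_requests("https://demo.example.org", "root", "mydv", 42):
        print(f"{step:18} {method:6} {url}")
```

Each URL would be sent with the headers from `auth_headers(token)`. Keeping the plan as data makes it easy to review the sequence before running it against a real installation.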
13:47 pdurbin andrewSC bjonnh icarito[m] j-n-c jri larsks pmauduit stefankasberger xarthisius yoh: do any of you use Prometheus? If not, what do you use?
13:51 j-n-c pdurbin: I do not use Prometheus. Currently I do not use any monitoring tool. Some time ago I used nagios (don't know if their features are similar or apply to the same scenarios)
13:53 pdurbin I used to use nagios. And opsview, which is sort of a wrapper around nagios. I use Munin on my home server: http://munin.greptilian.com/greptilian.com/server3.greptilian.com/index.html
13:57 donsizemore joined #dataverse
13:58 j-n-c seems like a good tool. I will have a look at it. thks!
14:02 donsizemore @pdurbin I called mine "dataverse-toolbox" though it's nascent at best https://github.com/OdumInstitute/dataverse-toolbox
14:03 pmauduit pdurbin: we do, but not in a "dataverse" context :)
14:04 pmauduit we are using it with a 2-week retention for monitoring purposes
14:23 benilton joined #dataverse
14:27 benilton Hi guys, quick question: have you been able to implement unified authentication (shibboleth) with dataverse at your institutions? Any hints to share?
14:45 pdurbin donsizemore: right! I was thinking about your toolbox.
14:45 pdurbin benilton: hi! Yes, we have Shibboleth enabled for https://dataverse.harvard.edu
14:46 pdurbin pmauduit: would it be of interest if Dataverse could be integrated easily into a Prometheus installation? By that I mean we would document how to do it in the Dataverse Installation Guide.
14:47 benilton pdurbin: thanks! We, at UNICAMP, got 2 teams working on Shibboleth and it was a no-go for us... Just knowing it's possible should give me enough to ask them to look at this harder....
14:47 pdurbin donsizemore: also, are you ready for a diagram I just cooked up? :)
14:47 pdurbin benilton: it's possible. :) donsizemore also has shib running at https://dataverse.unc.edu
14:48 pdurbin shib is a pretty popular feature of Dataverse. I forget who else here uses it.
14:49 pdurbin I haven't heard of UNICAMP. Is it https://www.unicamp.br ?
14:50 benilton Yes, that's us (www.unicamp.br) ...
14:51 pdurbin Great! Welcome!
14:52 benilton :-) many thanks!
14:53 pdurbin benilton: we would love to get you on our map of installations some day: http://guides.dataverse.org/en/4.15.1/installation/config.html#putting-your-dataverse-installation-on-the-map-at-dataverse-org :)
14:55 benilton should we wait a bit until our dataverse is made available to our researchers? otherwise you'd be pointing to an empty installation
14:55 pdurbin Yeah, it's fine to wait. :)
14:56 pdurbin donsizemore: that reminds me, we have another installation to add to https://dataverse.org/metrics :)
14:56 andrewSC pdurbin: heard of it and one of the people i work with actually suggested we try it out for a project but nothing more
14:57 andrewSC fwiw archlinux uses grafana + zabbix which i think has some heavy overlap but i also think there was some discussion around if we should try and replace it with something else or not a while back..
14:57 pdurbin andrewSC: ok, there's a pull request by donsizemore here if you want to play with prometheus in the context of Dataverse: https://github.com/IQSS/dataverse-ansible/pull/96 . pmauduit: you might be interested in this as well.
14:57 benilton pdurbin: we have this internal requirement that anything that requires a username must use our university identities... so we must have shibboleth working to "go live"
14:57 donsizemore @pdurbin sure thing. hostname?
14:57 andrewSC mmmmmmmmm
14:57 andrewSC nice
14:58 pdurbin andrewSC: do you (or anyone here) have experience with grafana?
14:58 donsizemore @andrewSC all that PR does is stand up prometheus and do some standard OS monitoring... the JVM bits are phase 2
14:58 pdurbin benilton: that makes total sense. Auth is important. We're happy to help you get Shib working.
14:59 donsizemore @benilton are you willing to share the non-sensitive bits of your shib config via e-mail?
14:59 pdurbin donsizemore: even standard OS monitoring is a great start. I say merge it if it even half works. :)
14:59 andrewSC sorry for the radio silence guys.. Idk if I mentioned it before but there have been some serious changes in my life recently lol all good things for sure, but next week is my last week with NC State University
14:59 donsizemore @pdurbin i wanted you to review it =)
14:59 andrewSC Gonna be doing my own thing
14:59 pdurbin donsizemore: I did review it. :) I left you a comment. :)
14:59 donsizemore @andrewSC i'm happy to listen over david's dumpling, peking dumplings, or both
14:59 andrewSC contracts, consulting, etc
15:00 benilton I'm happy to share any info needed to make this work, guys... thanks a million!
15:00 donsizemore @andrewSC come to the blue side https://unc.peopleadmin.com/postings/166078
15:00 pdurbin andrewSC: oh! Are you going to appoint a successor in here? :) Good luck!
15:00 donsizemore @benilton if you'll e-mail your redacted shibboleth2.xml and dataverse-idp-metadata.xml to dls@email.unc.edu i'll be happy to take a look
15:00 andrewSC pdurbin: re: grafana, unfortunately not
15:01 andrewSC donsizemore: oh no joke! you're in durham?
15:01 donsizemore @andrewSC not THAT blue, you heathen.
15:01 pdurbin benilton: like donsizemore said, you can share non-sensitive stuff here (this channel is logged, as explained in the /topic). You can send sensitive stuff to support@dataverse.org and I can give donsizemore and others access to the ticket that gets created.
15:01 andrewSC ahhh hahah
15:01 benilton donsizemore: thanks a million! will work on this
15:01 andrewSC unc chapel hill!
15:01 benilton pdurbin: many thanks!
15:02 donsizemore @andrewSC Odum is a wonderful place to work. and we're hiring
15:02 andrewSC pdurbin: I've passed the gauntlet to a jr. dev we hired so he definitely knows about the channel and our past goings ons
15:02 pdurbin In Columbus there is only one Blue that we think about.
15:02 andrewSC i don't think he really likes irc though lol
15:02 pdurbin meh, who does? :)
15:03 andrewSC ;))
15:03 pdurbin andrewSC: are you still interested in https://github.com/IQSS/dataverse/issues/3565 ?
15:03 andrewSC donsizemore: lemme think on it
15:04 donsizemore @andrewSC UNC is on a different planet than NCSU, i've worked for both
15:04 andrewSC got some big, big deliveries next week for a couple teams + departure at NCSU so i'm super swamped right now
15:04 andrewSC yeah?
15:04 donsizemore must run, upgrading a "satellite" dataverse... back in a bit!
15:04 andrewSC kk
15:05 andrewSC pdurbin: i don't believe so
15:05 pdurbin benilton: so which would you prefer? Sharing details about your shib stuff here or in a support ticket?
15:06 donsizemore @benilton may e-mail me directly (if so desired) at dls@email.unc.edu
15:06 pmauduit pdurbin: no idea about the integration, which kind of metrics would you like to gather ?
15:06 pdurbin donsizemore: since you're still here... here's the installation to add to metrics: https://twitter.com/DalLibraries/status/1160182776148234240
15:07 pmauduit pdurbin: on other Java projects we make use of JMX to expose some specific metrics from our webapp
15:07 pdurbin pmauduit: ALL THE METRICS! At the moment we are tracking down a memory leak: https://github.com/IQSS/dataverse/issues/6035
15:07 pmauduit as it can be easily harvested by a collectd instance (with the jmx client plugin)
15:07 pmauduit oh ok
15:08 pdurbin pmauduit: speaking of JMX I'm in the middle of listening to https://www.javapubhouse.com/2019/08/episode-85-monitor-the-world-with-jmx.html and I'm getting inspired. :)
15:09 pdurbin I jotted down a couple JMX-related tools they mentioned: https://jolokia.org and https://github.com/statsd/statsd
15:10 pmauduit jolokia is about giving access to JMX via a webservice IIRC
15:10 pdurbin yes, JMX via REST
15:11 pmauduit but if you already have access to your jmx endpoint (inside your infrastructure you probably do), I cannot see the added value, maybe it's more convenient to curl to get the metrics
15:11 benilton donsizemore: I'll send you an email, ok?
15:13 pmauduit another thing, the applications we are monitoring are classic webapps running in jetty, I don't know if jetty provides the base metrics (the common ones like heap used, max, and so on) or if we have them by default just by configuring a jmx endpoint when launching the jvm
15:14 pdurbin pmauduit: right now (for that memory leak issue) we're using jstat but I want the metrics to be graphed in grafana or whatever.
15:14 pmauduit ok
15:15 pdurbin Instead of jstat->grafana I'm thinking JMX->Prometheus->Grafana. Does that make sense?
15:15 pdurbin (standup time, brb)
15:20 pmauduit pdurbin: that's what we are doing in production, seems legit from my side
15:20 pmauduit pdurbin: here is a template used in our rancher envs for generating a collectd config: https://github.com/camptocamp/docker-jmx-collectd/blob/master/templates/10jmx.conf.tmpl
15:21 pmauduit it might be a bit "regular servlet webapps" / rancher / prometheus specific, but ...
15:28 jri_ joined #dataverse
15:37 pdurbin pmauduit: Thanks! But what's collectd and what does it have to do with Prometheus? I'll Google it.
15:39 pdurbin I see "write_prometheus" at https://github.com/collectd/collectd :)
15:39 pmauduit pdurbin: in our setup, the JVM exposes a JMX endpoint which is harvested by collectd which then dumps metrics to prometheus
15:40 pdurbin Sounds fine. Whatever works. :)
15:40 pdurbin Is that how you recommend we monitor Dataverse?
15:42 pmauduit it could be an option, I'd probably go for this if we had a dataverse in our envs, which is currently not the case (yet)
15:44 pdurbin That's good enough for me. Are you using Grafana?
15:44 pmauduit yes
15:44 pmauduit we are using the prometheus datasource on our monitoring dashboards
15:44 pdurbin awesome
15:45 pdurbin donsizemore: ^^ step 2 (adding graphs)
15:51 pdurbin pmauduit: once https://github.com/IQSS/dataverse-ansible/pull/96 gets merged, how hard do you think it would be to add a Grafana dashboard for operating system metrics like memory and CPU usage and free disk space?
16:06 pmauduit I don't know exactly how it'd be set up only by looking at the PR, I should have a closer look at what the repository does exactly, but once you have a prometheus + grafana correctly set up, it is not a big deal to play with grafana to create a dashboard
16:07 pmauduit I think a good thing would be to first create the dashboard by hand in the grafana ui, then export it into json, and see if it can be automatically integrated by default for further usages
16:08 pdurbin pmauduit: ok, thanks. These days the way we are starting to use dataverse-ansible is to spin up Dataverse on AWS using the "ec2-create-instance.sh" script mentioned at https://github.com/IQSS/dataverse-sample-data
16:08 pdurbin pmauduit: that sounds like how I do jmeter stuff. Launch the GUI. Fiddle around. Export the config. Sounds fine. :)
16:08 pmauduit :)
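Editor's sketch: the "create by hand, export to JSON, reuse" workflow pmauduit suggests usually needs one small cleanup step before the export can be loaded elsewhere. The snippet assumes the exported dashboard JSON carries an instance-specific numeric `id` field that should be cleared for a fresh import. That assumption, the helper names, and the sample dashboard are illustrative and should be verified against the Grafana version in use.

```python
import copy
import json


def make_portable(dashboard):
    """Return a copy of an exported dashboard with its instance-specific id cleared.

    Assumption: the target Grafana treats a null id as "create a new dashboard".
    """
    portable = copy.deepcopy(dashboard)
    portable["id"] = None
    return portable


def to_json(dashboard):
    """Serialize with stable key order so the file diffs cleanly in version control."""
    return json.dumps(dashboard, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical minimal export, for illustration only.
    exported = {"id": 7, "uid": "abc", "title": "OS metrics", "panels": []}
    print(to_json(make_portable(exported)))
```

The resulting file could then be kept alongside the playbook so that the dashboard is installed by default, as proposed in the chat.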
16:09 pdurbin donsizemore: Danny started spinning up Dataverse on EC2 this week by following that README. Awesome stuff. :)
16:09 pmauduit funny, I did some ansible fiddling this afternoon (in an other topic, but ... :))
16:09 j-n-c left #dataverse
16:10 pdurbin Come to think of it, I do the same with Jenkins. Fiddle in the GUI, export the job as XML.
16:11 pmauduit I did the same, and used ansible also to persist my jenkins config :)
16:11 pdurbin :)
16:12 pmauduit (secrets + jobs / pipelines and so on)
16:12 pdurbin pmauduit: I think you should try out dataverse-ansible. All you need is a CentOS 7 server. I recently added a quickstart. I could even merge the prometheus stuff first if you want. :)
16:20 pmauduit it could give me a better insight on how to set it up, last time with the docker compo, I only kept a "bird's-eye" view of the stack ;)
16:22 pdurbin pmauduit: I'm thinking that if I merge the prometheus pull request, maybe you can install grafana and make us a dashboard (just OS stuff is fine) and export the grafana config and share it with us.
16:28 pmauduit you don't need to merge now, I cloned the repo so I've the branch used for the PR
16:29 pmauduit I'm done for today, I can try to get some time tomorrow, but if I can use vagrant to provision, shall I end up with a functional grafana ?
16:30 pdurbin pmauduit: no, if you're on that prometheus branch you should get prometheus. You'll need to install Grafana yourself within your Vagrant VM.
16:30 pmauduit ok
16:30 pdurbin And I have no idea if it's easy to install Grafana on CentOS 7.
16:31 pmauduit it seems packaged: https://grafana.com/docs/installation/rpm/
16:31 pdurbin nice
16:32 pmauduit there also is a yum repo, so I'm pretty sure it should not be a big deal to have it somewhere in the playbook
16:32 pdurbin I doubt it. We already install other stuff from non base yum repos.
16:32 pdurbin such as EPEL
16:55 jri joined #dataverse
17:54 donsizemore @pmauduit (just back from thai food) — dataverse-ansible uses the EPEL and Postgres repos
17:55 pdurbin oh right, postgres
17:55 donsizemore all my hand-wringing over using LTS postgres-9.2 dashed!
17:55 pdurbin heh
17:55 pdurbin sorry man
17:56 pdurbin that's a bummer man
17:56 donsizemore (i'm being silly)
17:56 donsizemore @pdurbin gustavo said today at 3, if that still works for you?
17:57 pdurbin yep!
18:04 donsizemore @andrewSC is now when i tell you that i commuted from SE Cary to UNC for more than a decade?
18:11 andrewSC donsizemore: reallyyy
18:16 andrewSC just read over that job posting, I'm gonna go out on a limb and say it can't be done full-time remotely?
18:17 andrewSC part of the reason I'm leaving NCSU is because i've moved to Charlotte lol
18:17 donsizemore poop. we're really flexible about that, but University policy requires one physical workday a week
18:17 andrewSC mmmmmmmmmmmm
18:17 andrewSC :/
18:17 donsizemore thought that counts
18:17 andrewSC mhmm
18:18 andrewSC I'm actually contracting with a team at RTI right now fully remote
18:18 andrewSC been enjoying that quite a bit even though it's Azure lol
18:23 donsizemore you're going to look forward to the squirrels climbing up on the window sill
18:28 andrewSC hahah
18:29 andrewSC my neighbor has them real bad... but she friggin feeds them corn to deter them from the bird feed lmao
18:29 andrewSC doesn't really work like that
18:30 donsizemore a former boss of mine put out salt licks for the deer, which enticed them to wreak havoc on his wife's flowers
18:30 donsizemore @pdurbin https://jenkins.dataverse.org/job/IQSS-Dataverse-Develop-testSubset/3/console
18:31 donsizemore @pdurbin that's with only mvn test -Dtest=DataversesIT
18:33 jri joined #dataverse
18:40 donsizemore @pdurbin oh, speaking of your dataverse map, how does a separate instance run by a member org rate? https://dataverse.theacss.org/ in particular
18:50 pdurbin donsizemore: they should go on the map, for sure
18:50 pdurbin so BUILD FAILURE even with one IT test? :(
18:50 pdurbin T-T
18:51 pdurbin I wonder what the smallest IT test is.
18:51 pdurbin I guess we could do a `wc` on them.
19:01 donsizemore 23 SiteMapIT.java
19:01 donsizemore vs     2145 UtilIT.java
19:01 donsizemore it makes a rather nice poisson distribution, actually
19:02 pdurbin UtilIT is all helper functions, I don't think it has any real tests in it
19:07 donsizemore i'm trying with SiteMap (UtilIT was the longest, by line)
19:07 pdurbin cool
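Editor's sketch: the `wc` idea above (finding the smallest integration test by line count) can also be done in Python. This is a minimal version that counts newline characters, as `wc -l` does, for every file matching `*IT.java` under a directory. The function name and the directory passed in are assumptions, since the log does not say where the tests live.

```python
from pathlib import Path


def count_lines(root, pattern="*IT.java"):
    """Return (line_count, file_name) pairs sorted from smallest to largest."""
    counts = []
    for path in Path(root).rglob(pattern):
        # Count newline bytes so the numbers match what `wc -l` reports.
        newline_count = path.read_bytes().count(b"\n")
        counts.append((newline_count, path.name))
    return sorted(counts)


if __name__ == "__main__":
    # "." is a placeholder; point this at the directory holding the IT classes.
    for lines, name in count_lines("."):
        print(f"{lines:6d} {name}")
```

As pdurbin notes, a helper class such as UtilIT.java would still show up in the list even though it contains no tests of its own, so the smallest real test has to be picked by eye.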
19:13 donsizemore https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ws/conf/docker-aio/server.log
19:16 donsizemore Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
19:17 pdurbin from phoenix: model name: Intel(R) Xeon(R) CPU E5-4620 v2 @ 2.60GHz
19:19 pdurbin https://github.com/IQSS/dataverse/tree/develop/scripts/deploy/phoenix.dataverse.org
19:23 donsizemore ./c7.dockerfile:ARG DoiProvider=FAKE
19:25 pdurbin Unable to publish dataset: edu.harvard.iq.dataverse.engine.command.exception.CommandException: This dataset may not be published due to an error when contacting the <a href=http://status.datacite.org target="_blank"/> DataCite </a> Service
19:33 pdurbin InReviewWorkflowIT
19:34 pdurbin I'd be curious if InReviewWorkflowIT by itself passes, since it tries to publish a dataset.
19:52 pdurbin donsizemore stefankasberger: if I just wrote a little script that uses pyDataverse, where should I put it? As an example script in https://github.com/AUSSDA/pyDataverse ? In https://github.com/OdumInstitute/dataverse-toolbox ?
20:01 Kamil75 joined #dataverse
20:02 Kamil75 Hello dataverse community
20:04 Kamil75 Could you tell me what range of WCAG 2.0 guidelines are covered by Dataverse web system?
20:05 Kamil75 https://en.wikipedia.org/wiki/Web_Content_Accessibility_Guidelines
20:21 pdurbin Kamil75: hi! Please take a look at https://github.com/IQSS/dataverse/issues/6072
20:38 jri joined #dataverse
21:46 jri joined #dataverse
