
IRC log for #dataverse, 2019-03-18

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
08:03 poikilotherm joined #dataverse
10:14 juancorr joined #dataverse
10:31 pdurbin joined #dataverse
10:32 pdurbin good morning, all
10:55 poikilotherm Good morning :-)
10:57 * pdurbin is having pancakes instead of waffles
10:57 poikilotherm LOL
11:00 pdurbin welcome back, poikilotherm
11:00 poikilotherm :-)
11:00 poikilotherm Thx
11:03 pdurbin lots in flight: waffle, hackathon, EKS, postgres upgrades
11:04 poikilotherm Yeah :-D
11:04 pdurbin What's on your mind?
11:05 poikilotherm Trying to sort things first :-D
11:05 poikilotherm Lots of channels flooding me with input
11:06 pdurbin sure, I can imagine :)
11:06 poikilotherm Reading and commenting on GH issues now...
11:07 pdurbin Oh good, you noticed the new unit test team. :)
11:08 pdurbin They introduced themselves here: https://github.com/IQSS/dataverse/issues/5619
11:09 poikilotherm Yeah
11:09 poikilotherm And they are still learning Junit4 at university :'-(
11:10 poikilotherm Sometimes it is really sad that good things take much time
11:12 pdurbin yeah
11:17 pdurbin Is the version of postgres in dataverse-kubernetes new enough for flyway?
11:18 poikilotherm Sure
11:18 poikilotherm 9.6
11:19 pdurbin oh good
11:20 pdurbin Where is that defined?
11:22 poikilotherm https://github.com/IQSS/dataverse-kubernetes/blob/655b92d60b1a09e109154c9c237525827dbf4dff/k8s/postgresql.yaml#L21
11:23 pdurbin ah, thanks
11:29 poikilotherm About your mail/issue on loading times
11:29 poikilotherm https://filippobuletto.github.io/solid-java/?utm_content=85227230&utm_medium=social&utm_source=twitter&hss_channel=tw-2599580401#
11:30 pdurbin "Create a set of well designed and written classes so you can speed up the coding process!"
11:32 poikilotherm :-D
11:32 poikilotherm Yeah, that is obviously already done
11:32 poikilotherm I really liked the ideas and they might be inspirational
11:33 pdurbin can't hurt
11:35 pdurbin We talked about the slow feedback loop during sprint planning last week. We didn't estimate that issue though. We decided to talk about it during tech hours and try to come up with a list of questions we have.
11:36 pdurbin One of my questions is, "How fast can the feedback loop be when deploying a war file to Glassfish? If the war file is tiny, for example."
11:37 poikilotherm I don't think that is what really hurts us
11:37 poikilotherm A solid profiling would be IMHO the place to start
11:39 poikilotherm Oh and did you see https://info.payara.fish/hubfs/Payara-Server-4-to-Payara-Server-5-Upgrade-Guide.pdf ?
11:40 pdurbin You don't think the size of the war file matters?
11:40 poikilotherm Most certainly it does!
11:41 poikilotherm But it seems to have a minor role in it
11:41 pdurbin What leads you to that conclusion?
11:42 poikilotherm https://github.com/IQSS/dataverse/issues/5593#issuecomment-468853270 Nr 3 and 4. Plus what @pameyer says below.
11:42 poikilotherm Oh and No. 2
11:43 poikilotherm Actually most of my comment :D
11:47 pdurbin When you tried to strip down the size of the war, how small did you make it? Roughly.
11:47 poikilotherm Hmm didn't we chat about that?
11:47 poikilotherm I think it was in the order of 10-20MB
11:48 poikilotherm https://www.ej-technologies.com/buy/jprofiler/openSource/enter
11:48 poikilotherm Maybe this could be worth a try?
11:48 pdurbin And where did you put all the dependencies? In some "lib" directory for Glassfish?
11:49 poikilotherm Gimme a sec, I'll see if I can give you a commit
11:50 pdurbin (I don't believe I've seen this Payara migration guide, by the way.)
11:51 poikilotherm Ok, here we go
11:51 poikilotherm In https://github.com/poikilotherm/dataverse/blob/3c0448b073d624448f57293a0885498fe9079e0a/pom.xml#L727 you see me stripping out the stuff
11:51 poikilotherm BEWARE
11:51 poikilotherm THIS DOES NOT WORK
11:51 poikilotherm As stated in my comment on issue 5593, there are libs that need to be present in the WAR
11:52 poikilotherm So a simple "just strip it all out" will NOT work
11:52 poikilotherm Over at https://github.com/poikilotherm/dataverse/blob/3c0448b073d624448f57293a0885498fe9079e0a/pom.xml#L764
11:52 poikilotherm You see me copying the deps jars to the lib dir
11:53 poikilotherm Those are copied from the working dir to the right place during the docker build https://github.com/poikilotherm/dataverse/blob/3c0448b073d624448f57293a0885498fe9079e0a/conf/docker/app/Dockerfile#L8
11:54 poikilotherm Which is [...]/glassfish/domains/domain1/lib/
11:54 poikilotherm (more or less, see https://github.com/poikilotherm/dataverse/blob/3c0448b073d624448f57293a0885498fe9079e0a/conf/docker/app/Dockerfile#L6)
12:00 pdurbin cool, that you're automating this with maven, excluding jars, copying them to "lib"
12:05 poikilotherm That was easy... But as I said: that won't work, because things like Omnifaces etc need to live inside the WEB-INF/lib folder of the WAR :-(
12:06 pdurbin :(
12:07 poikilotherm What do you think about my profiling suggestion?
12:08 pdurbin I've never used jprofiler. I use the one built into netbeans (rarely).
12:08 pdurbin visualvm, I think
12:10 pdurbin https://visualvm.github.io
12:10 poikilotherm Yeah. From what I read, VisualVM seemed to be pretty limited for use with server applications
12:18 pdurbin Oh. Bummer.
12:20 pdurbin poikilotherm: oh, I still need to reply to Giacomo. I hope you don't mind that I added you to the "people" spreadsheet linked from this "2020 Dataverse Hackathon in Europe README" doc: https://docs.google.com/document/d/1pbr0zsqMsl6NIAJCULUTTchMOsch2CAJX14_qQ2zFPE/edit?usp=sharing
12:20 poikilotherm That's fine :-)
12:21 poikilotherm Thx for setting things up
12:21 pdurbin Sure. What would be a sane way for the organizers to communicate?
12:24 poikilotherm Maybe a Github issue?
12:24 poikilotherm I liked the idea...
12:25 pdurbin might be a long issue by the end
12:42 poikilotherm Hmm. Maybe create a separate project?
12:42 poikilotherm Like github.com/IQSS/dataverse-hackathon-2020
12:42 poikilotherm Then it would be easy to create multiple issues and keep things organized
12:44 poikilotherm By the way... It would be really fancy to let Dataverse communicate its internal measurements into some time series database
12:44 poikilotherm https://blog.payara.fish/consuming-microprofile-metrics-with-prometheus?utm_content=85423927&utm_medium=social&utm_source=twitter&hss_channel=tw-2599580401
12:44 poikilotherm E.g. for showing people the growth over time etc
12:45 poikilotherm Plus the monitoring caps for daily sysadmin usage
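As a rough illustration of the idea above (exposing internal measurements so a time series database can scrape them), here is a minimal plain-Java sketch that renders gauges in the Prometheus text exposition format. It does not use the MicroProfile Metrics API from the linked blog post, and the class name and metric names are hypothetical, not anything Dataverse ships.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: NOT Dataverse code and NOT the MicroProfile Metrics API.
class MetricsExpositionSketch {

    // Renders each gauge as a "# TYPE" line followed by a "name value" sample line,
    // which is the basic shape of the Prometheus text exposition format.
    static String render(Map<String, Long> gauges) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> e : gauges.entrySet()) {
            sb.append("# TYPE ").append(e.getKey()).append(" gauge\n");
            sb.append(e.getKey()).append(' ').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Metric names and values are invented for illustration only.
        Map<String, Long> gauges = new LinkedHashMap<>();
        gauges.put("dataverse_datasets", 1200L);
        gauges.put("dataverse_files", 54000L);
        System.out.print(render(gauges));
    }
}
```

A scraper polling such output at intervals would record the growth over time mentioned above; with a MicroProfile-capable app server, the server would presumably produce this output itself.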
12:58 poikilotherm Hey pdurbin, I have some people from Stuttgart contacting me about S3 storage for Dataverse
12:58 poikilotherm Did they try to reach you guys, too?
13:07 donsizemore joined #dataverse
13:11 pdurbin_m joined #dataverse
13:12 pdurbin_m poikilotherm: Stuttgart? I don't think so.
13:13 pdurbin_m Yes to Prometheus. Can we make Dataverse into a MicroProfile app?
13:13 poikilotherm I just responded to them and told 'em to come here and talk with us... :-D
13:13 poikilotherm And pointed them to https://github.com/IQSS/dataverse/issues/4439
13:14 pdurbin_m Yes, eventually a separate repo for the hackathon makes sense. I've made repos for JavaOne talks under IQSS.
13:15 pdurbin_m Cool. Thanks. Happy to talk whenever.
13:15 poikilotherm About prometheus: this is not restricted to Prometheus. AFAIK all that is needed to use MicroProfile APIs is an app server supporting it
13:15 poikilotherm Either in Full or Micro profile
13:15 poikilotherm Maybe switching to sth. like MicroProfile might be good for loading times, too. But this needs investigation
13:16 poikilotherm Alright gotta go. My concrete base plate is getting done today...
13:16 poikilotherm Cu tmrw
14:17 pameyer joined #dataverse
14:26 pameyer re: profilers - according to my notes, hprof worked but wasn't particularly user friendly
14:28 pdurbin in my notes I have this: jmap -dump:format=b,file=mydump.hprof $glassfish_pid
14:32 pameyer it's _probably_ possible to hook up a profiler to the same gf debug port that the debugger uses; but I haven't dug into that
14:33 pdurbin netbeans doesn't seem to differentiate much between the profiler and the debugger
14:34 pdurbin I think for both you have to start glassfish in a special mode.
14:34 pdurbin maybe they're different modes. it's been a while
14:38 pameyer `bin/asadmin start-domain --debug`
14:39 pameyer port 9009 by default
14:39 pdurbin sounds right
14:48 poikilotherm joined #dataverse
14:49 poikilotherm Hi again :-)
14:49 pdurbin welcome back
14:49 poikilotherm :-)
14:49 * poikilotherm brushes off some concrete dust and gets back to work
14:50 pdurbin anything I should bring up in standup in half an hour?
14:56 poikilotherm You could add a note that we are actually working on our pre-prod system
14:56 poikilotherm We have it deployed, but need to do configuration and customization
14:57 pdurbin Good. Doesn't sound like you're blocked.
14:58 poikilotherm Currently not :-)
14:58 poikilotherm Dev work happens in parallel
15:00 pdurbin nice
15:11 donsizemore joined #dataverse
15:16 pameyer @donsizemore thanks for the investigation on 5659.  turned out to be my fault
15:17 donsizemore @pameyer i was coming here to ask about 30-second timeouts. glad you found the problem!
15:18 pameyer @donsizemore - 30-second timeouts somewhere else?
15:19 donsizemore @pameyer no, i was thinking 2x30 second (say DNS) timeouts would describe the lag you were seeing
15:20 pameyer @donsizemore good point; but fortunately not this time
15:46 drew-jhu joined #dataverse
15:49 pdurbin drew-jhu: did you hear us talking about you a couple minutes ago? :)
15:49 pdurbin and good morning :)
15:51 drew-jhu uh, no? & hi!
15:51 drew-jhu good things, i hope
15:54 pdurbin all good things :)
15:55 pdurbin I talked about JHU's software metadata blocks at Open Science Days the other week. Slides at http://osd.mpdl.mpg.de/?page_id=1250
15:56 drew-jhu well, that's a relief. ok. cool!
15:57 pdurbin also, Julian might get in touch with you about those blocks
15:57 pdurbin I just forwarded your email to him.
15:58 drew-jhu ok. feel free to pass my contact info along
15:58 drew-jhu i actually stopped by to ask about Flyway
15:59 drew-jhu it looks like things are coming along nicely with that integration
16:01 drew-jhu i was wondering when it might be ready for testing & when it might be officially released (if you're in the mood for some prognosticating)
16:01 pdurbin yes, the flyway pr has been merged
16:02 pdurbin it actually broke the phoenix server since it's still on postgres 8.4
16:02 pdurbin https://github.com/IQSS/dataverse/issues/5649
16:02 pdurbin https://groups.google.com/d/msg/dataverse-dev/CTRpKg0xP2o/SKWHedmvBQAJ
16:03 drew-jhu fortunately, i'm on dataverse 4.9.4 and postgres 9.3
16:03 pdurbin phew
16:03 pameyer I've got a few that are still on postgres 9.3 that I'm pretty sure I'll need to update
16:03 drew-jhu to 9.6?
16:03 pameyer I'd thought that flyway wanted 9.6
16:04 pameyer I might be wrong though - I've been focusing on other stuff, so it may not be a required update
16:04 pdurbin "supported versions": https://flywaydb.org/documentation/database/postgresql
16:04 pameyer updates you don't have to do (yet) are the easiest updates :)
16:05 pameyer pdurbin: thanks
16:05 drew-jhu i've got automation to upgrade postgres from 9.3 to 9.6, so that should be no problem (knock wood)
16:06 pdurbin drew-jhu: you're welcome to try out "develop" now if you want to play with flyway or read what I wrote.
16:06 drew-jhu are the flyway bits in place to move up from 4.9.4 to develop?
16:06 pdurbin ah, no. there's an alternative experimental path mentioned in the release notes
16:07 drew-jhu oh. ok. i hadn't found that yet. help me out with a link?
16:09 drew-jhu ah, you mean "a note on upgrading from earlier versions" here: https://github.com/IQSS/dataverse/releases
16:09 pdurbin yeah, "We now offer an EXPERIMENTAL database upgrade method allowing users to skip over a number of releases"
16:11 drew-jhu care to hazard a guess on how soon this might move from "experimental" to "endorsed for production"?
16:11 pdurbin no idea
16:11 pdurbin sorry
16:12 pdurbin I think you should try it. On a staging server.
16:12 drew-jhu ok. i was sure you were going to say something like, "the sooner people like YOU try it out & give us feedback & PRs, the sooner it'll make it to release"
16:13 pdurbin I mean, it went through QA. It was tested. But I'm not sure if it has been used on production installations yet. You could ask on the mailing list.
16:15 drew-jhu ok. that's good info. my time is being tightly managed at the moment with some competing priorities, but if i'm able, i'll give it a try
16:16 pdurbin Cool. You can also just do the old "deploy each war file" approach.
16:18 drew-jhu y. that's the lower-investment-for-short-term-benefit path. just wanted to take a peek at what the longer-term-investment path looked like before making a recommendation
16:18 drew-jhu this gives me the info we need to figure out our next move. thanks!
16:19 pdurbin sure
16:29 julian-gautier joined #dataverse
16:29 isullivan joined #dataverse
16:34 isullivan joined #dataverse
16:41 pameyer pdurbin: no jenkins until 5649 is sorted, right?
16:41 pameyer [ERROR] Failures:
16:41 pameyer [ERROR]   MakeDataCountApiIT.testMakeDataCountGetMetric:60 Expected status code <200> doesn't match actual status code <400>.
16:41 pameyer [INFO]
16:41 pameyer [ERROR] Tests run: 85, Failures: 1, Errors: 0, Skipped: 2
16:41 pameyer will need to double-check to make sure it's not test setup fiddliness though
16:42 pameyer develop-943931363
16:48 pdurbin no phoenix anyway
16:49 pameyer ah, right
16:49 pdurbin this is what I wrote in the issue: "We need to either upgrade PostgreSQL on phoenix to 9.6 (used in production at Harvard Dataverse) or install a fresh server (perhaps with CentOS 7 instead of CentOS 6) to take the place of dvnweb-vm6.hmdc.harvard.edu."
16:51 pameyer well, if mdc's really busted; that might help push getting that sorted before another release
16:51 pdurbin my guess is that we will simply add an @Ignore to that test
16:52 pdurbin huh, actually...
16:52 pdurbin that one should work, I would think
16:53 pdurbin pameyer: can you please create an issue?
16:53 pameyer possibilities of docker-aio being fiddly; or missing some setup that mdc needs haven't yet been excluded
16:53 pameyer pdurbin: will do as soon as I get 2x replication of the failure
16:53 pdurbin awesome, sounds more scientific. thanks!
16:54 pameyer last time I didn't replicate it, I reported failing ITs for a community dev PR that was failing because docker-aio was fiddly :(
16:54 pdurbin It's still fiddly, right? I didn't "just work" last time I tried it.
16:54 pdurbin It* didn't
16:55 pameyer prep_it still is.  but `docker run` -> `docker exec` -> `setupIT` is still mostly working
16:55 pameyer not 100% sure what the glitch on that community dev PR was; but leaning towards not using the FAKE provider in setupIT
16:56 pdurbin ok, I was going to bring it up last week during estimation wednesdays but you weren't there so I brought up something else
16:56 pdurbin this is what I brought up instead: https://github.com/IQSS/dataverse/issues/5583
16:57 pameyer thanks - sorry I missed last week
16:58 pdurbin no worries
17:00 pameyer I'll ping dbrooke about 5662
17:01 pdurbin thanks
17:07 julian-gautier left #dataverse
17:12 donsizemore joined #dataverse
17:13 donsizemore @pdurbin pester question?
17:13 pdurbin hit me
17:16 donsizemore when you say "Users of the Institutional Log In option are not required to verify their email address because the institution providing the email address is trusted." but there isn't a boolean in the database... institutional users are automatically confirmed by dataverse?
17:17 donsizemore i would like to tell this to thu-mai and mandy, because shibboleth institution limitation would be nice to affirm to them
17:17 pameyer there's a timestamp in the database, which is null if it's not confirmed
17:17 pameyer for a shib user, it always gets counted as confirmed
17:17 pdurbin pameyer has is right
17:18 pdurbin it*
17:18 donsizemore ah. cool. well all of ours are null =)
17:18 pdurbin donsizemore: even for shib users?
17:18 * pameyer was lucky
17:18 donsizemore yes, this is why i asked
17:18 pdurbin huh
17:18 donsizemore but if they're assumed confirmed (and we have the functionality) i can run with that
17:19 pameyer I could see it being somehow migration related; but I'm not sure how - and you probably have recent-ish shib users
17:20 pameyer donsizemore: you probably checked authenticateduser.emailconfirmed , right?
17:20 pdurbin if the timestamp is null they aren't confirmed... sounds like you found a bug
17:20 donsizemore i do see 251 users who have confirmed and at least one of them is shib (unc.edu). but what about us extant old farts?
17:20 donsizemore i'll open an issue if you'd like?
17:20 pdurbin please
17:20 pameyer it sounds worth an issue to me.  from what I was seeing, it was looking like that gets set on shib account creation
17:21 donsizemore select count(id) from authenticateduser where email like '%unc.edu%' and emailconfirmed is null;
17:21 donsizemore  count
17:21 donsizemore -------
17:21 donsizemore    247
17:21 pameyer :(
17:21 pdurbin pameyer: what are you seeing? non null time stamp for your shib users?
17:21 pameyer pdurbin: yeah; things like `pameyer+shibtest | 2019-03-18 11:06:15.207`
17:21 donsizemore that's what i was wondering. this is a feature thu-mai and mandy really want, so if i can set a dummy timestamp for at least unc.edu users that would be pretty sweet
17:22 pdurbin ok, so no bug on pameyer 's server. weird
17:23 pdurbin donsizemore: you could pile on to Amber's comment at https://github.com/IQSS/dataverse/issues/3300#issuecomment-462402097
17:23 donsizemore should my question go to support@ since it's more of a backfill recommendation?
17:23 pameyer pdurbin: and I see null for built-in users who haven't done email confirmation
17:23 pdurbin donsizemore: you should make a pull request to implement the feature. That really gets out attention. ;)
17:23 pdurbin our*
17:24 pameyer I'm wondering if the display logic says "shib, don't need to let the user verify their email"
17:24 pameyer but whatever checks if it's been verified may just check null/not null
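The mismatch pameyer is guessing at can be sketched as two checks that disagree for the same user. This is a simplified, hypothetical model: the class, the method names, and the "shib" provider id are assumptions for illustration, not the actual Dataverse implementation.

```java
import java.time.Instant;

// Hypothetical, simplified model; not the real Dataverse classes.
class EmailConfirmationSketch {

    static final String SHIB_PROVIDER_ID = "shib"; // assumed provider id

    static class User {
        final String providerId;
        final Instant emailConfirmed; // null means "never confirmed"

        User(String providerId, Instant emailConfirmed) {
            this.providerId = providerId;
            this.emailConfirmed = emailConfirmed;
        }
    }

    // Display-side rule: Shibboleth users are never asked to verify their email.
    static boolean shouldPromptForVerification(User u) {
        return !SHIB_PROVIDER_ID.equals(u.providerId) && u.emailConfirmed == null;
    }

    // Storage-side rule: only a non-null timestamp counts as confirmed.
    static boolean isConfirmedInDatabase(User u) {
        return u.emailConfirmed != null;
    }

    public static void main(String[] args) {
        // A Shibboleth user whose timestamp was never populated.
        User legacyShib = new User(SHIB_PROVIDER_ID, null);
        System.out.println(shouldPromptForVerification(legacyShib)); // false
        System.out.println(isConfirmedInDatabase(legacyShib));       // false
    }
}
```

Under these assumptions such a user is never prompted to verify, yet also never counts as confirmed wherever only the timestamp is inspected.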
17:24 donsizemore @pameyer that's what i was getting at
17:25 pameyer I don't actually know the full behavior of email confirmed / email not confirmed.  so I'm not sure what functionality changes
17:26 pameyer I know I get email for test users without email confirmation; but those are usually short-lived test users.  I could see allowing email for a time window before verification, then blocking it
17:26 pameyer @donsizemore do you know if there are users with a null that are still getting emails from dataverse?
17:27 donsizemore @pameyer there's still that bug which prevents shibboleth users from receiving a link in their confirmation email
17:27 donsizemore @pameyer thumai, mandy, me, we're all non-confirmed and functioning just fine. but it sounds like we're assumed to be confirmed and we're not. i can tack onto amber's issue
17:28 pdurbin I think that would be best. Amber has a good use case.
17:28 pameyer @donsizemore I'd missed the bug about the non-functional link
17:30 donsizemore @pameyer https://github.com/IQSS/dataverse/issues/3407
17:31 donsizemore (and this is still the case as of 4.9.4)
17:31 pameyer @donsizemore thanks
17:31 pdurbin yeah, that was me closing issues I had opened that hadn't gotten any attention in years
17:32 pameyer ... and a relatively quick grep at the code makes me wonder if there is currently any change in behavior if email is confirmed or not
17:32 pdurbin I'm trying to let other people open issues these days. Unless I feel like we'll actually fix it soonish.
17:35 pameyer @donsizemore from the description of 3300, it looks like I was inferring slightly incorrectly.  that timestamp should be set on any successful login, not just account creation
17:36 donsizemore @pameyer i no get set
17:37 pdurbin if (ShibAuthenticationProvider.PROVIDER_ID
17:37 pdurbin authenticatedUser.setEmailConfirmed(emailConfirmedNow);
17:38 pdurbin https://github.com/IQSS/dataverse/blob/v4.11/src/main/java/edu/harvard/iq/dataverse/authorization/AuthenticationServiceBean.java#L564
17:38 pdurbin the logic hasn't been removed...
17:38 pdurbin not sure what's going on for donsizemore
17:41 donsizemore @pdurbin i just signed in to my 4.11 test server with my shib account and my emailconfirmed field is still null =(
17:42 pdurbin :(
17:51 pameyer .... me trusting comments rather than code :(
17:55 pameyer oddness with PROVIDER_ID somehow?
17:56 pdurbin ¯\_(ツ)_/¯
17:59 pameyer pdurbin: that logic looks like create logic to me
18:00 pdurbin ah, true
18:03 pameyer @donsizemore I'm apparently spinning in circles (my thinking my initial impression was inaccurate was inaccurate).
18:05 pameyer @donsizemore do you have a ballpark estimate of if those users predate dataverse 4.5?
18:05 pdurbin looks like the confirmemail timestamp is only set on create
18:05 pdurbin good catch, pameyer
18:06 pameyer looking to me like that field wasn't being populated before 4.5.1 , and that there wasn't a default / backfill for shib users then
18:07 pameyer at least, from upgrade_v4.5_to_v4.5.1.sql
18:16 donsizemore @pameyer we all predate 4.5
18:20 pameyer @donsizemore that's consistent
18:27 pameyer @donsizemore from the phrasing in 5663, it's not clear if you're looking to have old shib users listed as verified; or to have shib users need a verify step (aka - don't assume shib users are verified)
18:27 pameyer I'm guessing it's the first one; but guesses can be inaccurate
18:27 pameyer ... especially on monday
18:35 donsizemore @pameyer the end game is to verify shib users who slipped through the cracks. i'll defer to you all on how to best accomplish that, but it seems like an automated verification triggered on login might be preferable?
18:37 pameyer @donsizemore by "automatic verification", you mean the same timestamp that new shib users get - right?
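The "automated verification triggered on login" idea could look roughly like the sketch below: on each successful Shibboleth login, fill in the timestamp only if it is still null. This is a hypothetical illustration with invented names (including the "shib" provider id), not the logic in AuthenticationServiceBean, which per the discussion above only sets the value on account creation.

```java
import java.time.Instant;

// Hypothetical sketch of a login-time backfill; not the actual Dataverse code.
class ShibLoginBackfillSketch {

    static final String SHIB_PROVIDER_ID = "shib"; // assumed provider id

    static class User {
        final String providerId;
        Instant emailConfirmed; // null means "never confirmed"

        User(String providerId, Instant emailConfirmed) {
            this.providerId = providerId;
            this.emailConfirmed = emailConfirmed;
        }
    }

    // Called after a successful login. Returns true if a timestamp was backfilled.
    static boolean confirmOnLogin(User user, Instant now) {
        if (SHIB_PROVIDER_ID.equals(user.providerId) && user.emailConfirmed == null) {
            user.emailConfirmed = now; // same kind of value a new shib user gets at creation
            return true;
        }
        return false; // non-shib users and already-confirmed users are left untouched
    }

    public static void main(String[] args) {
        User legacyShib = new User(SHIB_PROVIDER_ID, null);
        System.out.println(confirmOnLogin(legacyShib, Instant.now())); // true
        System.out.println(confirmOnLogin(legacyShib, Instant.now())); // false
    }
}
```

Leaving an existing timestamp untouched keeps the original confirmation time for users created after the field started being populated.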
18:39 pdurbin Do we want Shib users to be treated differently than other users? Should we simply make all users verify their email? I'm thinking that might be best.
18:43 pameyer that seems like a bigger question.  if shib's trusted to auth users, and that they've got their emails correct; that's less the end users need to do.
18:44 pameyer but it depends on how verified email gets used.  which I still don't know, but my best current guess would be a verified icon vs unverified icon (excluding future plans)
19:08 pdurbin In this "2015-04-30 Dataverse User Accounts and Auth Meeting" doc you can see how Merce and I were thinking that we should treat users equally: https://docs.google.com/document/d/157sw9gaFGwrb0EtCGGlLMxSg0cooHfKEvIsjcebPcA8/edit?usp=sharing
19:09 donsizemore joined #dataverse
19:19 pdurbin donsizemore: ^^
21:23 donsizemore joined #dataverse

