IRC log for #dataverse, 2019-11-13

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
08:02 jri joined #dataverse
08:46 juancorr joined #dataverse
09:26 MrK joined #dataverse
09:45 sgerle1 joined #dataverse
10:15 nils`` joined #dataverse
10:56 poikilotherm joined #dataverse
11:35 poikilotherm Good morning everyone
11:42 pdurbin mornin'
12:40 juancorr joined #dataverse
12:48 MrK hi
12:51 poikilotherm Hey MrK :-)
13:04 donsizemore joined #dataverse
13:12 pdurbin Anybody blocked? Anybody need anything?
13:18 poikilotherm pdurbin: feel like doing a code review?
13:19 pdurbin sure!
13:19 poikilotherm Abstracted OAuth2LoginBackingBean and added loads of unit tests
13:20 poikilotherm Need to finish the last one for the existing user branch...
13:20 poikilotherm PR forthcoming...
13:20 poikilotherm ~15 minutes
13:20 pdurbin Cool. Is there an issue?
13:20 poikilotherm Well there is #5991 and #5974
13:20 poikilotherm But I might create a tracking issue
13:20 poikilotherm So we depict the smaller steps
13:21 pdurbin yeah, a new small issue would be great
13:24 pdurbin sounds like code coverage will go up :)
13:28 poikilotherm Oh yeah
13:28 poikilotherm I bet it will be ~0.2%
13:29 pdurbin nice
13:29 pdurbin any features or bug fixes in your branch?
13:29 poikilotherm Not yet.
13:29 poikilotherm This is just preparing the introduction of OIDC
13:30 poikilotherm As I want to reuse the existing infrastructure
13:30 pdurbin Sure. How much work to add OIDC?
13:31 poikilotherm Not much as far as I can see.
13:31 poikilotherm Abstracting was the hard part
13:31 poikilotherm I needed to get rid of the direct dependency on ScribeJava in the controller
13:32 poikilotherm But while being in there, I wanted to increase coverage ;-)
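
A minimal sketch of the kind of abstraction described above, where the controller depends on an interface rather than on an OAuth library directly. The names `IdentityProvider`, `LoginController`, and `fetchUserId` are hypothetical and are not taken from the Dataverse code base or from ScribeJava; the point is only that a controller written this way can be unit tested with a simple stand-in.

```java
import java.util.Optional;

// Hypothetical abstraction: the controller only sees this interface,
// never the OAuth library that a real implementation would wrap.
interface IdentityProvider {
    /** Exchanges an authorization code for a user id, or empty on failure. */
    Optional<String> fetchUserId(String authorizationCode);
}

// Hypothetical controller; all names are illustrative, not Dataverse's.
class LoginController {
    private final IdentityProvider provider;

    LoginController(IdentityProvider provider) {
        this.provider = provider;
    }

    /** Returns a greeting for a known user, or "error" if the lookup fails. */
    String handleCallback(String code) {
        return provider.fetchUserId(code)
                .map(id -> "welcome:" + id)
                .orElse("error");
    }
}

class LoginControllerDemo {
    public static void main(String[] args) {
        // A lambda acts as the test double; no network or OAuth library is needed.
        IdentityProvider fake = code -> {
            if ("good".equals(code)) {
                return Optional.of("alice");
            }
            return Optional.empty();
        };
        LoginController controller = new LoginController(fake);
        System.out.println(controller.handleCallback("good"));
        System.out.println(controller.handleCallback("bad"));
    }
}
```

Under this assumption, a second implementation of the interface (for example one backed by OIDC) could be plugged in without touching the controller.
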
13:32 pdurbin Great. Sounds like you should go ahead and add OIDC before we send it to kcondon. But I'm happy to look at just the refactoring.
13:32 poikilotherm Well you prefer smaller parts, don't you?
13:33 poikilotherm And you get increased coverage :-D
13:34 donsizemore @pdurbin I need OCR recommendations
13:35 poikilotherm python + tesseract
13:36 poikilotherm https://pypi.org/project/pytesseract/
13:37 poikilotherm Done some things with that, went pretty smooth :-)
13:37 donsizemore @poikilotherm i was looking at textract; my boss is looking at SimpleIndex (commercial)
13:37 pdurbin donsizemore: did you see we're doing a lot of OCR these days? https://groups.google.com/d/msg/dataverse-community/RbpmnLm8nv4/HWU1ltVNCAAJ
13:38 donsizemore we've got 100,000+ court documents with hand-written annotations, and want it all
13:38 pdurbin poikilotherm: yes, please make the code review easy
13:38 poikilotherm Oh, hand-written. I dunno if tesseract is good at these...
13:39 pdurbin donsizemore: sounds a bit like https://case.law
13:39 donsizemore @poikilotherm i know... and it's all over the place. i want minions
13:39 poikilotherm I remember reading an article about someone using a neural network to digitise handwritten sermons from the 19th century
13:40 poikilotherm He analysed the text afterwards about text style etc etc
13:40 donsizemore one shouldn't scrutinize the prophets
13:42 poikilotherm I can't find the article, but I'm pretty sure it used https://transkribus.eu/Transkribus/
13:45 poikilotherm Got it... https://infoditex.hypotheses.org/192
13:45 poikilotherm I haven't tried this myself, but it looks really powerful
13:49 poikilotherm pdurbin: I created a small issue https://github.com/IQSS/dataverse/issues/6364
13:56 poikilotherm pdurbin: I just requested a review from you on https://github.com/IQSS/dataverse/pull/6365
14:02 pdurbin looks good so far, I think
14:04 poikilotherm How do you like the unit test?
14:04 poikilotherm It's Junit 5 heavily using Mockito
14:05 poikilotherm First time I used Spy and verify()
14:05 pdurbin they look great
14:06 pdurbin from here should you branch and start adding OIDC?
14:06 poikilotherm Aye
14:06 pdurbin perfect
14:07 poikilotherm My refactoring already gave me an idea for a screencast: take a controller not yet unit tested and do coding ;-)
14:08 poikilotherm Kind of a "hacky session" ;-)
14:10 poikilotherm pdurbin: looks like @kcondon doesn't have much in the queue... should we move it to QA?
14:45 pdurbin I dunno. I think we might want to add some value for users first.
14:46 poikilotherm Sure. But PRs will tend to get larger that way.
14:46 poikilotherm Whatever works for you guys
14:47 poikilotherm For me this makes little to no difference
14:47 poikilotherm Just poking with a stick, trying to figure what's a good size and scope
14:47 poikilotherm Please guide me
14:54 pdurbin I'm thinking a new branch based on what you've got but with OIDC support.
14:54 pdurbin Does that make sense?
14:55 pdurbin xarthisius: have you heard of https://researchobject.github.io/ro-crate/ ? I'm about to join a call about it.
14:58 poikilotherm pdurbin: should I close the PR? A new branch will have these commits, too when I open a new PR. Dunno if that makes it tedious/chaotic, but happy to do so.
14:58 pdurbin I'd leave it open. Pull requests are free. Branches are free. :)
14:59 poikilotherm Sure. Whatever works best for you guy
14:59 poikilotherm +s
15:07 pdurbin poikilotherm: how much UI would you need to implement to make it testable?
15:07 poikilotherm For OIDC?
15:07 pdurbin yeah
15:07 poikilotherm Nada as far as I can see.
15:07 poikilotherm I am really trying to reuse all of the current infrastructure
15:07 pdurbin Oh! Great! Would there be an OAuth-style JSON file?
15:08 poikilotherm Yeah, that looks like it's going to be inevitable
15:08 pdurbin ok
15:08 poikilotherm But it will be just an extension of what we already have
15:08 pdurbin sounds fine
15:11 pdurbin donsizemore: my officemate just arrived. I'll ask him about OCR after I get off this call.
15:11 donsizemore @pdurbin ^^ handwritten OCR! (and thank you)
15:14 poikilotherm pdurbin donsizemore have you guys seen all the failing unit tests in develop?
15:14 poikilotherm I noticed that because they fail in my branch, too...
15:15 donsizemore i blame cold weather.
15:16 poikilotherm Ok, false alert. After mvn clean, everything back to normal.
15:16 poikilotherm Guess I need to take a look...
15:20 pdurbin poikilotherm: it looks like your pull request is red: https://travis-ci.org/IQSS/dataverse/pull_requests
15:20 pdurbin this one: https://travis-ci.org/IQSS/dataverse/builds/611380788
15:21 poikilotherm Yeah.
15:21 poikilotherm T_T
15:21 pdurbin :)
15:21 poikilotherm Lots of NPEs
15:21 poikilotherm Class not found
15:23 pdurbin it happens
15:31 poikilotherm Gnarf. Looks like the locale handling went away
15:54 pdurbin donsizemore: I hadn't done a build on https://build.hmdc.harvard.edu:8443/job/phoenix.dataverse.org-apitest-develop/ for almost two weeks. It's passing. But this started failing yesterday: https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ . I haven't dug in yet.
17:48 donsizemore @pdurbin the first one's an easy fix (including possibly keeping jenkins' group_vars file someplace outside of IQSS/dataverse-develop)
18:58 pdurbin donsizemore: yeah, https://github.com/IQSS/dataverse/pull/6367 is the easy fix
18:58 donsizemore @pdurbin i'm thinking we'll use default group_vars and just sed apitestsuite to true or something
18:59 donsizemore @pdurbin or, i've been keeping all the tests set to 'true' in tests/group_vars/default.yml so just tell the jenkins job to use that one instead
19:00 pdurbin I'm fine with whatever. I probably should have thought twice before merging that pull request without testing it. :) Maybe we can chat more about this at 3.
19:02 donsizemore yis
19:02 pdurbin We'd like to get 4.18 out soon.
19:03 pdurbin all the features and bug fixes we want are in
19:04 pdurbin donsizemore: can you easily tell what the failure is in https://jenkins.dataverse.org/job/IQSS-dataverse-develop/250/consoleFull ?
19:16 donsizemore @pdurbin i made a note to myself to tell the log-copy option to grab mvn.out as well
19:17 pdurbin donsizemore: cool. I was just thinking the same thing. :)
19:17 donsizemore @pdurbin because the jenkins-killer already killed the instance
19:18 pdurbin My pull request was just merged so now we get to watch https://jenkins.dataverse.org/job/IQSS-dataverse-develop/251/ run. :)
19:18 pdurbin fingers crossed :)
19:29 pdurbin donsizemore: huh. "set jdbcurl" in 251 still failed with "The task includes an option with an undefined variable" even though I tried to define it.
19:31 pdurbin If it's easier and you have time I could call you earlier. :)
19:33 donsizemore i'll take a look
19:40 donsizemore @pdurbin that commit may not be what you want because jdbcurl points at the newest PG JDBC driver
19:40 donsizemore @pdurbin if you leave jdbcurl blank it uses the jar from dvinstall.zip
19:41 pdurbin hmm, but it was blank before
19:42 pdurbin want me to give you a ring?
19:42 donsizemore that's fine. i can start a zoom?
19:44 pdurbin sounds perfect
19:46 donsizemore https://unc.zoom.us/j/9192604915
19:52 donsizemore https://github.com/IQSS/dataverse-ansible/blob/master/tests/group_vars/vagrant.yml
19:59 pdurbin https://github.com/orgs/IQSS/projects/2#column-5298410
20:08 pdurbin http://guides.dataverse.org/en/4.17/developers/testing.html#measuring-coverage-of-integration-tests
20:33 poikilotherm joined #dataverse
20:51 pdurbin poikilotherm: you're back. How's it going? :)
20:51 poikilotherm Hi pdurbin
20:51 poikilotherm I'm hacking on the tests
20:52 poikilotherm I will introduce a locale test that makes it easy to check whether the settings are in place. My IDE uses the system default, which leads to interesting errors
20:55 poikilotherm to be honest: I already wrote it...
20:55 poikilotherm Wanted to be sure maven etc are working with correct locale settings
21:05 pdurbin nice
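
A minimal sketch of what such a locale check might look like, using only the standard `java.util.Locale` API. The class name `LocaleGuard` and the expected locale (`Locale.US`) are assumptions for illustration; the log does not say which locale the Dataverse tests expect or how the actual test is written.

```java
import java.util.Locale;

// Hypothetical helper; the expected locale (en_US) is an assumption.
class LocaleGuard {
    static final Locale EXPECTED = Locale.US;

    /** Returns true when the JVM default locale is the one the tests assume. */
    static boolean defaultLocaleMatches() {
        return EXPECTED.equals(Locale.getDefault());
    }

    public static void main(String[] args) {
        if (defaultLocaleMatches()) {
            System.out.println("Locale OK: " + Locale.getDefault());
        } else {
            // An IDE that inherits the system default can end up here.
            System.out.println("Unexpected default locale: " + Locale.getDefault()
                    + " (tests assume " + EXPECTED + ")");
        }
    }
}
```

Running a check like this first would surface a locale mismatch directly, instead of as unrelated-looking failures in tests that depend on message bundles.
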
21:36 poikilotherm Wow. When I remove my unit test class completely, all test cases are running smoothly
22:01 poikilotherm Ok I tracked it down to my mocking of the faces context
22:02 poikilotherm That seems to crash the bundle handling
22:16 pdurbin gotcha, I did notice that new Faces thing
22:17 poikilotherm Simple solution...
22:17 poikilotherm Save the context before mocking, reset after all test cases
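
A minimal sketch of the save-and-restore idea, assuming a framework class that exposes a static "current instance". `CurrentContext` is a hypothetical stand-in for the Faces context, and the two static methods mark where JUnit 5 `@BeforeAll` and `@AfterAll` hooks would go; this is not the code from the pull request.

```java
// Stand-in for a framework class with a static "current instance",
// such as a JSF FacesContext. Hypothetical, for illustration only.
class CurrentContext {
    private static String current = "real";

    static String get() { return current; }
    static void set(String value) { current = value; }
}

class ContextRestoreDemo {
    private static String saved;

    // Would run once before all tests (e.g. a JUnit 5 @BeforeAll method).
    static void setUpAll() {
        saved = CurrentContext.get();   // remember the original instance
        CurrentContext.set("mock");     // install the test double
    }

    // Would run once after all tests (e.g. a JUnit 5 @AfterAll method).
    static void tearDownAll() {
        CurrentContext.set(saved);      // restore it for other test classes
    }

    public static void main(String[] args) {
        setUpAll();
        try {
            System.out.println("during tests: " + CurrentContext.get());
        } finally {
            tearDownAll();
        }
        System.out.println("after tests: " + CurrentContext.get());
    }
}
```

Without the restore step, the mock would leak into whichever test class runs next in the same JVM, which matches the symptom of unrelated tests failing.
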
22:23 poikilotherm Meh. Still one failing test: SiteMapUtilTest.testUpdateSiteMap:91 expected null, but was:<java.io.FileNotFoundException: /tmp/sitemap.xml (No such file or directory)>
22:24 poikilotherm But that one is also failing on develop :-D
22:29 poikilotherm Success!
22:29 poikilotherm https://travis-ci.org/IQSS/dataverse/builds/611602730?utm_medium=notification&utm_source=github_status
22:30 poikilotherm BTW - shouldn't we use Jenkins + Coveralls for coverage instead of Travis?
22:32 pdurbin poikilotherm: well, I was thinking we could continue to let Travis measure code coverage of the unit tests but yes, I was thinking that Jenkins could some day measure code coverage of the API tests (which should be higher).
22:32 poikilotherm Yeah - coverage increase by +0.1%
22:32 poikilotherm I lost my bet...
22:33 pdurbin donsizemore: I just chatted with my officemate about OCR. He had me email his boss but I'd say it's likely that we can get you two talking soon.
22:33 pdurbin poikilotherm: every little bit counts!
22:35 poikilotherm Maybe IQSS can host a competition to earn a trophy for adding coverage
22:36 poikilotherm Stickers, badges, shirts, ...
22:37 poikilotherm I love the shirts by Payara. A really cool shirt for Dataverse might attract people ;-)
22:39 pdurbin You have no idea how many t-shirts I put in bags for fabric recycling as I was packing to move. :)
22:39 poikilotherm :-D
22:39 poikilotherm Did moving go well? Happy arrival?
22:39 pdurbin happy arrival
22:40 poikilotherm Glad to hear that. Congrats!
22:42 pdurbin Thanks! And I'm not trying to say that a Dataverse t-shirt is a bad idea. I just have too many t-shirts. :)
22:42 poikilotherm :-D
22:50 donsizemore @pdurbin i think we're good with develop. i'll let you know once the two running jobs finish and i restart them for good measure
22:57 pdurbin great!
22:57 pdurbin poikilotherm: I'm heading home but do you have a moment to talk about maps in the dataverse-installations repo?
22:59 pdurbin My thought is this: we should figure out some sort of namespace so that we can have lots and lots of maps. I miss your cluster map, for example. Right now we have one map at index.html in the root of the repo. If we had 20 maps, what directory structure should they be organized in?
23:00 poikilotherm Sure
23:00 poikilotherm Go ahead
23:00 poikilotherm Hmm
23:00 * poikilotherm scratches his head
23:01 poikilotherm would it make sense to use a flat structure?
23:01 poikilotherm directory per map?
23:02 poikilotherm There might be things in common ground like Javascript
23:02 pdurbin Sure, a directory per map is fine, I guess. I'm flexible. :)
23:02 poikilotherm Dunno if you already know about the changing policies at Firefox & Chrome about fetching Javascript from CDNs
23:03 poikilotherm These things and things like common css should live in a toplevel
23:03 poikilotherm So maybe having a directory /map with subdirs for the map might make sense
23:03 pdurbin Right, top level /map or /maps
23:04 pdurbin and stuff underneath
23:04 pdurbin But where would your cluster map go? /maps/cluster/index.html?
23:04 pdurbin Where will my stamen watercolor map go? /maps/stamen/index.html?
23:04 poikilotherm /map might be easier to read when using it in a url, but looking at the repo /maps is easier ;-)
23:05 pdurbin I'm ok with /map
23:06 pdurbin Anyway, I'm heading out into the cold but please think about it. :)
23:07 poikilotherm Most certainly such a flat structure makes sense. Most likely the maps will have different data files attached to them, will need js files, plugins, ... All of this needs to be referenced in an index.html
23:08 poikilotherm Replicating it for every map while moving common js/css/img into toplevel sounds like a solid first step
