Time
S
Nick
Message
07:50
jri joined #dataverse
08:20
stefankasberger joined #dataverse
09:05
poikilotherm joined #dataverse
09:05
MrK joined #dataverse
09:33
jri joined #dataverse
11:10
jri joined #dataverse
11:13
poikilotherm
Good morning America :-)
11:13
poikilotherm
Welcome aboard flight DV4181
11:15
jri_ joined #dataverse
11:52
pdurbin
poikilotherm: heh. Back on deck?
11:53
poikilotherm
TADA :-D
11:53
pdurbin
Great!
12:01
pdurbin
poikilotherm: it looks like you merged the latest into https://github.com/IQSS/dataverse/pull/6365 . Are you ready for me to send it to QA?
12:01
poikilotherm
Sure
12:04
pdurbin
poikilotherm: done! Can you please leave a comment on the pull request about how to test the changes? kcondon was asking questions like "Is Shib affected?"
12:04
poikilotherm
Sure. As soon as I've finished that little thing for a colleague :-)
12:04
poikilotherm
Thanks for sending me that question :-)
12:05
poikilotherm
Any more like that around? These are easy to answer
12:12
pdurbin
Um. Did you see I gave you a shout out at https://groups.google.com/d/msg/dataverse-community/uKretKox_io/4FyPVAMYBgAJ ? :)
12:21
MrK joined #dataverse
12:29
poikilotherm
Yeah :-)
12:29
poikilotherm
Alright, that tiny other job is done...
12:29
poikilotherm
Now back to Dataverse hacking
12:30
pdurbin
:)
12:36
poikilotherm
Danny wrote: can you provide some guidance about areas of the code that had heavy changes and areas that you see as particularly complex and that have more risk
12:36
poikilotherm
I'm puzzled
12:37
pdurbin
Yeah, that's why I pinged you.
12:37
pdurbin
Don't worry about it.
12:37
poikilotherm
Aye
12:37
pdurbin
You can just say roughly what functionality you touched.
12:37
poikilotherm
That sounds like you guys really scratched your head over what I've done...
12:37
poikilotherm
Ok
12:38
pdurbin
QA will be testing for regressions.
12:38
poikilotherm
Maybe, just maybe, this would be a good thing to express via labels or in the description of a PR, so a template thing.
12:38
pdurbin
But not regressions across every bit of functionality in the app.
12:38
pdurbin
Sure, sounds fine.
12:39
poikilotherm
The K8s people and others these days often flag things with labels like "feature/x" and "risk/y" etc
12:40
poikilotherm
https://github.com/GoogleContainerTools/skaffold/pulls
12:40
poikilotherm
https://github.com/kubernetes/kubernetes/pulls
12:43
pdurbin
I can't find any examples of "feature/x" or "risk/y" but I'll take your word for it. :)
12:44
poikilotherm
It's not precisely what they use, but we could adapt it to such labels
12:44
pdurbin
ok
12:45
pdurbin
You've done a good job of making code review easy. Small diff. Tests added. Now you can try to make QA easy. "Based on what I changed, I suggest testing the following:"
12:45
poikilotherm
K8s is using "feature" in a more differentiated manner like "sig" and "area"
12:46
poikilotherm
That sounds really nice as a template string
12:46
poikilotherm
Want me to create an issue?
12:47
pdurbin
sure!
12:48
poikilotherm
Or should I add a comment to https://github.com/IQSS/dataverse/issues/6226
12:49
pdurbin
hmm, that would probably be better
12:57
poikilotherm
Done. https://github.com/IQSS/dataverse/issues/6226#issuecomment-557073166
12:58
pdurbin
looks good, thanks
12:58
poikilotherm
Sure.
12:59
poikilotherm
Another thing that would be very helpful for the community :-)
12:59
poikilotherm
Better communication :-)
13:00
poikilotherm
Should I add my comments about testing for #6365 in a comment or in the description?
13:00
poikilotherm
Comments sometimes tend to get lost
13:02
pdurbin
In this case I think a new comment would be best. Since there has already been some chatter in that pull request.
13:03
poikilotherm
Hmm... Let's do both. I'll add it like you had it above as a section caption to the description and add a comment with pings about the changed description.
13:03
pdurbin
QA tries to read all of the comments in the issue and all of the comments in the pull request but sometimes there are many, many comments (especially in issues) and what QA wants is an understanding of the latest code, the code that actually needs to be tested. Not code that has changed over and over in code review.
13:05
pdurbin
(Very little of this applies in your pull request, which is small and targeted. And basically no chatter in the issue.)
13:05
poikilotherm
Yeah. But I get your point
13:06
poikilotherm
So might it be an idea to require people to flag the description as QA ready?
13:06
poikilotherm
So QA doesn't have to read all the stuff in the comments?
13:06
poikilotherm
If I were to do QA I would hate to scroll through all of the comments
13:07
poikilotherm
Like it happened in the PR for the Microsoft OAuth2 stuff
13:12
poikilotherm
I added some thoughts to https://github.com/IQSS/dataverse/issues/6226#issuecomment-557078413
13:15
pdurbin
good thoughts, good additions
13:16
pdurbin
In practice, Kevin usually stops by to chat with me about pull requests that I've either made or reviewed. So what's written is important but a quick chat also helps.
13:16
poikilotherm
Chatting is always useful
13:17
poikilotherm
Or even a video call
13:17
pdurbin
yep
13:20
donsizemore joined #dataverse
14:03
MrK joined #dataverse
14:08
amay02 joined #dataverse
14:10
amay02
Quick question: for self-deposited objects is it possible for admins to mediate the metadata at a later date?
14:23
pdurbin
amay02: well, a popular workflow is to allow authors the ability to create datasets and fill in as much metadata as possible and then click "Submit for Review" at which point a curator takes a look and either clicks "Return to Author" or "Publish". At any time, the curator can make edits to the dataset. You can read a bit more about this at
14:23
pdurbin
http://guides.dataverse.org/en/4.18/user/dataset-management.html#submit-for-review
14:34
amay02
Thanks!
14:55
donsizemore
@pdurbin good morning @poikilotherm guten Tag — do y'all have a minute to talk about #6124?
14:56
poikilotherm
Sure
14:56
poikilotherm
Go ahead
14:56
* poikilotherm
looks at java.time.Clock and others for faking clock in unit tests, so tests don't fail anymore...
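[editor's sketch] The java.time.Clock idea mentioned above can be illustrated roughly as follows. This is a minimal sketch, not code from the Dataverse code base: the DeadlineChecker class and the dates are hypothetical, chosen only to show how injecting a Clock lets a unit test pin "now" so its result no longer depends on the day it runs.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;

public class FixedClockSketch {

    // Hypothetical class under test: it asks an injected Clock for "today"
    // instead of calling LocalDate.now() with the system clock.
    static class DeadlineChecker {
        private final Clock clock;

        DeadlineChecker(Clock clock) {
            this.clock = clock;
        }

        // True if the clock's current date is strictly after the deadline.
        boolean isExpired(LocalDate deadline) {
            return LocalDate.now(clock).isAfter(deadline);
        }
    }

    public static void main(String[] args) {
        // Arbitrary instant chosen for illustration; Clock.fixed always returns it.
        Clock fixed = Clock.fixed(Instant.parse("2019-11-21T12:00:00Z"), ZoneOffset.UTC);
        DeadlineChecker checker = new DeadlineChecker(fixed);

        System.out.println(checker.isExpired(LocalDate.of(2019, 11, 20))); // true
        System.out.println(checker.isExpired(LocalDate.of(2019, 11, 21))); // false
    }
}
```

In production code the same class would presumably be given Clock.systemDefaultZone() or Clock.systemUTC(); only the tests would pass a fixed clock.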
14:57
donsizemore
should i go ahead and cobble http://guides.dataverse.org/en/4.17/developers/testing.html#measuring-coverage-of-integration-tests into dataverse-ansible or do we want to pursue the maven route
14:59
poikilotherm
For me, running this via Maven makes most sense, as it is independent. I can reuse it in other tools like Docker, k8s etc
15:00
poikilotherm
But I don't know details about your plans of "cobbling it into ansible".
15:00
poikilotherm
Please enlighten me with more details
15:01
donsizemore
pete and phil had been doing this https://github.com/IQSS/dataverse/blob/738405892ec90d23c61774e402be9fc3fceb7bcc/doc/sphinx-guides/source/_static/util/instrument_war_jacoco.bash
15:04
poikilotherm
Ok so they do offline instrumentation, right?
15:05
poikilotherm
pdurbin: was there a particular reason to do so instead of using the agent variant?
15:05
donsizemore
correct. and i see this https://automationrhapsody.com/code-coverage-with-jacoco-offline-instrumentation-with-maven/
15:05
poikilotherm
Seems like it boils down to https://www.jacoco.org/jacoco/trunk/doc/agent.html vs https://www.jacoco.org/jacoco/trunk/doc/offline.html
15:07
pdurbin_m joined #dataverse
15:07
pdurbin_m
poikilotherm: no particular reason apart from getting anything working quickly
15:08
poikilotherm
pdurbin_m: did you try the agent approach and fail?
15:08
poikilotherm
That might be a timesaver ;-)
15:10
pdurbin_m
I did not.
15:10
pdurbin_m
I confirmed that Pete's approach worked and added to his docs.
15:11
poikilotherm
OK.
15:11
poikilotherm
Any reason why you went with the JaCoCo CLI for instrumentation instead of a Maven goal?
15:12
donsizemore
the agent would be launched by a JVM option, and not something we'd want for general use warfiles, correct?
15:13
poikilotherm
donsizemore: yes. It's just like I do for JRebel, JMX profiling et al
15:13
poikilotherm
You need to place the agent JAR in some usable place and configure domain.xml to start the JVM with the agent
15:13
poikilotherm
Pretty straightforward. On-the-fly instrumentation.
15:13
donsizemore
excellent
15:14
poikilotherm
IMHO we should try that one first
15:14
poikilotherm
It could save a lot of headaches
15:14
poikilotherm
Like how to collect the results, etc
15:14
donsizemore
i had an un-pushed branch to implement the work-around solution but i'll move it aside and fart around with the agent today
15:15
poikilotherm
pdurbin_m: do we need to provide a fallback solution / keep the instrumented variant alive?
15:17
donsizemore
@poikilotherm the agent doc above says "If you use the JaCoCo Ant tasks or JaCoCo Maven plug-in you don't have to care about the agent and its options directly. This is transparently handled by the them."
15:17
donsizemore
@poikilotherm so all i need is the jar and the jvm option?
15:17
poikilotherm
Beware
15:17
poikilotherm
This will only be true for running tests "locally"
15:18
poikilotherm
But you want to run tests on a remote end
15:18
poikilotherm
When we would use an embedded app server, we might benefit from that
15:18
poikilotherm
But you spin up everything remotely
15:19
poikilotherm
This might be a good example to get inspired from: https://github.com/piczmar/maven-jacoco-remote
15:22
poikilotherm
You should also think about NOT using surefire, but failsafe maven plugin for IT tests
15:22
poikilotherm
https://stackoverflow.com/questions/28986005/what-is-the-difference-between-the-maven-surefire-and-maven-failsafe-plugins
15:25
pdurbin_m
poikilotherm: again, I was just confirming that a solution works. I am able to get reports of API test code coverage now, through manual effort. The next step is to add it to Jenkins. :)
15:25
poikilotherm
:-)
15:26
poikilotherm
Yeah. :-)
15:26
poikilotherm
I think the easiest way will be to run the agent on the remote EC2 instance and collect the coverage report locally on the Jenkins machine when running the integration tests via Maven Failsafe
15:28
pdurbin_m
donsizemore: are you following all that? I'm still at the gym. :)
15:29
poikilotherm
If donsizemore finds it easier to instrument the classes first and load them as a package to EC2, that'd be fine too.
15:29
poikilotherm
Moving more stones, though
15:38
donsizemore
i'm following and like piczmar's stuff on principle
15:42
pdurbin
donsizemore: awesome
15:45
poikilotherm
pdurbin: https://github.com/IQSS/dataverse/pull/6365#issuecomment-557143466
15:47
pdurbin
poikilotherm: timeouts, huh? Are we still trying to get this into QA in the next half hour?
15:47
poikilotherm
I'll go on a hunt now. Kids waiting...
15:47
poikilotherm
Read you guys tomorrow
15:47
pdurbin
sounds like not :)
15:48
pdurbin
donsizemore: thanks for looking into the code coverage stuff. Again, I haven't been following the conversation very closely. Is there anything you need from me to help move it forward? Or anything else? Did you figure out which monitor to buy? :)
15:51
pdurbin
xarthisius: I just added https://github.com/jupyterhub/binderhub/pull/969 to the agenda for a JupyterHub/Binder Community Call that's supposed to start in a little over an hour. Right now it's first on the agenda! Are you interested in joining? Details at https://discourse.jupyter.org/t/jupyterhub-binder-community-call-november-2019/2471
15:51
donsizemore
@pdurbin so, workflow. who's going to run the tests? who's going to have access to the remote target? and in the spirit of babysteps/granular changes, is testing locally a good first step?
15:51
pdurbin
Everyone here is welcome, of course!
15:51
donsizemore
because the scope just increased (and for the better) but small changes are the name of the game
15:52
pdurbin
donsizemore: I'm going to need to grab a room to call in to that Binder call anyway so would you like me to give you a call first? Just after standup? Maybe around 11:30?
16:10
donsizemore
@pdurbin i'm about to grab lunch with a former boss, any chance this afternoon?
16:30
pdurbin
donsizemore: sure! That actually gives me time to hunt down a Dataverse DOI to try with Binder. Can you suggest any DOIs from UNC Dataverse that have a Jupyter Notebook or Python or R?
17:15
Jim95 joined #dataverse
17:19
Jim95
@pdurbin - Do you know if MDC/counter has been tested with log entries for non-published datasets? I'm getting errors since it looks like entries from events on non-published datasets are going into the MDC log, but without an identifier, etc. which then causes counter_processor to barf.
17:19
Jim95
If that's a real issue versus some misconfig/misunderstanding of the setup on my part, I can dig into it.
17:54
Jim95
actually, looking a bit more it may be specific api calls rather than specific datasets.
18:05
Jim95
Strangely https://dataverse-dev.tdl.org/api/v1/datasets/:persistentId/versions/:latest/files is one of those calls and I don't even see an MDC logging call in that method...
18:47
pdurbin
Jim95: hi! Sorry, I was just on a call about the Binderverse and haven't had lunch yet. There's free food on the floor below me that's going fast and then I'm going to take a little walk (first sunny day in quite a while). I don't think drafts should be logged. Also, can you help diagnose this TDL issue? https://groups.google.com/d/msg/dataverse-community/C-HLdPQwf70/JUNAZc2DBQAJ
18:54
donsizemore
@pdurbin scaring up a DOI now
18:54
donsizemore
@pdurbin in the mean time, check out Jon: https://youtu.be/ZvQQi2Z3hzI?t=441
19:00
Jim95
@pdurbin - no rush. I knew you were away. Note that I don't think it's draft stuff now - it's some (api?) calls. I see the same thing at QDR and TDL so I think it's a real issue though I'm confused at this point as to where the trigger for the MDC logging is...
19:01
Jim95
For the TDL issue - I can get you any info you need from the DB, but I don't know much about harvesting in general.
19:07
donsizemore
@pdurbin R https://dataverse.unc.edu/dataset.xhtml?persistentId=doi:10.15139/S3/YCSYUN
19:36
pdurbin
donsizemore: I have a good feeling this belongs on DataverseTV
19:36
pdurbin
Jim95: I was sort of hoping you could look at some export files on the file system, actually. Or we can talk MDC first. Up to you. :)
19:39
pdurbin
donsizemore: let's give this a try: https://mybinder.org/v2/zenodo/10.15139/S3/YCSYUN/
19:40
donsizemore
it looks kind of like co-ray-ray!
19:40
pdurbin
heh
19:42
pdurbin
donsizemore: oh, did you still want to do a quick video call?
19:43
donsizemore
whenever you have time
19:43
pdurbin
I have half an hour before my one on one with Danny. Please shoot me a zoom link or whatever if you're ready.
19:44
donsizemore
https://unc.zoom.us/my/sizemore
19:46
donsizemore
My session is taking longer than usual to start!
20:00
Jim95
@pd
20:00
Jim95
@pdurbin - what do you need me to look at on the file system?
20:06
Jim95
guessing - looks like the cached export files are from 2019-02-27 and didn't get updated when we went to 4.17 - does that help?
20:23
pdurbin
Jim95: yes! That was my question... why does the UI say one date but harvesting says another?
20:24
donsizemore
@pdurbin Mandy says: "One thing to note (not sure if this is for Phil or just a thing to note....) but when you click on a file and it opens it up in jupyter...the Visit Repo button has a broken link in it...it looks like it is throwing zenodo into the doi which is breaking the page, so it doesn't resolve to the SPPQ Dataset record."
20:30
Jim95
OK - I haven't followed the details of the issue, but assuming this means we need to re-export, I'll go chat with TDL to get that done...
21:03
pdurbin
cool
21:04
pdurbin
donsizemore: oh! I didn't know there's a Visit Repo button!
21:09
yoh_ joined #dataverse
21:09
pmauduit_ joined #dataverse
21:14
pdurbin
donsizemore: also, here are the HTML reports that you are already creating for us (thanks!) for *unit* test coverage: https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ws/target/site/jacoco/index.html
21:16
pdurbin
And if you scroll around any of the files you can see red vs yellow vs green on a line by line basis: https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ws/target/site/jacoco/edu.harvard.iq.dataverse.export.dublincore/DublinCoreExportUtil.java.html
21:46
pdurbin
https://dataverse.harvard.edu and https://demo.dataverse.org have been upgraded to Dataverse 4.18.1.