Time
S
Nick
Message
07:46
YSF17 joined #dataverse
07:55
jri joined #dataverse
07:57
poikilotherm joined #dataverse
08:08
juancorr joined #dataverse
09:32
dataverse-k8s joined #dataverse
09:38
dataverse-k8s left #dataverse
09:42
dataverse_k8s_95 joined #dataverse
09:44
dv_k8s_|60 joined #dataverse
09:44
dv_k8s_|60 left #dataverse
09:55
jri joined #dataverse
10:13
pdurbin joined #dataverse
10:15
pdurbin
bjonnh: yes, we can certainly talk about zips and Quarkus and Spring Boot and GraalVM this week. :)
11:31
yoh joined #dataverse
12:07
poikilotherm
Morning pdurbin :-)
12:07
poikilotherm
Hope you had a great weekend
12:22
donsizemore joined #dataverse
12:52
pdurbin
The demo went very well. Beautiful drive to spend the night with some friends who have a vacation house two hours away in the mountains surrounded by autumn color. How about you?
12:53
pdurbin
How about everybody here? Good weekend?
13:00
poikilotherm
I've been relaxing a bit. Caught a cold, slept some hours. Back on deck :-)
13:04
bjonnh
pdurbin: great
13:04
pdurbin
In baseball, when you're on deck, you're waiting for the batter to finish. You're next.
13:05
donsizemore
@pdurbin there's no crying in baseball!
13:07
bjonnh
regarding the zip, we have users that have datasets per molecule, and each of these datasets is RAW data. These contain a lot of files that are named the same but in different directories (not our choice, that's how they are produced and consumed). So when we upload double zips, everything gets unzipped and then dataverse complains that there are files with the same name. And if one wants to open one of
13:07
bjonnh
these datasets, they would have to manually download 20 files… At least on dataverse instance, I couldn't find a way to avoid that behavior.
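The name clash bjonnh describes can be reproduced without a Dataverse installation. The following is a minimal sketch, not Dataverse's own logic: it builds a small in-memory zip and lists every file name that appears in more than one directory. The helper names and the sample paths (`acquisition_1/data.raw` and so on) are hypothetical and chosen only for illustration.

```python
import io
import zipfile
from collections import defaultdict


def find_duplicate_basenames(zip_bytes: bytes) -> dict:
    """Map each file name that occurs more than once to its full paths in the archive."""
    paths_by_name = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no file content
            # Zip member names always use "/" as the separator.
            basename = info.filename.rsplit("/", 1)[-1]
            paths_by_name[basename].append(info.filename)
    # Keep only the names shared by two or more members.
    return {name: paths for name, paths in paths_by_name.items() if len(paths) > 1}


def build_sample_zip() -> bytes:
    """Build a hypothetical archive with same-named files in different directories."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("acquisition_1/data.raw", b"raw data 1")
        archive.writestr("acquisition_2/data.raw", b"raw data 2")
        archive.writestr("acquisition_1/params.txt", b"parameters")
    return buffer.getvalue()


if __name__ == "__main__":
    for name, paths in find_duplicate_basenames(build_sample_zip()).items():
        print(f"{name}: {', '.join(paths)}")
```

A check like this could be run on an archive before upload to see in advance which files would collide once the directory structure is flattened.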
13:07
pdurbin
donsizemore: when I Skyped with Sherry last week you should have heard me talking about my poor dad and how we watched the Cardinals lose to Sherry's team (the Nationals) while he was visiting (and keeping my kids up too late on a school night).
13:08
bjonnh
other than that, great weekend. And now I am feeding cats.
13:08
pdurbin
bjonnh: hmm, the API might have a force=true option. Do you know about https://github.com/IQSS/dataverse-sample-data ?
13:08
bjonnh
Yeah I can't really get my users to use the API ;)
13:09
pdurbin
bjonnh: sure but let's make a sample dataset to illustrate the problem. We could use some non-social science datasets in that sample data repo anyway.
13:09
bjonnh
Ok, I'll get that prepared
13:10
pdurbin
bjonnh: Thanks! For guidance on file sizes, please see https://github.com/IQSS/dataverse-sample-data/blob/master/CONTRIBUTING.md
13:13
bjonnh
size isn't really an issue here
13:13
bjonnh
oh 10M
13:13
pdurbin
Size is an issue for that sample data repo. We are trying to keep it lean and mean. :)
13:13
bjonnh
sorry
13:13
bjonnh
hmmm
13:13
bjonnh
ok
13:13
bjonnh
that should still work
13:13
pdurbin
Phew.
13:14
pdurbin
bjonnh: I'm happy to give you push access to the repo. Or you can fork it. Lemme know.
13:14
bjonnh
the problematic ones are <1M by acquisition (usually more in the 256K)
13:14
bjonnh
so 2 of these would be sufficient
13:15
pdurbin
Smaller is better for that repo.
13:15
poikilotherm
pdurbin bjonnh: you might look into Git LFS for larger datasets
13:15
poikilotherm
Git doesn't handle large binary data very well
13:16
pdurbin
poikilotherm: I want `git clone` to be as fast as possible. Would Git LFS help with that?
13:16
bjonnh
sure, but that's not an issue here. The issue is more on how to share sets of files that make sense only together
13:16
poikilotherm
pdurbin: yes. Definitely
13:16
pdurbin
poikilotherm: huh, ok. Please lead the way. :)
13:16
bjonnh
you've been voluntold :D
13:16
poikilotherm
That's why LFS has been invented in the first place :-D
13:17
bjonnh
yes there is LFS and I heard about another one recently that allows sharing files between repos as well (didn't test)
13:18
poikilotherm
Yeah, git annex probably
13:18
poikilotherm
LFS is the modern one ;-)
13:18
poikilotherm
And supported by Gitlab, github et al
13:18
poikilotherm
https://git-lfs.github.com/
13:18
poikilotherm
https://github.com/git-lfs/git-lfs/wiki/Tutorial
13:18
poikilotherm
https://help.github.com/en/github/managing-large-files/configuring-git-large-file-storage
13:18
pdurbin
The author of git annex and I hang out every year at LibrePlanet. All are welcome. The call for proposals is open. :)
13:19
pdurbin
We had burritos with yoh
13:19
pdurbin
Or was it tacos? Yes, tacos.
13:19
poikilotherm
Well I think it's a matter of support. GitHub and GitLab seem to support Git LFS only these days
13:19
poikilotherm
And people tend to use what's easier to use ;-)
13:21
donsizemore
@pdurbin how do we feel about disabling SSL cert verification in dataverse-metrics' download.py?
13:22
pdurbin
donsizemore: I'm feeling real good about it
13:23
donsizemore
@pdurbin 10/4 (i'm hoping to peck off dataverse-metrics #26)
13:24
pdurbin
donsizemore: thank you!!
13:25
pdurbin
donsizemore: while you're in there, do you want to add a couple more installations?
13:25
donsizemore
sure thing
13:25
pdurbin
thanks!
13:30
stefankasberger joined #dataverse
13:43
donsizemore
it's https://data.lipi.go.id — firefox and chrome like their cert, python doesn't
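For readers wondering what "disabling SSL cert verification" looks like in Python, here is a minimal sketch using only the standard library. It is not the code from dataverse-metrics' download.py, which may use a different HTTP client; the function names `build_ssl_context` and `fetch` are hypothetical.

```python
import ssl
import urllib.request


def build_ssl_context(verify: bool = True) -> ssl.SSLContext:
    """Return a default TLS context, optionally with certificate checks disabled."""
    context = ssl.create_default_context()
    if not verify:
        # check_hostname has to be switched off first; otherwise setting
        # verify_mode to CERT_NONE raises a ValueError.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def fetch(url: str, verify: bool = True, timeout: float = 30.0) -> bytes:
    """Download a URL, verifying the server certificate unless verify is False."""
    context = build_ssl_context(verify)
    with urllib.request.urlopen(url, context=context, timeout=timeout) as response:
        return response.read()
```

Keeping verification on by default and turning it off per call limits the exposure to the installations whose certificates Python rejects.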
13:45
donsizemore
also, and this could be a terrible thing, but they're using tawk.to to offer live archivist chat on their DV homepage
13:54
pdurbin
They are also one of four installations on the map that are using the FAKE DOI provider. :(
14:11
pdurbin
donsizemore poikilotherm: would it be too heavy-handed for update-data.py in https://github.com/IQSS/dataverse-installations to emit a warning for those four installations?
14:12
poikilotherm
Warning? What kind of warning?
14:12
pdurbin
"Installation [hostname] is on the map but using test DOIs."
14:17
pdurbin
We could have other warnings too:
14:17
pdurbin
"Installation [hostname] doesn't have a description."
14:18
pdurbin
"Installation [hostname] doesn't have a launch year."
14:18
pdurbin
Does that make sense?
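The three warnings pdurbin proposes could be produced by a small check like the one below. This is a sketch under stated assumptions, not the actual update-data.py: the field names `hostname`, `doi_provider`, `description` and `launch_year`, and the marker value `"FAKE"`, are hypothetical stand-ins for whatever the installations data really uses.

```python
# Assumed marker for installations that issue test DOIs (hypothetical value).
FAKE_DOI_PROVIDER = "FAKE"


def lint_installation(installation: dict) -> list:
    """Return the warning messages that apply to one installation record."""
    hostname = installation.get("hostname", "[unknown hostname]")
    warnings = []
    if installation.get("doi_provider") == FAKE_DOI_PROVIDER:
        warnings.append(f"Installation {hostname} is on the map but using test DOIs.")
    if not installation.get("description"):
        warnings.append(f"Installation {hostname} doesn't have a description.")
    if not installation.get("launch_year"):
        warnings.append(f"Installation {hostname} doesn't have a launch year.")
    return warnings


if __name__ == "__main__":
    sample = {"hostname": "data.example.org", "doi_provider": "FAKE", "description": ""}
    for message in lint_installation(sample):
        print(message)
```

Because each rule is an independent `if`, further checks can be appended without touching the existing ones.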
14:21
poikilotherm
Well it would make sense to create a linter for the props
14:21
poikilotherm
All of 'em
14:21
poikilotherm
This would be very useful for any kind of PR validation ;-)
14:21
poikilotherm
BTW did you see my suggestion about switching to geoJSON?
14:22
pdurbin
Sure but I care much more about some of the fields than others. Right now I want launch year: https://github.com/IQSS/dataverse-installations/issues/7
14:22
poikilotherm
Well, a linter can be extended ;-)
14:23
poikilotherm
IMHO we should discuss switching to geoJSON first, because refactoring scripts directly after creating them is frustrating... ;-)
14:25
pdurbin
The script is tiny right now. I'm not worried. And you should include the maintainer, shlake, in this discussion. :) Also, I have a thought for you on GeoJSON (it's already supported by Dataverse, in a way) but let me finish a couple things first.
14:26
poikilotherm
Sure. I'm hacking on k8s docs
14:28
pdurbin
poikilotherm: here, I just tagged you in this comment: https://github.com/IQSS/dataverse-installations/issues/7#issuecomment-546971764
14:38
poikilotherm
pdurbin: https://dataverse-k8s.readthedocs.io/en/latest/index.html
14:39
poikilotherm
You might notice the little IRC badge and what's behind it ;-)
14:40
pdurbin
poikilotherm: is the idea that someone with a nick like "dataverse_k8s_88" will probably have a question about dataverse-kubernetes?
14:41
poikilotherm
Aye
14:41
poikilotherm
That nick is auto-generated to avoid nick duplication
14:41
poikilotherm
But I thought it might be easier to track if someone is coming that way
14:42
pdurbin
Sure, let's try it.
14:42
pdurbin
We believe in experimentation. We believe in science. :)
14:42
* poikilotherm
*thumbs up*
14:44
poikilotherm
We could point people to a dataverse.org installation of a chat UI, but as long as it has memory leaks, I thought it would be better to point them to some other places... ;-)
14:46
pdurbin
absolutely, thank you
14:46
stefankasberger joined #dataverse
14:46
pdurbin
poikilotherm: just left a comment about geojson: https://github.com/IQSS/dataverse-installations/issues/19
15:05
pdurbin
donsizemore: thanks for https://github.com/IQSS/dataverse-metrics/pull/34 ... do you have time for a quick question about it (and pull requests in general)? :)
15:05
donsizemore
sure
15:06
donsizemore
(i had actually PMd you in slack to ask a couple questions)
15:07
pdurbin
Oh! Sorry, I don't always launch Slack right away. I try to be SLOPI first. :)
15:08
pdurbin
Mostly I just wanted to let you and all other people who make pull requests in here know that we have recently switched to using the magic "closes" or "fixes" keywords in the description of pull requests. They are completely optional and I don't always use them if I don't want the issue to be automatically closed.
15:09
donsizemore
to paraphrase Mary Poppins: "I never close or fix anything"
15:09
pdurbin
This was part of our move from Waffle to GitHub Projects.
15:09
pdurbin
heh
15:10
donsizemore
but I can start using them if you like
15:10
pdurbin
Sure, if you want manual mode, don't use those magic words. :)
15:10
pdurbin
It's really up to you. I'm just trying to explain the recent change.
15:10
pdurbin
I think we wrote about it in the dev guide. Lemme check.
15:11
pdurbin
Huh. I guess we haven't updated the guides yet.
15:13
pdurbin
Oh, it isn't in the dev guide (but probably should be). I found it.
15:14
donsizemore
well, see, I always _think_ I fix something, then...
15:14
poikilotherm
donsizemore pdurbin would you let me know what you think about https://dataverse-k8s.readthedocs.io/en/latest/get-started/demo/k3s.html from a standpoint of someone that's interested and wants to learn from the docs?
15:14
pdurbin
'Note that we use the "closes" syntax below to trigger Github's automation to close the corresponding issue once the pull request is merged.' https://github.com/IQSS/dataverse/blob/v4.17/PULL_REQUEST_TEMPLATE.md
15:15
pdurbin
poikilotherm: do those docs close or fix https://github.com/IQSS/dataverse/issues/4665 ? :)
15:15
poikilotherm
Yes. Definitely
15:16
poikilotherm
These docs are really maturing now
15:16
poikilotherm
Slava et al will use it in TimeMaschine etc
15:23
donsizemore
@poikilotherm looks great, though "Pick your poison" may sound daunting
15:26
poikilotherm
Good catch.
15:26
poikilotherm
Corrected
15:26
poikilotherm
(Might take a few minutes till visible online)
15:29
andrewSC joined #dataverse
15:43
xarthisius
pdurbin: I saw your question about "Preview". I'm not sure WT fits in the same category as PDF/csv viewers, but that's your UI :)
16:31
pdurbin
xarthisius: I guess I was thinking that you could use the "Preview" screen real estate to advertise Whole Tale if you want, especially since the "Preview" tab will become the new default (the "Metadata" tab is currently the default).
16:32
pdurbin
Here's the latest "Preview" code. Feedback is welcome! http://ec2-34-230-54-112.compute-1.amazonaws.com:8080/file.xhtml?fileId=8&version=1.0
16:38
xarthisius
I'll discuss it with the team, but I'm not sure what we could show there
16:39
xarthisius
I'm personally fine with just being an option in the "Explore" button in the top-right corner :)
17:27
pdurbin
xarthisius: just at the dataset level, right?
18:10
donsizemore
@pdurbin knock knock?
18:13
pdurbin
donsizemore: hi! And sorry I never replied in Slack. We can talk there if you prefer.
18:13
donsizemore
nah, i'm good. you wanted me to try payara 5 again, which I just did with 4.17
18:14
pdurbin
donsizemore: oh! And? :)
18:14
donsizemore
I got the same results you did in 5907, though it looks like @poikilotherm hit the nail on the head in 6216
18:15
donsizemore
the only thing my stupid self would have to add is that, even though i specify the FAKE DOI provider, my first errors in the log are a login failure and resulting response chiding from DataCite
18:15
donsizemore
i'm wondering a) whether the problems are related / sequential, and b) which is hit first. probably DataversePage.init()
18:16
pdurbin
Did you test with 5.192?
18:16
pdurbin
Sorry, Payara 5.192?
18:16
donsizemore
5.193.1
18:17
donsizemore
so i'm wondering which issue to update, and whether what i found is even helpful
18:17
pdurbin
Oh! Interesting! And Dataverse deployed ok, it sounds like. I had bad luck with one run using Payara 5.193 (no .1). Maybe they fixed something. :)
18:18
donsizemore
it deployed but there's no root dataverse (and i can't seem to create one manually)
18:18
pdurbin
donsizemore: I'm re-reading what I wrote back then at https://github.com/IQSS/dataverse/pull/6220#issuecomment-535219151
18:18
pdurbin
"note that I'm using payara-5.193.zip for the first time instead of payara-5.192.zip"
18:19
donsizemore
hmmm. i used dataverse-ansible but things went more or less okay
18:19
pdurbin
Do you feel like trying that branch? https://github.com/poikilotherm/dataverse/tree/6216-broken-dataverse-view ?
18:20
pdurbin
I think the most useful thing would be to try that branch of Oliver's and leave a comment on the results.
18:21
donsizemore
sure thing - give me just a minute
18:21
pdurbin
thank you!
18:23
donsizemore
oh, if you want me to do a git reset HEAD or whatever on the OdumInstitute fork and submit that PR again i will
18:23
donsizemore
i was testing against the develop branch on our fork when i should've been testing from one branch to another
18:25
donsizemore
running ^^ but with payara-5.193.1
18:28
pdurbin
latest payara is perfect, thanks! and yeah a fresh pull request would be great if you don't mind
18:29
donsizemore
testing a CentOS 8 AMI is also on my list (i'm finally having a quiet week!)
18:31
pdurbin
Nice. Would now be a good time to go through the list we talked about and close issues, etc.?
18:32
donsizemore
i hit a couple of them. let me make a pass before i pester you and gustavo again
18:32
pdurbin
sounds good
18:38
donsizemore
bah. died in solr config. yeah, the PR fell behind, but i don't want to step on @poikilotherm's toes
18:42
pdurbin
Ah, want me to leave a comment asking him to merge the latest?
18:43
donsizemore
either that or i can check out the branch, merge it, push it to my own branch and test
18:51
donsizemore
(which i'm doing now)
18:52
pdurbin
sure!
19:26
donsizemore
http://ec2-3-87-120-91.compute-1.amazonaws.com
19:27
donsizemore
same results i got on develop - first warning is a login failure with datacite
19:28
pdurbin
Ok, so this is 6216-broken-dataverse-view-9ae45f4 ... this means it's the latest from https://github.com/IQSS/dataverse/pull/6220 with some commits merged from develop?
19:34
donsizemore
it merged cleanly with develop (and produced the same behavior as develop using payara-5.193.1)
19:35
pdurbin
Ok. Seems like it needs more work, I guess. That pull request, I mean. It doesn't seem to fix anything.
19:37
donsizemore
i can attach the logs if that's helpful
19:38
pdurbin
That's ok. Mostly I just want to ask Oliver about the status of it. But thanks for testing it! Should we leave a comment in the pull request or just wait to talk to him here?
19:43
donsizemore
@pdurbin eh, i tested it with payara-5.193.1; he submitted the PR against glassfish-4.1
19:47
pdurbin
hmm, ok
19:47
pdurbin
He wants us to get off Glassfish 4. The comment you just left is perfect. :)
19:47
donsizemore
(I can test that as well if you like)
19:47
pdurbin
No, I'd say we're done here. Done with that pull request until we hear back from him. Thanks again for testing.
19:49
pdurbin
donsizemore: the most important thing I learned today from your testing is that Dataverse deploys to Payara-5.193.1. That means we can work on stuff like https://github.com/IQSS/dataverse/issues/5794 if we feel like it.
19:51
donsizemore
i suppose next is to find the correct modification to logging.properties to see why root dataverse creation is dying
19:51
pdurbin
well, hold on... the root dataverse is successfully created by the installer (via API), right?
19:54
donsizemore
i thought the root dataverse was the validation error
19:54
donsizemore
and the homepage reports "Dataverses (null)"
19:55
pdurbin
well, if we dig into the database, I assume there's a single row in the "dataverse" table for the root
19:57
donsizemore
you're right. 1 | | root | UNCATEGORIZED | The root dataverse. | t | f | t | Root | t | f | t | 6 |
19:58
pdurbin
Also, I betcha you can create more dataverses via API.
19:58
pdurbin
What's broken is JSF.
19:58
donsizemore
i tried to create one with the sample JSON from the guide, but it threw an "entities missing(?)" error
19:58
pdurbin
yikes
19:59
pdurbin
Does the API test suite pass on it?
19:59
pdurbin
or at least DataversesIT?
20:00
donsizemore
oh, if i use the JSON from the guide and the create-dataverse example from the guide, i get
20:00
donsizemore
"Error parsing Json: JsonParser#getObject() or JsonParser#getObjectStream() is valid only for START_OBJECT parser state. But current parser state is VALUE_STRING"}d
20:00
pdurbin
sounds like a bug in the guides :(
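The parser error quoted above says the top-level JSON value it read was a string (VALUE_STRING) where an object (START_OBJECT) was expected. The log does not show what request body was actually sent, so the following is only a sketch of a pre-flight check that would flag that situation before calling the API. The function name `require_json_object` and the sample payloads are hypothetical.

```python
import json


def require_json_object(payload: str) -> dict:
    """Parse payload and confirm that its top-level value is a JSON object."""
    value = json.loads(payload)
    if not isinstance(value, dict):
        # Valid JSON, but a string, list or number rather than {...}.
        raise ValueError(f"expected a JSON object, got {type(value).__name__}")
    return value


if __name__ == "__main__":
    # A body that is an object passes the check.
    print(require_json_object('{"alias": "science", "name": "Science"}'))
    # A body that is a quoted string is valid JSON but not an object.
    try:
        require_json_object('"dataverse.json"')
    except ValueError as error:
        print(error)
```

Running a check like this on the exact bytes that are about to be posted can separate a malformed request from a genuine server-side problem.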
20:00
donsizemore
lemme throw the whole thing away and try on IQSS/dataverse-develop with API test suite enabled
20:01
pdurbin
sounds good
20:01
pdurbin
I expect the Private URL test to fail because it exercises JSF.
20:01
pdurbin
That test uses a Private URL token to see if it can read the title from the HTML of a dataset.
20:01
pdurbin
But if JSF is broken, this will fail.
20:02
donsizemore
it's running, but it will take a while. i'll check back after the gym!
20:02
pdurbin
awesome, awesome, thanks so much
21:21
donsizemore
so, develop on payara-5.193.1 with api test suite enabled: 2 failed
21:22
donsizemore
AccessIT: org.opentest4j.MultipleFailuresError: Multiple Failures (2 failures) Cannot get property 'files' on null object expected:<200> but was:<403>
21:23
donsizemore
DataversesIT: edu.harvard.iq.dataverse.api.DataversesIT:282
21:37
pdurbin
Not bad! Payara here we come? :)