Time
S
Nick
Message
07:30
juancorr joined #dataverse
07:46
juancorr joined #dataverse
08:42
Benjamin_Peuch joined #dataverse
08:43
Benjamin_Peuch
Hello everybody.
08:43
Benjamin_Peuch
Does anyone here use Dataverse's Provenance feature with PROV-JSON files?
08:59
jri joined #dataverse
10:02
poikilotherm
donsizemore: FZJ has 134.94.0.0/16
10:02
poikilotherm
donsizemore: please remember to allow Github, too ;-)
10:03
poikilotherm
Dunno if they have public known ranges
10:03
poikilotherm
Wow they have an API for that! https://help.github.com/en/github/authenticating-to-github/about-githubs-ip-addresses
10:30
stefankasberger joined #dataverse
10:32
stefankasberger
Hey guys. Am in the middle of the Dataverse upgrades. Have set up some nice pytest selenium tests for Dataverse. We also created some test data, and I wrote some pyDataverse scripts to handle it. So I am pretty happy about what is going on. I guess some of this could be interesting to some of you. Will try to open up as much as possible after the upgrade. Not that easy, because some internal stuff is inside (copyright issues, security issues).
10:50
poikilotherm
Hi Stefan :-)
10:51
poikilotherm
Good to hear from you and that you're making progress.
10:51
poikilotherm
Wish you all the best :-)
11:05
pkiraly joined #dataverse
12:48
GustavoMartins joined #dataverse
12:54
poikilotherm
Hi GustavoMartins
13:09
GustavoMartins
Hi poikilotherm
13:09
poikilotherm
How may we help you?
13:10
poikilotherm
Not many guys from IQSS around due to vacation...
13:20
GustavoMartins
I'm new to the Dataverse world. Currently I'm studying the docs, forums, GitHub issues and IRC logs to learn and see if I can contribute in any way.
13:28
poikilotherm
:-D You're most welcome
13:28
poikilotherm
Usually a few people from around the world hang out here
13:28
poikilotherm
At least a part of the community exchanges thoughts and ideas, helps others with their installation and problems etc
13:29
poikilotherm
Also lots of dev talk
13:29
poikilotherm
As our "Chief Community Officer" (I call him that) Philip Durbin is not here, it's a bit quiet these days
13:29
poikilotherm
He'll be back next week
13:30
poikilotherm
If you have any questions, just paste them here. We're all pretty responsive, but sometimes timezones are difficult.
13:32
poikilotherm
Oh, and there are a ton of ways to contribute.
13:35
pkiraly
@GustavoMartins: in the issue queue you can find tickets labelled as "Help wanted: ..." In those tickets help is extremely welcome. You can help by reproducing things, revising/modifying documentation, or coding
13:38
pkiraly
GustavoMartins: if you are familiar with Solr, you can take a look at this one: https://github.com/IQSS/dataverse/issues/5989 . It requires a change in the Solr config and we are waiting for feedback on whether the suggested change works for you or not.
13:39
poikilotherm
pkiraly: is this a working PoC?
13:41
pkiraly
poikilotherm, yes
13:41
poikilotherm
Nice!
13:42
poikilotherm
I could create a container image flavor with it if you want
13:42
poikilotherm
Oh wait, that needs a Dataverse image too
13:43
pkiraly
poikilotherm: and it is a preliminary step to improve Dataverse usage of Solr in schema management, i.e. a smoother management of custom metadata blocks
13:43
poikilotherm
Nice!
13:54
poikilotherm
pkiraly I read the issue again - are you saying it's sufficient to switch to ManagedIndexSchemaFactory?
13:55
poikilotherm
No changes to Dataverse code necessary?
13:57
pkiraly
No Dataverse changes needed
13:57
poikilotherm
So what happens when the metadata schemas change?
13:57
poikilotherm
How does Solr deal with changed fields, etc?
13:58
poikilotherm
Can I transparently switch from the old way to the new?
13:58
poikilotherm
(Without needing to reindex)
13:59
pkiraly
In Solr you can change the schema in 3 ways (after this change): 1) the same way as before, by editing the schema file (currently called schema.xml; after the change it is called managed-schema) 2) via the API 3) via the Solr admin UI
14:00
poikilotherm
Aha! So switching to the other factory is a prerequisite for changing the Dataverse code to make use of the API
14:00
pkiraly
Theoretically there is no need to reindex after this change, because we change neither the index nor the schema.
14:01
pkiraly
poikilotherm: yes. Right now Solr's Schema API is turned off. Once it is on, we can modify Dataverse to use this API.
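A minimal sketch of what a call to Solr's Schema API could look like once the managed schema is enabled. It only builds the HTTP request for an "add-field" command and does not send it. The base URL, the core name "collection1", the field name "exampleCustomField" and the field type "text_en" are assumptions chosen for illustration, not values taken from the discussion above.

```python
import json
import urllib.request


def add_field_request(solr_base, core, name, field_type,
                      stored=True, multi_valued=False):
    """Build (but do not send) a Schema API request that adds one field.

    The Schema API accepts a JSON body whose top-level key is the command,
    here "add-field". Solr only accepts it when the schema is managed and mutable.
    """
    payload = {
        "add-field": {
            "name": name,
            "type": field_type,
            "stored": stored,
            "multiValued": multi_valued,
        }
    }
    return urllib.request.Request(
        url=f"{solr_base}/{core}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # All four arguments are hypothetical example values.
    req = add_field_request("http://localhost:8983/solr", "collection1",
                            "exampleCustomField", "text_en")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the payload before pointing it at a real Solr instance.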
14:02
poikilotherm
Does the change deal with the XML includes?
14:04
pkiraly
I don't know. I did not try it
14:05
poikilotherm
Do you feel like creating an issue for me at IQSS/dataverse-kubernetes?
14:06
poikilotherm
I'd be willing to create a flavor for it. Will need some tweaks to some scripts for updating the new file, as long as the API is not used
14:06
pkiraly
OK, I'll create it
14:06
poikilotherm
Thx
14:09
Benjamin_Peuch
Thanks for the info about Philip's whereabouts, poikilotherm. Calling him a CCO sounds just right to me. :)
14:09
Benjamin_Peuch
(As long as it's not CC0. :p)
14:10
poikilotherm
That's why I sent him https://twitter.com/philipdurbin/status/1214240481527250945
14:18
Benjamin_Peuch
Hahaha, that's one awesome gift.
14:18
Benjamin_Peuch
You got the font of the Dataverse logo just right.
14:19
Benjamin_Peuch
Do you remember which one it is?
14:22
poikilotherm
Lemme go looking
14:28
poikilotherm
Ok, the Western-like font is "Clarendon Condensed"
14:28
poikilotherm
And the other one is "Myriad Pro"
14:28
poikilotherm
IIRC both by Adobe
14:28
poikilotherm
Found free substitutes
14:45
donsizemore joined #dataverse
14:47
donsizemore
@poikilotherm just added you to our "jenkins" zone =) and yes allowed webhook ranges as specified at https://api.github.com/meta
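A minimal sketch of how an allowlist like the one described above could be assembled and checked. It assumes the meta document exposes webhook ranges under a "hooks" key holding a list of CIDR strings. The function names are hypothetical, and the sample range 192.0.2.0/24 is a documentation placeholder, not a real GitHub range.

```python
import ipaddress
import json
import urllib.request

META_URL = "https://api.github.com/meta"  # URL quoted in the log
FZJ_RANGE = "134.94.0.0/16"               # range quoted in the log


def fetch_hook_ranges(url=META_URL):
    """Download the meta document and return its "hooks" CIDR list."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp).get("hooks", [])


def build_allowlist(cidrs):
    """Parse CIDR strings (IPv4 or IPv6) into network objects."""
    return [ipaddress.ip_network(cidr, strict=False) for cidr in cidrs]


def is_allowed(ip, allowlist):
    """Return True if the address falls inside any allowed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in allowlist)


if __name__ == "__main__":
    # Placeholder range for offline use; swap in fetch_hook_ranges() to use live data.
    sample_hooks = ["192.0.2.0/24"]
    allowlist = build_allowlist([FZJ_RANGE] + sample_hooks)
    for ip in ("134.94.1.10", "192.0.2.55", "8.8.8.8"):
        print(ip, is_allowed(ip, allowlist))
```

Because the published ranges can change over time, a script like this would typically be rerun periodically rather than hard-coding the result.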
14:47
poikilotherm
Nice!
14:47
poikilotherm
Thanks @donsizemore
14:48
poikilotherm
Anything we could do about re-enabling access for everyone?
14:49
donsizemore
the problem (at minimum) is that even with read-only access everything is served through the tomcat webapp, and bots managed to keep both CPUs pegged at 100% and exhaust its resources
14:50
donsizemore
better to focus on publishing test results to github (which i'm doing in the background)
14:50
donsizemore
or otherwise take a Blosxom publishing model (scripts write flat HTML to serve publicly)
15:01
poikilotherm
donsizemore: are you seeing requests from single bots or more like a DDoS pattern?
15:07
Benjamin_Peuch
Thanks for the names of the fonts, poikilotherm.
15:08
poikilotherm
Asking because the NGINX directives limit_req and limit_req_zone can help with rate limiting without fiddling with firewall rules
15:08
poikilotherm
Benjamin_Peuch: sure. No problem. It took the help of whatfontis.com to find the best matches ;-)
15:12
Benjamin_Peuch
Oh, I thought you knew because you had designed the badge?
15:13
Benjamin_Peuch
Since Philip thanked you personally.
15:14
donsizemore
@poikilotherm they also break page rendering
15:15
poikilotherm
Benjamin_Peuch: yeah, I did, but I used the official Dataverse logo for that. And the SVG containing it only has paths, no font information left. So I had to reverse-engineer ;-)
15:15
Benjamin_Peuch
Clever. :o
15:16
poikilotherm
Once I had the fonts, I could add the other text elements ;-)
15:16
stefankasberger
@all: Short question regarding Solr upgrading: We need to update Solr from 4.6.0 to 7.3.1. Would you recommend upgrading in place or doing a fresh install? I have no experience with Solr so far, so I don't know in detail how it works inside, and together with Dataverse.
15:19
poikilotherm
Stefan let me send you a few links regarding solr upgrades
15:20
poikilotherm
https://lucene.apache.org/solr/guide/7_7/major-changes-from-solr-5-to-solr-6.html#major-changes-from-solr-5-to-solr-6
15:21
poikilotherm
https://lucene.apache.org/solr/guide/7_7/major-changes-in-solr-7.html#major-changes-in-solr-7
15:21
poikilotherm
Depending on the number of datasets, you might be better off doing a complete reindex with Solr 7
15:22
poikilotherm
It does take a while, but you might end up doing it anyway
15:23
donsizemore joined #dataverse
15:23
poikilotherm
Kevin mentioned in https://github.com/IQSS/dataverse/pull/6631#issuecomment-585931103 that it takes ~18h to reindex Harvard, which is ~6k datasets IIRC
15:24
donsizemore
@poikilotherm @stefankasberger doesn't have a huge number of datasets. the simplest thing is to do a clean install of 7.3.1 and reIndexAll
15:25
poikilotherm
stefankasberger: what donsizemore says :-D
15:26
poikilotherm
donsizemore: regarding breaking page rendering: wouldn't those bots be blocked by nginx rate limiting before they reach tomcat?
15:27
donsizemore
@poikilotherm yes, but to impose any effective limit on the bots also prevents browsers from loading all the various icons on jenkins views
15:27
poikilotherm
O.O
15:28
donsizemore
(which is to say, yesterday I tried a limit of 3 requests/second, then removed the limit)
15:28
donsizemore
and to make the limit effective against bots the limit would need to be much more stringent
15:28
poikilotherm
What about adding a cache?
15:29
poikilotherm
So the icons etc would be served from cache, not Tomcat
15:31
poikilotherm
I heard lots of good things about Varnish
15:36
donsizemore
nginx was blocking the icons (requests per second)
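A simplified model of why a 3 requests/second limit can block a browser loading many icons at once. It imitates a leaky-bucket limiter with an optional burst allowance. This is a sketch for illustration, not NGINX's actual implementation, and the class name and the figure of 20 simultaneous icon requests are assumptions.

```python
class LeakyBucketLimiter:
    """Simplified leaky-bucket rate limiter for a single client."""

    def __init__(self, rate, burst=0):
        self.rate = float(rate)    # permitted requests per second
        self.burst = float(burst)  # extra requests tolerated above the rate
        self.excess = 0.0          # requests currently "queued" in the bucket
        self.last = None           # time of the last accepted request

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) passes."""
        if self.last is None:
            self.last = now
            return True
        # The bucket drains at `rate` per second; each request adds one unit.
        excess = max(0.0, self.excess - self.rate * (now - self.last) + 1.0)
        if excess > self.burst:
            return False
        self.excess = excess
        self.last = now
        return True


def count_accepted(limiter, arrival_times):
    """Count how many requests at the given arrival times are accepted."""
    return sum(1 for t in arrival_times if limiter.allow(t))


if __name__ == "__main__":
    page_load = [0.0] * 20  # hypothetical: 20 icon requests fired at once
    print("no burst:", count_accepted(LeakyBucketLimiter(3), page_load))
    print("burst=20:", count_accepted(LeakyBucketLimiter(3, burst=20), page_load))
```

In this model a burst allowance lets a single page load through while the long-run rate stays capped. Whether that would be strict enough against the bots described above is a separate question.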
16:56
stefankasberger joined #dataverse
17:04
juliangautier joined #dataverse
17:06
juliangautier
Hi everyone! I used vagrant up to get a copy of Dataverse on my laptop, but can't figure out what the default username and password is. Would anyone here know or have any guesses? I've tried admin, admin1 and dataverseAdmin in all sorts of combinations :)
17:10
stefankasberger
Thanks @poikilotherm and @donsizemore. Will do the reIndexAll with our 140 datasets. :)
17:11
poikilotherm
juliangautier: it should be user "dataverseAdmin" and password "admin" or "admin1"
17:11
poikilotherm
stefankasberger: great! :-)
17:14
juliangautier
poikilotherm: Thanks! I'll try that out
17:35
stefankasberger joined #dataverse
18:46
donsizemore joined #dataverse
18:47
donsizemore
@juliangautier if you used vagrant, look in tests/group_vars/vagrant.yml for dataverse.adminpass (probably "admin1" as @poikilotherm says)
19:01
juliangautier
donsizemore: Thanks!
19:25
stefankasberger joined #dataverse