Time 
            S 
            Nick 
            Message 
         
        
01:47  
     
 
xarthisius joined #dataverse 
 
        
01:47  
     
 
xarthisius joined #dataverse 
 
        
02:23  
     
 
sivoais joined #dataverse 
 
        
08:29  
     
 
jri joined #dataverse 
 
        
08:30  
     
 
poikilotherm joined #dataverse 
 
        
09:03  
     
 
stefankasberger joined #dataverse 
 
        
09:43  
     
 
MrK joined #dataverse 
 
        
10:00  
     
 
icarito[m] joined #dataverse 
 
        
10:00  
     
 
juancorr joined #dataverse 
 
        
10:00  
     
 
poikilotherm joined #dataverse 
 
        
10:00  
     
 
stefankasberger joined #dataverse 
 
        
10:00  
     
 
jri joined #dataverse 
 
        
10:00  
     
 
xarthisius joined #dataverse 
 
        
10:00  
     
 
bricas joined #dataverse 
 
        
10:00  
     
 
andrewSC joined #dataverse 
 
        
10:00  
     
 
pmauduit joined #dataverse 
 
        
10:00  
     
 
larsks joined #dataverse 
 
        
10:01  
     
 
poikilotherm joined #dataverse 
 
        
10:01  
     
 
juancorr joined #dataverse 
 
        
10:01  
     
 
icarito[m] joined #dataverse 
 
        
10:01  
     
 
Youssef_Ouahalou joined #dataverse 
 
        
10:01  
     
 
MrK joined #dataverse 
 
        
10:01  
     
 
pdurbin joined #dataverse 
 
        
10:04  
     
 
bjonnh joined #dataverse 
 
        
10:04  
     
 
JonathanNeal joined #dataverse 
 
        
12:11  
     
 
MrK joined #dataverse 
 
        
12:26  
     
poikilotherm 
Good morning, everyone :-)
 
        
12:34  
     
pdurbin 
mornin' 
 
        
12:50  
     
 
donsizemore joined #dataverse 
 
        
12:50  
     
poikilotherm 
pdurbin: I can report that Google OpenID Connect works flawlessly
 
        
12:51  
     
pdurbin 
phew 
 
        
12:53  
     
pdurbin 
I'm reading through your comments. I'm not particularly interested in a pull request that only removes star imports. 
 
        
12:54  
     
pdurbin 
I am wondering about next steps for this OIDC stuff though. 
 
        
12:54  
     
pdurbin 
What more is needed for https://github.com/IQSS/dataverse/issues/5974  ? 
 
        
12:55  
     
pdurbin 
Is the epic over? :) 
 
        
12:57  
     
poikilotherm 
That depends on what we want to achieve 
 
        
12:57  
     
poikilotherm 
We do have basic support now 
 
        
12:57  
     
poikilotherm 
But no groups 
 
        
12:58  
     
poikilotherm 
No custom claims/attributes 
 
        
12:58  
     
pdurbin 
oh 
 
        
12:58  
     
poikilotherm 
No refactored JSON  
 
        
12:58  
     
poikilotherm 
No verified email address support AFAIK 
 
        
12:59  
     
poikilotherm 
No good tests 
 
        
12:59  
     
pdurbin 
Should we add it to dataverse-ansible so we can play with it when we spin up Dataverse on EC2? I've never seen it working. 
 
        
12:59  
     
poikilotherm 
Sure, why not 
 
        
13:00  
     
poikilotherm 
I really love that new flexibility 
 
        
13:00  
     
poikilotherm 
Add providers with configuration, not code changes 
 
        
13:00  
     
pdurbin 
But is there some sort of test IdP we can use? This was the main thing I was trying to get across as I passed the pull request to QA... that we can't test it unless we have an IdP. And ideally there's a free one in the cloud we can use. Like https://samltest.id  
 
        
13:00  
     
poikilotherm 
For manual testing we can always use Google or similar 
 
        
13:01  
     
poikilotherm 
What I would like to see is automated testing 
 
        
13:01  
     
poikilotherm 
The basic support is ok with Google 
 
        
13:01  
     
poikilotherm 
But I do have mapping of groups and attributes in mind 
 
        
13:01  
     
pdurbin 
Is it possible to use the new OIDC provider with https://samltest.id  ? (Or is that crazy talk.) 
 
        
13:02  
     
poikilotherm 
Nope. 
 
        
13:03  
     
poikilotherm 
samltest.id is SAML only 
 
        
13:04  
     
poikilotherm 
For automated testing we could just use a small docker based container 
 
        
13:04  
     
poikilotherm 
Like keycloak 
 
        
13:05  
     
pdurbin 
ok, and what about for demos? 
 
        
13:06  
     
poikilotherm 
People could either use Google or the OIDC playground 
 
        
13:06  
     
poikilotherm 
https://openidconnect.net/  
 
        
13:08  
     
pdurbin 
Interesting. Maybe you should put https://openidconnect.net  in the dev guide under a future version of http://guides.dataverse.org/en/4.18.1/developers/remote-users.html  
 
        
13:09  
     
pdurbin 
If https://demo.dataverse.org  were to be powered by dataverse-kubernetes some day which login options would appear for people to try out? 
 
        
13:11  
     
poikilotherm 
Wouldn't that be up to you, depending on what you want to offer? 
 
        
13:11  
     
poikilotherm 
It's just a matter of configuration, right? 
 
        
13:13  
     
pdurbin 
I guess. But I don't know what's supported, what's automated. Let's say we spin it up fresh once a week (or once a month). Can we automate the setup of as many auth providers as possible?
 
        
13:13  
     
poikilotherm 
Sure. 
 
        
13:14  
     
poikilotherm 
Dataverse on K8s is still dataverse 
 
        
13:14  
     
poikilotherm 
I don't have a job yet to load the provider JSON files into Dataverse
 
        
13:14  
     
poikilotherm 
But that will happen anyway :-D 
 
        
13:15  
     
pdurbin 
nice 
 
        
13:15  
     
poikilotherm 
It would be really cool to have a proper configuration option for all of this 
 
        
13:16  
     
poikilotherm 
Like store your configuration for the provider somewhere, but retrieve the client credentials from somewhere else because secrets... 
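A minimal sketch of the split described here, assuming the non-secret provider settings live in a JSON document and the client credentials come from environment variables. The field names (`clientId`, `clientSecret`) and the variable names (`OIDC_CLIENT_ID`, `OIDC_CLIENT_SECRET`) are hypothetical and are not taken from Dataverse's provider format.

```python
import json
import os


def build_provider_config(public_json, env=None, prefix="OIDC"):
    """Merge non-secret provider settings with credentials from the environment."""
    env = os.environ if env is None else env
    config = json.loads(public_json)
    # Hypothetical names: the secret is never written into the stored JSON,
    # it is only added to the in-memory copy at load time.
    config["clientId"] = env[f"{prefix}_CLIENT_ID"]
    config["clientSecret"] = env[f"{prefix}_CLIENT_SECRET"]
    return config
```

A missing variable raises `KeyError`, so a misconfigured deployment fails early instead of registering a provider without credentials.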
 
        
13:16  
     
pdurbin 
Oh, I should mention that https://demo.dataverse.org  is already a blessed Research & Scholarship Service Provider by InCommon, which should make automation easier. There is no need to exchange metadata as long as we use the same keys. 
 
        
13:17  
     
poikilotherm 
Great 
 
        
13:17  
     
poikilotherm 
For a future demo service, we could think about providing a demo IDM connected to different providers 
 
        
13:17  
     
pdurbin 
You can see it here: https://incommon.org/custom/federation/info/entity.html?entityID=https%3A%2F%2Fdemo.dataverse.org%2Fsp&technical=true  
 
        
13:17  
     
poikilotherm 
So we have a better showcase of what we can do...
 
        
13:17  
     
pdurbin 
Yes! Exactly! A better showcase over all. 
 
        
13:18  
     
pdurbin 
I forget if dataverse-kubernetes supports https://github.com/IQSS/dataverse-sample-data  or not. 
 
        
13:18  
     
pdurbin 
not yet: https://github.com/IQSS/dataverse-kubernetes/issues/66  
 
        
13:18  
     
poikilotherm 
Nope, it does not yet, because I need an API key...
 
        
13:19  
     
poikilotherm 
Or endpoint 
 
        
13:20  
     
pdurbin 
Can you use https://github.com/IQSS/dataverse-sample-data/blob/ca7eca8d93da42ca1735551001684b34cc9a6b6b/get_api_token.py  ? 
 
        
13:21  
     
poikilotherm 
Err... Don't you have to use an API key for this already?
 
        
13:21  
     
pdurbin 
nope 
 
        
13:21  
     
poikilotherm 
https://github.com/IQSS/dataverse-sample-data/blob/ca7eca8d93da42ca1735551001684b34cc9a6b6b/get_api_token.py#L5-L6  
 
        
13:22  
     
poikilotherm 
Ok so you just pass in an empty token? 
 
        
13:22  
     
pdurbin 
whoops, those lines can be deleted. you pass in a password (admin1) 
 
        
13:23  
     
poikilotherm 
I could for sure use that. 
 
        
13:23  
     
poikilotherm 
Would you do a refactoring? 
 
        
13:24  
     
poikilotherm 
It would be awesome not to have user and password hardcoded, but either use an env var and/or parameter 
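One way the request above could look, as a sketch only: command-line flags take precedence and environment variables act as the fallback. The flag names and the `DATAVERSE_USER` / `DATAVERSE_PASSWORD` variables are assumptions for illustration, not what get_api_token.py actually uses.

```python
import argparse
import os


def resolve_credentials(argv=None, env=None):
    """Return (username, password): CLI flags win, environment variables are the fallback."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Resolve Dataverse login credentials")
    # DATAVERSE_USER and DATAVERSE_PASSWORD are hypothetical variable names.
    parser.add_argument("--user", default=env.get("DATAVERSE_USER"))
    parser.add_argument("--password", default=env.get("DATAVERSE_PASSWORD"))
    args = parser.parse_args(argv)
    if not args.user or not args.password:
        # parser.error prints a usage message and exits with status 2.
        parser.error("set --user/--password or DATAVERSE_USER/DATAVERSE_PASSWORD")
    return args.user, args.password
```

Calling `resolve_credentials()` without arguments reads the real command line and environment, so nothing needs to be hardcoded in the script.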
 
        
13:24  
     
pdurbin 
oh, in that silly script? sure 
 
        
13:25  
     
poikilotherm 
Didn't I create a different version of the config or sth like that with such things? 
 
        
13:25  
     
pdurbin 
Please be advised that you have to enable :AllowApiTokenLookupViaApi like donsizemore and I did at https://github.com/IQSS/dataverse-ansible/pull/82  
 
        
13:25  
     
poikilotherm 
Something in the back of my head...
 
        
13:25  
     
poikilotherm 
Ah that undocumented and bad setting? 
 
        
13:25  
     
poikilotherm 
Didn't we two talk about that a while ago? 
 
        
13:26  
     
pdurbin 
I think it's documented. Yes, it is bad. :) 
 
        
13:26  
     
poikilotherm 
Here you go with that thing in the back of my head https://github.com/poikilotherm/dataverse-sample-data/blob/dockerize/dvconfig.py.sample
 
        
13:26  
     
pdurbin 
docs here: http://guides.dataverse.org/en/4.18.1/installation/config.html#allowapitokenlookupviaapi  
 
        
13:26  
     
poikilotherm 
I could continue where I left off...
 
        
13:27  
     
pdurbin 
Well, what's in focus right now? :) 
 
        
13:28  
     
poikilotherm 
Actually I was going to create docker images for 4.17, 4.18 and 4.18.1 
 
        
13:28  
     
pdurbin 
That sounds like a much higher priority. :) 
 
        
13:28  
     
poikilotherm 
So I could as well make a little stop and get that thing going 
 
        
13:28  
     
pdurbin 
Provides much more value. 
 
        
13:28  
     
poikilotherm 
Looks pretty easy 
 
        
13:29  
     
pdurbin 
gotta love that low hanging fruit 
 
        
13:29  
     
poikilotherm 
Would be a cool addition for a release of the docker images... ;-) 
 
        
13:29  
     
pdurbin 
Yeah. I should go look again at your milestones. I remember some cool stuff coming. 
 
        
13:29  
     
poikilotherm 
As I have no access to dockerhub iqss org: maybe you could create a repo for the docker image? 
 
        
13:29  
     
poikilotherm 
(And give me & dataversebot access to it?) 
 
        
13:30  
     
pdurbin 
I'm pretty sure you have full access. 
 
        
13:30  
     
poikilotherm 
I do? On Docker Hub? 
 
        
13:30  
     
poikilotherm 
Let me check 
 
        
13:30  
     
pdurbin 
If I'm wrong I can fix it. 
 
        
13:30  
     
pdurbin 
No one at IQSS pushes to it. :) 
 
        
13:31  
     
poikilotherm 
:-D 
 
        
13:31  
     
poikilotherm 
No I have no admin access to the Docker org 
 
        
13:31  
     
poikilotherm 
Just to my two repos 
 
        
13:32  
     
pdurbin 
Ah, you're right. You're a member. And someone who isn't at IQSS anymore is an owner. 
 
        
13:33  
     
poikilotherm 
:-D 
 
        
13:33  
     
pdurbin 
dataversebot is a member 
 
        
13:34  
     
pdurbin 
Ok, now you're an owner. 
 
        
13:34  
     
pdurbin 
And now I feel more comfortable removing the former IQSSer so that I'm not the only owner. :) 
 
        
13:35  
     
poikilotherm 
LOL 
 
        
13:35  
     
poikilotherm 
I'm honoured 
 
        
13:35  
     
poikilotherm 
-u 
 
        
13:35  
     
pdurbin 
please make it awesome 
 
        
13:35  
     
poikilotherm 
Yes Sir! 
 
        
13:36  
     
pdurbin 
You've already taken us a long, long way. 
 
        
13:36  
     
poikilotherm 
Oh while I look at the org page: what shall we do with that drunken ~~sailor~~ dead images? 
 
        
13:37  
     
pdurbin 
Are you talking about dataverse-glassfish and dataverse-solr? The old stuff? 
 
        
13:38  
     
poikilotherm 
Aye 
 
        
13:38  
     
pdurbin 
Hmm, do we mention them in the guides? 
 
        
13:38  
     
poikilotherm 
https://i.imgur.com/zyaqxap.png  
 
        
13:38  
     
pdurbin 
Yeah, they are mentioned all over http://guides.dataverse.org/en/4.18.1/developers/containers.html  
 
        
13:39  
     
poikilotherm 
Yeah that's what I have been looking at... 
 
        
13:39  
     
pdurbin 
So let's rewrite that "containers" page first (probably from scratch). And *then* delete that old cruft. 
 
        
13:39  
     
poikilotherm 
Right 
 
        
13:39  
     
poikilotherm 
Ok back to samples. 
 
        
13:40  
     
poikilotherm 
Any preferences about the image name? 
 
        
13:40  
     
pdurbin 
nope 
 
        
13:40  
     
poikilotherm 
iqss/sample-data-loader? 
 
        
13:40  
     
poikilotherm 
iqss/sample-loader? 
 
        
13:41  
     
poikilotherm 
iqss/deploy-sample-data? 
 
        
13:41  
     
pdurbin 
iqss/dataverse-sample-data-loader? 
 
        
13:41  
     
pdurbin 
because there's more at IQSS than just Dataverse :) 
 
        
13:41  
     
poikilotherm 
Ok then let's stick with the repo name 
 
        
13:41  
     
poikilotherm 
iqss/dataverse-sample-data 
 
        
13:41  
     
poikilotherm 
That should be fine 
 
        
13:42  
     
poikilotherm 
Sounds good? 
 
        
13:44  
     
pdurbin 
ship it! 
 
        
13:50  
     
poikilotherm 
:-) 
 
        
15:02  
     
poikilotherm 
Almost there... 
 
        
15:03  
     
poikilotherm 
The uploading and publishing is pretty slow, isn't it? 
 
        
15:03  
     
pdurbin 
yeah, and it's only going to get slower as we add more data 
 
        
15:03  
     
pdurbin 
more diverse data hopefully 
 
        
15:04  
     
pdurbin 
from a variety of scientific fields 
 
        
15:04  
     
poikilotherm 
Guesses why it is that slow? 
 
        
15:04  
     
poikilotherm 
Uploading the real data is obviously limited by processing, but creating the datasets is slow, too
 
        
15:05  
     
poikilotherm 
File ingest seems to be slow... Many many locks waiting 
 
        
15:06  
     
pdurbin 
Where are you running Dataverse? Within minikube on your laptop? 
 
        
15:06  
     
poikilotherm 
Aye 
 
        
15:07  
     
pdurbin 
Is it IO bound? CPU bound? Memory bound?
 
        
15:07  
     
poikilotherm 
I'm not sure. I doubt it's IO, this is a fast SSD
 
        
15:07  
     
poikilotherm 
The cluster has 4 GB of RAM, should be OK too
 
        
15:08  
     
poikilotherm 
So most likely CPU... I gave the VM 2 CPUs and my laptop has only 2 CPUs
 
        
15:08  
     
poikilotherm 
I dunno if ingest is using parallel processing at all 
 
        
15:08  
     
pdurbin 
Maybe some day we should work on https://github.com/IQSS/dataverse/issues/4201  :) 
 
        
15:10  
     
poikilotherm 
https://github.com/IQSS/dataverse-sample-data/pull/14#discussion_r354368459  
 
        
15:11  
     
poikilotherm 
Should we ping donsizemore on this and ask for his opinion? 
 
        
15:12  
     
pdurbin 
Well, I already created a read only team and added him to it. Then I clicked "request review" from him. 
 
        
15:12  
     
poikilotherm 
Making it the way I use it now results in me calling "API_TOKEN=`python get_api_token.py` python create_sample_data.py"
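For the receiving side of that command line, a rough sketch of how a script could prefer `API_TOKEN` from the environment and fall back to a value from its config file. The function name `pick_api_token` and the fallback order are assumptions, not the code in the pull request.

```python
import os


def pick_api_token(config_token=None, env=None):
    """Prefer API_TOKEN from the environment, else fall back to the configured value."""
    env = os.environ if env is None else env
    token = env.get("API_TOKEN") or config_token
    if not token:
        raise RuntimeError("no API token: set API_TOKEN or configure one")
    # Strip stray whitespace in case the token was captured from command output.
    return token.strip()
```

With this order, the config file keeps working for manual runs while automated jobs can inject the token without editing any file.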
 
        
15:13  
     
poikilotherm 
Great 
 
        
15:13  
     
pdurbin 
yeah, that's cleaner than what I was doing which was to manually update the config file 
 
        
15:15  
     
poikilotherm 
Jesus, that dataset with the "DE1_0_2008_Beneficiary_Summary_File_Sample_1.csv" files is taking AGES 
 
        
15:16  
     
poikilotherm 
Ah, it's https://github.com/IQSS/dataverse-sample-data/tree/master/data/dataverses/cms/datasets/cmssampledata
 
        
15:16  
     
poikilotherm 
Seems to be huge... 
 
        
15:18  
     
donsizemore 
@poikilotherm whut i do? 
 
        
15:18  
     
donsizemore 
(we're bringing matthew into the fold today!)
 
        
15:20  
     
poikilotherm 
donsizemore: pdurbin is complaining that I break dataverse ansible sample data loading... 
 
        
15:20  
     
poikilotherm 
See https://github.com/IQSS/dataverse-sample-data/pull/14  
 
        
15:20  
     
pdurbin 
poikilotherm: I said "Do not include files larger than 10 MB" at https://github.com/IQSS/dataverse-sample-data/blob/ca7eca8d93da42ca1735551001684b34cc9a6b6b/CONTRIBUTING.md  :)
 
        
15:21  
     
poikilotherm 
pdurbin: OK. Shouldn't we remove that one from the default config? 
 
        
15:21  
     
poikilotherm 
or at least move them into a separate field? 
 
        
15:21  
     
poikilotherm 
Whatever 
 
        
15:22  
     
poikilotherm 
Meh. Error 500 
 
        
15:23  
     
pdurbin 
poikilotherm: well, each file is under 10 MB, right? So that sample dataset is not in violation of the rules.
 
        
15:23  
     
poikilotherm 
OK 
 
        
15:23  
     
poikilotherm 
Looks like I should enable FAKE provider by default for K8s 
 
        
15:24  
     
poikilotherm 
Registration is failing... ;-) 
 
        
15:24  
     
poikilotherm 
That led to an Error 500
 
        
15:24  
     
pdurbin 
poikilotherm: on a related note, you might like the graphs donsizemore sent around yesterday about the amount of time it takes to ingest various sizes of files (more and more observations or columns) 
 
        
15:25  
     
poikilotherm 
I missed those... dataverse-dev mailinglist? 
 
        
15:25  
     
pdurbin 
it was non-SLOPI 
 
        
15:28  
     
poikilotherm 
pdurbin: CMS used ZIP files...
 
        
15:28  
     
pdurbin 
a loophole? :) 
 
        
15:28  
     
poikilotherm 
The unpacked data flowing through ingest is 10 files of ~15 MB  
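The loophole can be checked mechanically. A small sketch, using only the standard `zipfile` module, that lists archive members whose uncompressed size exceeds a limit; reading "10 MB" as 10 MiB and the function name `oversized_members` are assumptions.

```python
import zipfile

LIMIT_BYTES = 10 * 1024 * 1024  # "10 MB" read as 10 MiB, which is an assumption


def oversized_members(zip_source, limit=LIMIT_BYTES):
    """Return (name, uncompressed_size) for each archive member above the limit."""
    with zipfile.ZipFile(zip_source) as archive:
        return [
            (info.filename, info.file_size)  # file_size is the uncompressed size
            for info in archive.infolist()
            if info.file_size > limit
        ]
```

`zip_source` can be a path or a file-like object, so the same check works on files in the repository and on in-memory archives.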
 
        
15:29  
     
poikilotherm 
https://i.imgur.com/0cGOIVe.png  
 
        
15:37  
     
pdurbin 
meh, I think I'll allow it 
 
        
15:37  
     
donsizemore 
do we want a dataverse-ansible flag to raise the default upload filesize limit? 
 
        
15:37  
     
pdurbin 
donsizemore: couldn't hurt. What's the default? 
 
        
15:39  
     
 
MrK joined #dataverse 
 
        
15:42  
     
donsizemore 
oh, i had assumed 10MB 
 
        
15:42  
     
donsizemore 
in any case, we can make a configurable flag pretty easily 
 
        
15:42  
     
pdurbin 
meh, let's wait until it's a problem :) 
 
        
15:44  
     
poikilotherm 
pdurbin: you might be tempted to take a look at the new commit... 
 
        
15:46  
     
poikilotherm 
What would you like me to do about docs? 
 
        
15:46  
     
pdurbin 
Looking. And I don't think get_api_token.py has any docs. :) 
 
        
15:47  
     
poikilotherm 
Yeah. That's why I'm asking 
 
        
15:47  
     
poikilotherm 
Just add stuff to README? 
 
        
15:47  
     
pdurbin 
sure, if you feel like it 
 
        
15:48  
     
poikilotherm 
Feel like what? Letting down contribution quality? No way ;-) 
 
        
15:48  
     
pdurbin 
:) 
 
        
15:48  
     
pdurbin 
donsizemore: did you see the bit about the regex? poikilotherm is breaking our toys. :) 
 
        
15:48  
     
* poikilotherm 
giggles 
 
        
15:49  
     
poikilotherm 
/me feels like https://upload.wikimedia.org/wikipedia/commons/7/77/Rumplestiltskin_-_Anne_Anderson.jpg  
 
        
15:56  
     
pdurbin 
poikilotherm: do you want me to talk about your pull request at standup? It's in 20 minutes. Or should we move it to "community dev" and let you and donsizemore think about it more. And me. 
 
        
15:56  
     
poikilotherm 
Feel free to do as you please 
 
        
15:57  
     
poikilotherm 
Finishing touches to docs 
 
        
15:57  
     
poikilotherm 
This still needs the Jenkinsfile pipeline
 
        
15:57  
     
poikilotherm 
And job 
 
        
15:58  
     
 
MrK joined #dataverse 
 
        
15:59  
     
poikilotherm 
https://github.com/poikilotherm/dataverse-sample-data/tree/13-dockerize#usage-in-automated-processes-without-api-key  
 
        
15:59  
     
poikilotherm 
Feel free to leave a comment on the PR ;-) 
 
        
15:59  
     
poikilotherm 
I'm outta here now... 
 
        
16:00  
     
poikilotherm 
My three steel uprights are arriving in ~20 minutes :-D 
 
        
16:00  
     
poikilotherm 
Read you tomorrow 
 
        
16:01  
     
pdurbin 
donsizemore: lemme know when you're ready for me to bring you up to speed on the regex thing. No rush. :) 
 
        
17:21  
     
donsizemore 
@pdurbin we're about to take matthew to lunch, and i see the links above, but... what broke? 
 
        
17:31  
     
pdurbin 
donsizemore: nothing broke. Something will break after we merge something. Some day. Please tell Matthew I said what's up. And enjoy lunch!