Time
S
Nick
Message
07:19
poikilotherm joined #dataverse
07:30
jri joined #dataverse
07:33
juancorr
Thanks @pdurbin, I will begin to study the connection between Dataverse and its external tools.
08:38
stefankasberger joined #dataverse
09:45
stefankasberger
@pdurbin: regarding checksums: I have read that Dataverse supports more checksum algorithms than md5. We would like to change to sha256. Is that reasonable? I ask because I think the files are also ingested with the md5 hashes as filenames. Does a switch to another checksum algorithm create issues, or is the file naming separated from the frontend display?
09:50
pdurbin
juancorr: there is no pull request yet, but here you can see the code that is being worked on: https://github.com/IQSS/dataverse/compare/3758-file-pg-preview
09:51
pdurbin
stefankasberger: I don't think there are any issues with switching to newer checksums.
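For comparing the algorithms mentioned here, a minimal Python sketch of computing a file checksum with a selectable algorithm may help. It uses only the standard library `hashlib`; the function name `file_checksum` is made up for illustration and says nothing about how Dataverse itself computes or stores checksums.

```python
import hashlib

def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Passing `algorithm="md5"` or `algorithm="sha1"` gives the older digests for the same file, which makes it easy to compare values side by side.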
09:53
pdurbin
poikilotherm: I tried to answer this Docker question but you might want to try as well: https://groups.google.com/d/msg/dataverse-community/RU-Y_JbYQIE/hy4J6bg7CgAJ
09:59
poikilotherm
Good morning :-)
10:00
* poikilotherm
goes looking
10:02
pdurbin
Thanks. And maybe you or stefankasberger can tell Slava about that issue.
10:02
poikilotherm
Uff. I don't think I'd be of much help there. Without any logs etc it's really hard to tell what potential issues he is suffering from
10:03
poikilotherm
Tell?
10:03
poikilotherm
You mean forward it to him?
10:03
poikilotherm
If "sharif" is going to open an issue in dataverse-docker I'm pretty sure Slava will see that
10:03
juancorr
Perfect @pdurbin. I will try it
10:15
kamil84 joined #dataverse
10:18
pdurbin
juancorr: thanks! Eventually we plan to work on https://github.com/IQSS/dataverse/issues/6210 but if you can help, that would be great!
10:19
pdurbin
poikilotherm: sure, I'm more thinking of the case where he doesn't open an issue. Time will tell, I guess.
10:20
poikilotherm
Aye
10:21
pdurbin
stefankasberger: a (possibly interesting) factoid is that in order to use the Data Capture Module, you have to switch from md5 to sha1 :)
10:26
poikilotherm
Oh, looks like I should change that by default in dataverse-kubernetes, right?
10:27
poikilotherm
MD5 and SHA1 are dead...
10:27
poikilotherm
RIP
10:27
pdurbin
Well, the DCM stuff is what https://github.com/IQSS/dataverse-kubernetes/issues/68 is about.
10:29
pdurbin
If you or stefankasberger feel strongly that Dataverse should use a specific checksum algorithm out of the box instead of md5 you could create an issue about this.
10:30
poikilotherm
Not necessarily.
10:30
poikilotherm
I saw lots of tools out there providing md5, sha1 and sha256 in parallel
10:30
poikilotherm
But that seems like a waste of CPU cycles
10:31
poikilotherm
MD5 is great on small boxes
10:31
pdurbin
:)
10:31
poikilotherm
Maybe one should add a section to the "run in production" part of the guide?
10:31
poikilotherm
A hint is fair I think
10:33
pdurbin
That "run in production" section was an afterthought I added at some point but I like it a lot.
10:34
pdurbin
poikilotherm stefankasberger: oh, did you see https://darus.uni-stuttgart.de ? It was mentioned at https://groups.google.com/d/msg/dataverse-community/nDMbMv4fKf4/P5YxHJzDBgAJ
10:36
poikilotherm
Yeah. They seem to be live now :-)
10:36
poikilotherm
Met Dorothea at Göttingen in summer
10:36
pdurbin
Not on the map yet but soon, I think. I emailed her. :)
10:36
* poikilotherm
*thumbs up*
10:58
poikilotherm
pdurbin donsizemore: I saw you guys talking about Jenkins the other day
10:59
poikilotherm
Or yesterday
10:59
poikilotherm
However
10:59
poikilotherm
You guys need help?
10:59
poikilotherm
I dunno if you took a look at my image jobs?
10:59
pdurbin
Well, I wasn't able to import a job. I think I'm missing some plugins.
11:00
poikilotherm
Ok
11:00
pdurbin
Part of the job were imported. Very strange behavior.
11:00
pdurbin
parts*
11:00
poikilotherm
It sounded like donsizemore was having trouble configuring proper jobs for PRs
11:01
pdurbin
Well, I think his main trouble is that he was focusing on his actual job. :)
11:01
poikilotherm
LOL
11:01
poikilotherm
I hear you
11:01
poikilotherm
:-D
11:01
pdurbin
We agreed to look closer next week or so.
11:02
pdurbin
But at a high level I'm interested in spreading the knowledge of how to install Jenkins. How anyone can run their own https://jenkins.dataverse.org . Slava or stefankasberger or anyone.
11:02
poikilotherm
How much logic are you guys actually putting inside the jobs?
11:03
poikilotherm
Most of my stuff is living in Jenkinsfile(s) now, using declarative pipelines
11:03
poikilotherm
Because that seems the way to go these days
11:04
poikilotherm
Lots of the job configuration then boils down to pointing to the Jenkinsfile and connecting to the repo
11:04
pdurbin
I don't know. I'm new to all of this. I do know that Jenkins can be used within OpenShift to do builds, which I imagine is similar to what you're doing.
11:05
poikilotherm
Jenkins running on OpenShift or Jenkins running a job in OpenShift?
11:06
pdurbin
They seem to call it "openshift pipeline builds" and it uses Jenkins: https://docs.openshift.com/container-platform/3.9/dev_guide/dev_tutorials/openshift_pipeline.html
11:07
poikilotherm
Ah I see.
11:07
poikilotherm
They are combining a few things there
11:07
pdurbin
poikilotherm: are you using webhooks? Do you send a webhook to your Jenkins server? Do you have a Jenkins server? A webhook from GitHub, I mean.
11:07
poikilotherm
Running tasks in a Jenkins pipeline in an OpenShift cluster
11:07
jri joined #dataverse
11:08
poikilotherm
1) Yes, 2) Yes, but mine is jenkins.dataverse.org, 3) Using community server, 4) yes a github webhook
11:08
poikilotherm
It's managed by the github pr plugin :-)
11:08
poikilotherm
Works pretty smooth
11:09
pdurbin
Ok, what's the status of https://github.com/IQSS/dataverse-jenkins/issues/11 ?
11:09
poikilotherm
For now I have been working on adding the jobs via the UI
11:10
poikilotherm
As I needed to work out how to do it in the first place :-D
11:11
poikilotherm
I just looked at the webhooks of dataverse-kubernetes repo: I do have 2 webhooks.
11:11
poikilotherm
One is for pushes and pr, the other for comments
11:11
poikilotherm
That's because of the functionality that plugin offers, making dataversebot ask for verification
11:13
pdurbin
If I need to install my own https://jenkins.dataverse.org some day, how easy would it be for me to recreate your dataverse-kubernetes job(s)?
11:16
poikilotherm
IMHO it should be fairly easy. It's a multibranch pipeline job.
11:16
poikilotherm
Obviously it can't hurt to have some docs :-D
11:16
poikilotherm
Like what plugins to install etc
11:17
poikilotherm
The tough logic of what happens INSIDE the job is version controlled in the Jenkinsfile
11:17
pdurbin
Docs on what to click in the GUI to add it? Or can you provide an XML file or whatever?
11:17
poikilotherm
It's more than just XML .
11:17
poikilotherm
You need to provide credentials
11:18
poikilotherm
Plugins can be installed from a list
11:18
poikilotherm
The job config can be saved as XML
11:19
poikilotherm
Hmm maybe it would be a good idea to have a look at persisting those steps as an ansible playbook
11:20
pdurbin
Have you seen import-job.sh at https://github.com/IQSS/dataverse-jenkins ? Do you want to add one for your stuff?
11:22
nils`` joined #dataverse
11:23
poikilotherm
Hmm its all about the other jobs. And its only importing the job config. No credentials, etc
11:23
poikilotherm
A playbook might be better to automate all steps
11:23
poikilotherm
Not extending the existing which installs Jenkins itself
11:29
jri_ joined #dataverse
11:30
stefankasberger
@pdurbin: it would be great to collect the metadata blocks in a GitHub repo or so. I plan to have all config-related files in a private repo, so I can use it for build/test activities. A master repo from IQSS, developed and maintained together, would be very helpful for that. Maybe just one for metadata blocks to begin with. I like granular repos for just one purpose as a foundation.
11:31
poikilotherm
I second that. It would be great to separate the current mixture of code and all those metadata things
11:32
poikilotherm
And we are going to create more metadata blocks on our way ahead.
11:32
poikilotherm
Oh and not to forget: Dorothea and some others are heavily engaged in an engineering metadata schema standard
11:33
poikilotherm
https://www.ub.uni-stuttgart.de/forschen-publizieren/forschungsdatenmanagement/projekte/dipl_ing/materials/output/MTSR2018_14_SchemberaIglezakis_EngMeta.pdf
11:58
jri joined #dataverse
12:01
stefankasberger
get that, but at least to have a list of metadata blocks already created, so you have orientation when you want to work one out for your own institution. and maybe later on then agree upon some standards. maybe. :)
12:15
pdurbin
poikilotherm: sure but imagine a world where Ansible doesn't exist. We would write a shell script, right? What would that shell script look like? :)
12:16
pdurbin
welcome nils``
12:16
poikilotherm
Ah shell scripts... The culprit of devops
12:16
pdurbin
stefankasberger: good idea. How do you edit your metadata blocks? The TSV files I mean. In vi? Using Google Spreadsheets? Something else?
12:17
pdurbin
poikilotherm: I'm just trying to unblock https://github.com/IQSS/dataverse-jenkins/issues/11 :)
12:19
Philipp joined #dataverse
12:19
pdurbin
poikilotherm: wow, so https://github.com/IQSS/dataverse-kubernetes/issues/65 is fixed?!? Great!
12:19
Philipp
Hi Phil!
12:19
pdurbin
Philipp: hello! Ready to make that pull request? :)
12:20
Philipp
Yes!
12:20
pdurbin
Perfect. Do you have Python 3 installed?
12:20
poikilotherm
Hi Philipp :-)
12:20
Philipp
Not sure. I probably have it on my linux machine, but I'm at work on windows now
12:20
poikilotherm
pdurbin: can we please upgrade to Solr 8? More sophisticated probes available :-D
12:21
pdurbin
poikilotherm: sure, please create an issue :)
12:21
Philipp
maybe it's easier if I start my linux machine?
12:22
pdurbin
Philipp: uh oh. I'm a little worried I'm not supporting Windows so well in my Python script. Let me go look at it. update-data.py in https://github.com/IQSS/dataverse-installations
12:22
Philipp
give me 2min
12:23
Philipp
note: I'm not a programmer and it's been a while since I took a python course...
12:23
pdurbin
Hmm, it should mostly work but I should probably change this... json_out = 'data/data.json' ... to change the / to something windows friendly.
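One common way to make such a path portable is to build it from its components rather than hard-coding the separator. The snippet below is only a sketch of that idea using the standard library; it is not necessarily the change that was made to update-data.py.

```python
import os

# Join the components so the separator matches the operating system:
# "data/data.json" on Linux and macOS, "data\data.json" on Windows.
json_out = os.path.join("data", "data.json")
```

Python on Windows generally accepts forward slashes in paths as well, so the original string would likely have worked, but `os.path.join` removes the doubt.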
12:24
Philipp left #dataverse
12:24
Philipp joined #dataverse
12:24
Philipp
back on linux now
12:25
Philipp
Phil?
12:27
pdurbin
Philipp: oh! I just pushed a commit to make it more Windows friendly. But you can try it on either. Up to you. :)
12:28
pdurbin
I don't have a Windows box so maybe you should test it for me.
12:28
Philipp
let's try on linux
12:28
Philipp
please guide!
12:28
pdurbin
Ok, first let's just get the code running. You'll want to clone the repo.
12:30
pdurbin
git clone https://github.com/IQSS/dataverse-installations.git
12:31
pdurbin
(We'll worry about forks and stuff later.)
12:32
Philipp
ah, now I remember why I gave this up last time. I've never cloned a repo. Do I just download the zip?
12:35
pdurbin
The zip is fine for now.
12:37
Philipp
OK. Downloaded. I'm in the terminal in the folder where the zip file is.
12:38
pdurbin
cool, now you need to unzip it
12:38
Philipp
is it unzip filename?
12:41
pdurbin
yeah
12:42
Philipp
done
12:44
pdurbin
Great. Now you should change to that directory: cd dataverse-installations
12:45
Philipp
THere :)
12:47
pdurbin
Ok, now please try this: python3 update-data.py
12:48
Philipp
seems to work, at least I don't get any error message
12:50
Philipp
the file path disappeared for while and came back. OK?
12:50
pdurbin
Great! Was the data.json file updated? To include the stuff you changed or added? It should be in a directory called "data".
12:52
Philipp
hm what did I add/change??
12:53
pdurbin
"launch_year": "2017",
12:53
pdurbin
updated description
12:54
Philipp
yes, sorry, I have been at work since 0530...
12:54
Philipp
the stuff is updated!
12:56
pdurbin
Nice! From here you have a couple options. You could simply copy and paste the contents of data.json into whatever is on GitHub. Or we could get you set up git stuff (fork the repo, git commit, git push, etc.). What do you think?
12:59
Philipp
I think for now let's do the first one
13:00
Philipp
where should I paste it?
13:00
pdurbin
First you should click "edit" (the pencil icon) at https://github.com/IQSS/dataverse-installations/blob/master/data/data.json
13:01
pdurbin
Then there should be an editor in your web browser you can use.
13:02
Philipp
OK, do I replace the existing text with the copy from my local data.json file?
13:02
pdurbin
Yeah, that's what I would suggest.
13:04
Philipp
I entered this in the note field: Updated information about the DataverseNO installation.
13:04
Philipp
then I just click on Propose file change, right?
13:11
pdurbin
yep
13:14
Philipp
And now I make a pull request?
13:14
pdurbin
yes please
13:16
sharif joined #dataverse
13:16
Philipp
Done. Thanks! Now, I have to run! See you!
13:16
Philipp left #dataverse
13:16
sharif
hello
13:16
pdurbin
Hmm, I feel like someone other than me should merge it, ideally. Maybe bricas_ or jri or juancorr or poikilotherm or stefankasberger
13:17
pdurbin
Hi, Sharif. Thanks for joining us! :)
13:17
poikilotherm
Feel free to request a review from me
13:17
Guest88985
Hi pdurbin
13:18
pdurbin
poikilotherm: sure, please review https://github.com/IQSS/dataverse-installations/pull/13
13:18
pdurbin
poikilotherm: and when you're done perhaps you can explain to Sharif ( Guest88985 ) the difference between dataverse-docker and dataverse-kubernetes. :)
13:19
poikilotherm
pdurbin: i have no idea if those fields are correct
13:19
Guest88985
pdurbin, thanks for replying to my questions regarding the curl command error
13:19
poikilotherm
do you have some kind of json schema?
13:19
poikilotherm
so this can be linted/validated
13:20
pdurbin
Hmm, nope, no JSON Schema yet.
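Until a JSON Schema exists, a small standard-library check could catch the most obvious mistakes in a pull request. The sketch below is an assumption-heavy illustration: only `launch_year` and `description` are mentioned in this conversation, while the `name` field, the required-field list, and the function name `validate_installation` are hypothetical and may not match the real data.json.

```python
import json

# Hypothetical required fields and types; the real file may differ.
REQUIRED_FIELDS = {"name": str, "description": str, "launch_year": str}

def validate_installation(entry):
    """Return a list of problems found in one installation entry."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in entry:
            problems.append(f"missing field: {field}")
        elif not isinstance(entry[field], expected_type):
            problems.append(f"{field} should be {expected_type.__name__}")
    return problems

# Placeholder entry for illustration only.
sample = json.loads(
    '{"name": "DataverseNO", "description": "demo", "launch_year": "2017"}'
)
print(validate_installation(sample))  # []
```

A check like this is far weaker than a real schema, but it can run in CI with no extra dependencies.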
13:20
Guest88985
pdurbin, how do I "ssh" into the Docker container?
13:20
pdurbin
Guest88985: good question. I always have to Google it. One sec.
13:21
Guest88985
Thank you
13:21
poikilotherm
No no no
13:21
poikilotherm
No SSH
13:21
poikilotherm
Please use "docker exec"
13:21
pdurbin
Yeah, that's the one. You might need to run `docker ps` first to get the name of the container. This might help: https://stackoverflow.com/questions/30172605/how-do-i-get-into-a-docker-containers-shell/30173220#30173220
13:22
poikilotherm
For convenience you can use docker-compose exec <service name> /bin/bash
13:22
poikilotherm
Thats faster than looking up the container from docker ps
13:23
poikilotherm
So this should be "docker-compose exec dataverse /bin/bash"
13:23
Guest88985
Thank you very much....let me try
13:23
pdurbin
poikilotherm: he's using dataverse-docker
13:23
poikilotherm
Obviously you need to be in the directory with your docker-compose file
13:23
poikilotherm
Yeah. dataverse-docker is using docker-compose
13:24
poikilotherm
In dataverse-kubernetes you'd be using kubectl instead ;-)
13:24
pdurbin
poikilotherm: do you know why he's seeing the error he posted in https://github.com/QualitativeDataRepository/dataverse-previewers/issues/22 ?
13:24
Guest88985
I get this error ERROR: Couldn't connect to Docker daemon at http+docker://localhost - is it running?
13:25
Guest88985
when I run docker-compose exec dataverse /bin/bash
13:25
poikilotherm
Huh? Oh so you are on Windose?
13:25
Guest88985
no
13:25
Guest88985
Ubuntu
13:25
poikilotherm
Ok, can you please check docker ps?
13:25
Guest88985
it is on Virtualbox
13:25
Guest88985
Host Windows 10
13:26
poikilotherm
Ok. Just wondering why compose is trying to reach docker via http, not the unix socket
13:27
poikilotherm
His error message sounds like a problem with the blocked api endpoint...
13:28
poikilotherm
For the container he might be seen as external (I don't have all details of dataverse-docker in my mind)
13:29
pdurbin
Yeah, me neither.
13:29
poikilotherm
pdurbin I really like the leaflet skin with that painted look :-D
13:30
pdurbin
me too
13:30
pdurbin
I was thinking about adding a dropdown so the user can change it though.
13:30
pdurbin
The new map has already appeared in some slides. I'm not sure if those slides are up yet, though.
13:31
poikilotherm
It would be pretty cool to have a BIG world map and smaller list
13:32
pdurbin
Yeah, it's all mashed into index.html right now.
13:32
poikilotherm
Maybe an overlay like Google maps is presenting
13:32
pdurbin
An overlay? The carousel at the bottom?
13:32
poikilotherm
Not necessarily
13:33
poikilotherm
Pretty sure you know the list of entries used by Google Maps when you search for something
13:33
poikilotherm
Like looking for a restaurant
13:33
poikilotherm
It gives you a nice collapsed list that narrows when zooming etc
13:34
kamil86 joined #dataverse
13:34
poikilotherm
And all that description text doesn't have to be readable right from the start, does it?
13:34
pdurbin
Oh, you want a dynamic list on the side that changes based on your zoom level and what part of the world you're looking at?
13:34
poikilotherm
But hey, you have the UX experts :-)
13:34
poikilotherm
At least I like the idea. Dunno if this works well
13:34
poikilotherm
Or is good UC
13:34
poikilotherm
UX
13:35
pdurbin
Works for restaurant listings, I guess.
13:35
poikilotherm
First of all a more or less full screen map would be awesome
13:35
Sharif_DE joined #dataverse
13:35
pdurbin
I've been thinking about how lots of projects could make use of a map like this.
13:36
Sharif_DE
pdurbin, i am back
13:36
Sharif_DE
I ran the command
13:36
Sharif_DE
sudo docker-compose exec dataverse /bin/bash
13:36
Sharif_DE
now i am in
13:36
pdurbin
Sharif_DE: getting disconnected is not your fault. You are probably suffering from https://github.com/IQSS/chat.dataverse.org/issues/3 :(
13:36
Sharif_DE
sudo docker-compose exec dataverse /bin/bash
13:37
Sharif_DE
[root@b6a03335f02f dv]#
13:37
kamil86
sorry to interrupt you, pdurbin could you give me some advice on the topic that was on http://irclog.iq.harvard.edu/dataverse/2019-10-15
13:37
Sharif_DE
should use curl command now?
13:37
pdurbin
Sharif_DE: now that you're in, does this work? curl http://localhost:8080/api/admin/externalTools
13:38
Sharif_DE
let me check
13:38
Sharif_DE
yes
13:38
Sharif_DE
it says OK
13:38
Sharif_DE
{"status":"OK","data":[]}
13:38
pdurbin
kamil86: hi! I would suggest calling Dataverse APIs in real time and letting us know if they are too slow. :)
13:39
pdurbin
Sharif_DE: perfect. Now please try adding a tool.
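The same check can be scripted. The sketch below parses a response shaped like the `{"status":"OK","data":[]}` output seen above; the helper names `parse_tools` and `fetch_tools` are made up for illustration, and the fetch only works from a place where the admin API on localhost:8080 is reachable (inside the container, in this case).

```python
import json
import urllib.request

TOOLS_URL = "http://localhost:8080/api/admin/externalTools"

def parse_tools(body):
    """Return the list of registered tools from an API response body."""
    payload = json.loads(body)
    if payload.get("status") != "OK":
        raise RuntimeError(f"unexpected status: {payload.get('status')}")
    return payload["data"]

def fetch_tools(url=TOOLS_URL):
    """Fetch and parse the tool list from a running installation."""
    with urllib.request.urlopen(url) as response:
        return parse_tools(response.read().decode("utf-8"))
```

An empty list from `fetch_tools()` means no external tools are registered yet, which matches the output above.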
13:39
Sharif_DE
great
13:39
Sharif_DE
Thanks pdurbin
13:39
Sharif_DE
let me try
13:39
stefankasberger
@pdurbin @donsizemore is anyone of you visiting europe for the EDDI at Tampere or European Dataverse Workshop at Tromso?
13:42
pdurbin
stefankasberger: I wasn't planning on it. It looks like Slava is giving a talk.
13:42
Sharif_DE
great @pdurbin.....now curl command runs
13:42
Sharif_DE
:)
13:42
Sharif_DE
Thank you sooooo much
13:43
Sharif_DE
Vielen Dank!
13:43
pdurbin
Sharif_DE: great! I would suggest opening an issue about the localhost error at https://github.com/IQSS/dataverse-docker/issues
13:44
pdurbin
stefankasberger: I think poikilotherm might go to one of them.
13:45
poikilotherm
Oi Sharif where are you from? /me working at Forschungszentrum Jülich...
13:45
kamil86
Thanks pdurbin, yes I'll write to him but wanted to firstly hear from you
13:45
pdurbin
stefankasberger: Geneviève Michaud is giving a presentation at EDDI and she tweets about Dataverse all the time. It would be nice to meet her some day.
13:45
poikilotherm
pdurbin: stefan and I might both give a talk in Tromso... ;-)
13:46
poikilotherm
stefankasberger & pdurbin: didn't phil mention that tanja schlater is going?
13:46
pdurbin
She is?
13:46
poikilotherm
let me check in the logs :-D
13:49
stefankasberger
Have heard yesterday, that John Crabtree is coming to EDDI, thats why I was asking about Don.
13:50
pdurbin
stefankasberger: ah, nice. I'm happy to travel to interesting places if people pay for my flights and hotel. :)
13:53
Sharif_DE
poikilotherm I am from Bonn
13:54
Sharif_DE
poikilotherm I have executed those command...but the files are not showing any difference
13:55
Sharif_DE
should I restart Dataverse?
13:57
pdurbin
Sharif_DE: what does http://localhost:8080/api/info/version show you?
14:01
stefankasberger
pdurbin: :) :) :) why not?!
14:12
Sharif_DE joined #dataverse
14:12
Sharif_DE
@pdurbin
14:12
Sharif_DE
i am back again
14:13
Sharif_DE
as I ran the command http://localhost:8085/api/info/version
14:13
Sharif_DE
please NOTE on port 8080 nothing comes out
14:13
Sharif_DE
but on port 8085
14:14
Sharif_DE
I got this info
14:14
Sharif_DE
version"4.15" build"221-9a0b627"
14:16
Sharif_DE
pdurbin, nothing changed on the file symbol though all the curl command got executed smoothly
14:32
pdurbin
Sharif_DE: ok. You might want to try some of these: https://github.com/QualitativeDataRepository/dataverse-previewers
14:34
Sharif_DE
Thanks pdurbin
14:34
Sharif_DE
I will try
14:35
Sharif_DE
pdurbin, i have executed all of them already
14:35
Sharif_DE
but nothing happened :(
14:38
pdurbin
Weird. Are you trying various files? PDFs? Images?
14:39
Sharif_DE
i have so far used csv, dta, spss, txt
14:40
Sharif_DE
let me check with img and pdf files
14:41
pdurbin
ok
14:42
poikilotherm
Sharif_DE: wow cool :-)
14:44
poikilotherm
pdurbin stefankasberger: I mixed up - it was Merce
14:44
poikilotherm
http://irclog.iq.harvard.edu/dataverse/2019-09-18#i_105800
14:44
poikilotherm
Sry for that
14:44
pdurbin
poikilotherm: right, she's giving a keynote
14:44
Sharif_DE
it is ok pdurbin
14:44
pdurbin
no worries
14:44
pdurbin
Sharif_DE: it works?
14:45
Sharif_DE
just tried with jpg and png file
14:45
Sharif_DE
don't work either :(
14:45
pdurbin
Sharif_DE: do you see an "Explore" button?
14:46
Sharif_DE
no
14:46
pdurbin
hmm
14:46
pdurbin
and http://localhost:8080/api/admin/externalTools shows all the tools you added?
14:47
Sharif_DE
pdurbin for txt data
14:47
Sharif_DE
yes
14:48
Sharif_DE
there is an explore button
14:48
pdurbin
Oh! Good!
14:48
Sharif_DE
also for jpg
14:48
Sharif_DE
but not for spss data
14:48
Sharif_DE
http://localhost:8080/api/admin/externalTools yes
14:49
Sharif_DE
NOOOOOOOOOOOOOO
14:49
pdurbin
Was your spss data successfully ingested?
14:49
Sharif_DE
when I enter it in the browser nothing comes up
14:50
Sharif_DE
yes, spss was uploaded without any error
14:50
Sharif_DE
when I enter http://localhost:8080/api/admin/externalTools in the browser
14:50
Sharif_DE
nothing comes up
14:51
Sharif_DE
I get "Unable to connect"
14:51
Sharif_DE
if I change the port http://localhost:8085/api/admin/externalTools
14:52
Sharif_DE
I get SyntaxError: JSON .parse: expected property name or '}' at line 1 column 3 of the JSON data
14:52
pdurbin
Sharif_DE: here's an example of a file that was successfully ingested: https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/TJCLKP/3VSTKY&version=3.0 . Please note how the Download button is a drop down and shows "Original File Format, RData Format" etc.
14:54
Sharif_DE
I see....in my case there are no options
14:54
Sharif_DE
:(
14:54
Sharif_DE
How do I correctly ingest data?
14:55
pdurbin
Well, is it possible to make that spss file public? Can you upload it to a GitHub issue?
14:56
Sharif_DE
sure
14:56
Sharif_DE
I can
14:56
pdurbin
Cool, we always appreciate real world files to test with.
15:00
Sharif_DE
Can I add the SPSS file right there?
15:00
pdurbin
If you can, please do.
15:00
pdurbin
If not, maybe you could put it on Dropbox or whatever.
15:00
Sharif_DE
ok, thanks
16:01
Slava41 joined #dataverse
16:14
poikilotherm
Slava has been here.
16:15
poikilotherm
Maybe we should create a graffiti
16:16
pdurbin
:)
16:16
poikilotherm
"It lasted 10 minutes"
16:17
poikilotherm
Solr gave me a hard time... :-/
16:17
pdurbin
Oh, that reminds me, I think I got an email that he messaged me on Skype.
16:21
poikilotherm
Heh. There's a nice idiom in German... "von hinten durch die Brust ins Auge", which is not very translatable. It's a relatively colloquial way of saying someone is doing sth. in a roundabout, complicated way.
16:21
poikilotherm
Mailing via Skype seems... complicated...
16:22
pdurbin
It's a reminder Skype sends you to remind you to log into Skype.
16:22
pdurbin
I guess I could turn it off.
16:22
poikilotherm
Ah! Ok that makes more sense...
16:22
poikilotherm
As I already had people send me SMS to my mobile via Skype, maybe they added emailing, too
16:38
andrewSC joined #dataverse
16:46
poikilotherm
https://github.com/IQSS/dataverse-kubernetes/releases/tag/v4.16
16:47
pdurbin
poikilotherm: nice
16:47
pdurbin
I saw that was in focus for you. Good job.
16:51
poikilotherm
Thx for commenting on https://github.com/IQSS/dataverse-kubernetes/issues/108
16:52
pdurbin
Oh, sure.
16:52
poikilotherm
I just posted a re - talking about you, stealing and personas ;-)
16:52
pdurbin
Yeah, I'll go give credit to Janet.
16:52
poikilotherm
:-D
16:52
poikilotherm
Yes please
16:55
nils`` joined #dataverse
17:01
pdurbin
done
17:01
poikilotherm
Aye :-)
17:02
poikilotherm
Have you seen https://github.com/IQSS/dataverse-kubernetes/milestone/2?
17:02
poikilotherm
Anything in there that could be postponed? Something you wish for?
17:04
pdurbin
Can we add https://github.com/IQSS/dataverse-kubernetes/issues/84 ?
17:04
poikilotherm
Oh. About that.
17:05
poikilotherm
Are you guys going to use Maven for those?
17:05
pdurbin
I'd like to. Right now we have a pile of scripts and docs.
17:06
pdurbin
I could definitely use some help if you're volunteering. :)
17:06
poikilotherm
I'm asking because I wonder if it makes sense to talk about using things like Testcontainers for integration tests. One could of course use a Kubernetes cluster, but using Docker containers for dependent containers might be much easier for your tests
17:06
poikilotherm
It will only require Docker on devs laptops to run those
17:07
pdurbin
I'd like the tests to be run in the cloud. When the pull request first comes in, at least.
17:08
poikilotherm
"The cloud" is a very big term
17:08
poikilotherm
Do you mean "on Jenkins CI"?
17:09
pdurbin
Well, right now https://jenkins.dataverse.org/job/IQSS-dataverse-develop/ spins up an EC2 instance to run the API test suite.
17:12
pdurbin
But we should back up and talk about how to get the IQSS devs to start using dataverse-kubernetes. Recently another dev (besides me) started using dataverse-ansible by way of dataverse-sample-data.
17:15
pdurbin
So the question is, can we add some easy instructions to the README to dataverse-sample-data to try using dataverse-kubernetes as an alternative to dataverse-ansible?
17:15
pdurbin
Does that make sense?
17:16
poikilotherm
Sure. What do you want users of dataverse-sample-data to do with dataverse-kubernetes?
17:17
poikilotherm
Just loading the data and tryout Dataverse?
17:17
poikilotherm
Do real dev stuff?
17:17
pdurbin
dataverse-sample-data is mostly for demos.
17:17
poikilotherm
Ok.
17:18
pdurbin
Of the latest release, or the develop branch, or some branch I'm working on.
17:18
pdurbin
(any branch anybody is working on)
17:18
poikilotherm
Right.
17:19
pdurbin
I'm just wondering what a good hook would be, if you will.
17:19
poikilotherm
So it would be a good first step to have that stuff loaded into k8s?
17:19
poikilotherm
Absolutely
17:19
pdurbin
Something to get people here excited about dataverse-kubernetes.
17:19
pdurbin
How many steps to get it running on AWS?
17:19
poikilotherm
I just added https://github.com/IQSS/dataverse-kubernetes/issues/66 to milestone 4.17
17:20
poikilotherm
Should we create a Docker container for dataverse-sample-data?
17:21
pdurbin
Well, this is the part of the README I'm thinking about: All of the steps above can be automated on a fresh installation of Dataverse on an EC2 instance on AWS by downloading ec2-create-instance.sh and ec2config.yaml and executing the script with the config file like this
17:21
pdurbin
I guess what I'm saying is... what if we had a script like ec2-create-instance.sh but for dataverse-kubernetes?
17:22
poikilotherm
Yeah.
17:22
poikilotherm
The first step would be to create a Docker container with this
17:23
poikilotherm
And ideally make dvconfig.py eat environment variables
17:23
poikilotherm
Because that's a lot easier with Docker and Kubernetes
17:23
poikilotherm
At least for base_url and api_token
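The idea of letting a config module read environment variables can be sketched as follows. The setting names `base_url` and `api_token` come from the discussion; the environment variable names `BASE_URL` and `API_TOKEN`, the default values, and the `load_config` function are assumptions for illustration and do not reflect the actual dvconfig.py.

```python
import os

# Placeholder defaults used when no environment variable is set.
DEFAULT_BASE_URL = "http://localhost:8080"
DEFAULT_API_TOKEN = ""

def load_config(environ=os.environ):
    """Read base_url and api_token, letting environment variables win."""
    return {
        "base_url": environ.get("BASE_URL", DEFAULT_BASE_URL),
        "api_token": environ.get("API_TOKEN", DEFAULT_API_TOKEN),
    }
```

With this shape, a container can be configured with something like `docker run -e BASE_URL=... -e API_TOKEN=...` and no file has to be edited inside the image.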
17:23
pdurbin
Hmm. What's the first step for https://github.com/IQSS/dataverse-kubernetes/issues/68 ?
17:24
pdurbin
(Maybe DCM is a better hook.)
17:24
poikilotherm
I have absolutely no idea :-D
17:24
poikilotherm
DCM is a bigger thing than sample-data
17:24
poikilotherm
Much bigger
17:25
pdurbin
yeah
17:25
poikilotherm
And sample data might be cool for Slava, too
17:25
poikilotherm
Demos with data are always a great thing ;-)
17:25
pdurbin
yeah
17:25
poikilotherm
I want to have our Dataverse in Demo mode for next week... International Open Access week
17:26
poikilotherm
So that would be cool benefit for us, too
17:27
pdurbin
ah, there are some events planned here as well
17:38
poikilotherm
https://github.com/poikilotherm/dataverse-sample-data/blob/dockerize/Dockerfile
17:38
poikilotherm
Here you go
17:38
poikilotherm
That's the basic Dockerfile
17:39
poikilotherm
Building takes just half a minute or so
17:40
poikilotherm
226MB size (that's mostly the sample data)
17:41
poikilotherm
~110 MB base image, ~100 MB sample data, rest scripts and deps
17:42
pdurbin
Cool but I'm not following why it should be Dockerized.
17:42
poikilotherm
Because if you want it on K8s, the easiest way to go is just roll out an image and let it bark at Dataverse
17:43
pdurbin
Oh, ok, so it becomes a little app.
17:43
poikilotherm
Setup K8s, deploy dataverse, let it bootstrap and then run the sample data job
17:43
poikilotherm
It's easy as baking :-D
17:43
poikilotherm
paddi cake paddi cake
17:44
pdurbin
:)
17:44
poikilotherm
Ah it Patty Cake, Patty Cake
17:44
poikilotherm
+'s
17:44
poikilotherm
I would provide a Jenkinsfile to build it each time you push to master and push it to Dockerhub
17:45
poikilotherm
Modify the dvconfig.py script to eat env vars
17:45
poikilotherm
(in addition)
17:45
poikilotherm
And you're done
17:45
poikilotherm
Usable in dataverse-docker, dataverse-kubernetes and even for devs
17:46
poikilotherm
Because you can simply use docker run
17:46
poikilotherm
No venv, no install, just a quick container :-D
17:47
pdurbin
:)
17:49
poikilotherm
Alright, gotta go now. 10h are full.
17:49
poikilotherm
Read you tomorrow
17:49
poikilotherm
Maybe later if I can't get enough at home :-D
17:49
poikilotherm
I'm curious ;-)
17:49
pdurbin
o/
18:19
donsizemore joined #dataverse
19:04
poikilotherm joined #dataverse
19:11
poikilotherm
pdurbin: still around?
19:12
pdurbin
yep, chatting with Slava on Skype
19:12
poikilotherm
Nice :-)
19:12
poikilotherm
Greetings
19:12
pdurbin
donsizemore: hi, there was a question of if you're heading overseas to all these fancy conferences
19:13
poikilotherm
I scrolled through the API docs, but couldn't find anything: any option to receive an api_token via API?
19:13
pdurbin
poikilotherm: "Oliver does fantastic job on Kubernetes and we have all building blocks now"
19:13
donsizemore
@pdurbin you know, i volunteered to take one for the team, but i'm holding down the fort here
19:13
poikilotherm
donsizemore: Sorry to hear that :-/
19:13
donsizemore
@pdurbin (and giving two demos of impact/TRSA)
19:14
poikilotherm
pdurbin: "Thanks" ;-)
19:14
donsizemore
@poikilotherm I have a 76 year-old widowed mother to care for, so it's for the best
19:14
pdurbin
poikilotherm: that's another hook. Being able to demo impact/TRSA.
19:14
poikilotherm
Right...
19:15
poikilotherm
So many things to do :-D
19:15
poikilotherm
I'll open an issue for this
19:15
pdurbin
Great!
19:17
poikilotherm
https://github.com/IQSS/dataverse-kubernetes/issues/109
19:17
poikilotherm
pdurbin: I scrolled through the API docs, but couldn't find anything: any option to receive an api_token via API?
19:18
pdurbin
I may have removed it from the docs because I hate it.
19:18
poikilotherm
Wuahahahaha
19:18
poikilotherm
Well, I need it to grab the key ;-)
19:19
pdurbin
poikilotherm: see getApiTokenUsingUsername at https://github.com/IQSS/dataverse/blob/v4.17/src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java#L1754
19:20
poikilotherm
Ok
19:20
poikilotherm
No option via admin API?
19:20
poikilotherm
That'd be easier, no fiddling with passwords...
19:20
pdurbin
Yeah, I removed it from the API Guide here: https://github.com/IQSS/dataverse/commit/fd743e284c2dc870aaac5377d9e613c3e61c7c29
19:21
pdurbin
See also http://guides.dataverse.org/en/4.17/installation/config.html#allowapitokenlookupviaapi
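For reference, the lookup discussed here appears to be a plain GET against the builtin-users endpoint with the password as a query parameter. The following is a minimal sketch that only builds that URL, assuming the path `/api/builtin-users/{username}/api-token` as used by the linked test helper and assuming the `:AllowApiTokenLookupViaApi` setting is enabled on the server. The function name and the sample values are hypothetical, and as noted in the chat this only applies to builtin users.

```python
from urllib.parse import quote, urlencode


def api_token_lookup_url(base_url, username, password):
    """Build the (assumed) builtin-user token lookup URL; no request is sent."""
    # Percent-encode the username so it is safe as a single path segment.
    path = f"/api/builtin-users/{quote(username, safe='')}/api-token"
    # urlencode takes care of escaping the password in the query string.
    query = urlencode({"password": password})
    return f"{base_url.rstrip('/')}{path}?{query}"


# Hypothetical values, for illustration only.
print(api_token_lookup_url("http://localhost:8080/", "dataverseAdmin", "s3cret pw"))
```

Putting a password in a query string is part of why this lookup is disliked; an admin endpoint as proposed in the conversation would avoid it.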
19:22
poikilotherm
Ok.
19:22
poikilotherm
Meh.
19:22
poikilotherm
Any chance we can create an admin API endpoint to load the data?
19:22
pdurbin
It was only for builtin users. Shib and OAuth users are left in the cold.
19:22
poikilotherm
That way we could simply skip all that shitty retrieval, which is bad anyway
19:22
pdurbin
I'd say go ahead and create an issue. Make your case.
19:26
donsizemore
@pdurbin mr phil sir.
19:27
poikilotherm
https://github.com/IQSS/dataverse/issues/6286
19:27
pdurbin
donsizemore: mr don my liege, what news?
19:28
poikilotherm
pdurbin: done. see above ;-)
19:28
donsizemore
API question. I can list contents of a dataverse, which includes the protocol, authority, identifier, and persistentUrl of each dataset. (i can construct what i need from that)
19:28
pdurbin
poikilotherm: thanks! This is related: https://github.com/IQSS/dataverse-ansible/issues/80
19:29
donsizemore
@pdurbin but i'm trying to help our newest archivist and if we could get the API to spit out the persistentIds owned by the dataverse, we could do what she needs in 3 lines of code.
19:30
donsizemore
@pdurbin I ask because the contents API spits out the persistentUrl but every other endpoint wants the persistentId
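The "construct what I need" workaround can be sketched as below. This assumes each dataset entry in the contents listing carries `protocol`, `authority`, and `identifier` fields (as described above) and that a persistent ID has the form `protocol:authority/identifier`. The sample data and function names are hypothetical, and no request is made to a real server.

```python
def persistent_id(item):
    """Join protocol, authority and identifier into 'protocol:authority/identifier'."""
    return f"{item['protocol']}:{item['authority']}/{item['identifier']}"


def dataset_pids(contents):
    """Return persistent IDs for the dataset entries of a contents listing."""
    return [persistent_id(i) for i in contents if i.get("type") == "dataset"]


# Hypothetical sample shaped like the "data" array of the contents response.
sample = [
    {"type": "dataverse", "id": 5, "title": "Sub-dataverse"},
    {
        "type": "dataset",
        "id": 42,
        "protocol": "doi",
        "authority": "10.5072",
        "identifier": "FK2/ABC123",
        "persistentUrl": "https://doi.org/10.5072/FK2/ABC123",
    },
]

print(dataset_pids(sample))
```

Filtering on `type` skips sub-dataverses, which matches the "direct children only" behaviour mentioned later in the conversation.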
19:30
pdurbin
donsizemore: will the Search API help? The Search API is limited to published data.
19:31
poikilotherm
Linked
19:32
pdurbin
cool
19:33
poikilotherm
Shall I open a PR in dataverse-sample-data for what I have so far?
19:33
pdurbin
The Dockerfile?
19:33
donsizemore
@pdurbin we can work around it, but is it worth an issue requesting that persistentId be added to http://guides.dataverse.org/en/latest/api/native-api.html#show-contents-of-a-dataverse ? 'cause that's what Dataverse wants everywhere else. I may be able to conscript a certain Java programmer into this in the near future
19:34
poikilotherm
I also added some getenv() calls in dvconfig.py.sample
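A rough idea of what environment-driven settings in a config module could look like, as a sketch only. The variable names `DV_BASE_URL` and `DV_API_TOKEN`, the keys, and the defaults are all assumptions for illustration and are not taken from the actual dvconfig.py.sample.

```python
import os


def load_config(env=os.environ):
    """Read settings from environment variables, falling back to defaults."""
    return {
        # Hypothetical variable names; adjust to whatever the script expects.
        "base_url": env.get("DV_BASE_URL", "http://localhost:8080"),
        "api_token": env.get("DV_API_TOKEN", ""),
    }


# Passing a plain dict instead of os.environ makes the behaviour easy to check.
print(load_config({"DV_BASE_URL": "http://dataverse:8080"}))
```

With something like this in place, a container could be configured through `docker run -e DV_BASE_URL=...` without editing the file.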
19:35
poikilotherm
donsizemore: I see "you must specify the “alias” of a dataverse or its database id" there
19:35
pdurbin
donsizemore: another thought is to try http://guides.dataverse.org/en/4.17/api/sword.html#list-datasets-in-a-dataverse but it only shows direct children
19:36
poikilotherm
But the other API calls use the same endpoint.
19:36
poikilotherm
Shall I have a look at the code to see if this is true for all the other calls, too
19:36
poikilotherm
?
19:36
donsizemore
@pdurbin I just realized contents spits out "id" — long day ;)
19:36
pdurbin
I guess "contents" from the native API also only shows direct children. Trees are hard. :)
19:37
pdurbin
donsizemore: SWORD spits out DOIs
19:37
donsizemore
@pdurbin it does, but it wanted user proxy authentication -- it didn't like my API token
19:38
pdurbin
Yeah, "contents" is much more open, for published stuff anyway: https://dataverse.harvard.edu/api/dataverses/open-source-at-harvard/contents
19:39
pdurbin
Are you ok with direct children only or do you need all the chillins?
19:39
poikilotherm
Haroo! I finally got sandbox credentials from ORCID
19:39
pdurbin
poikilotherm: nice
19:42
donsizemore
@pdurbin I think they're all published and they're definitely all direct children of one dataverse. we're good, I was just grousing around for persistentId
19:43
pdurbin
ah, ok, yeah, that seems to be the field you want