
IRC log for #dataverse, 2020-02-14

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
00:00 jri joined #dataverse
00:58 jri joined #dataverse
01:16 pdurbin poikilotherm: yeah, good idea. I just left a comment at https://github.com/IQSS/dataverse/issues/6633#issuecomment-586049095
02:59 jri joined #dataverse
05:33 jri joined #dataverse
07:34 jri joined #dataverse
07:45 jri joined #dataverse
08:42 Benjamin_Peuch joined #dataverse
08:43 jri joined #dataverse
08:44 Benjamin_Peuch Hello everybody!
09:32 GitterIntegratio joined #dataverse
09:51 jri_ joined #dataverse
10:12 icarito[m] joined #dataverse
10:12 poikilotherm joined #dataverse
11:10 poikilotherm Hi Benjamin_Peuch
11:10 poikilotherm I don't think America is with us yet :-D
11:26 Benjamin_Peuch Yeah they're still asleep
11:26 Benjamin_Peuch Quick! Let's steal stuff
11:27 Benjamin_Peuch You are not alphabetically indexed in the Shout user list, poikilotherm
11:27 Benjamin_Peuch My OCD is not pleased
11:35 Benjamin_Peuch Ah, there you go
11:36 pdurbin coffee first
11:37 Benjamin_Peuch Hahaha
11:37 Benjamin_Peuch Get the java
11:38 pdurbin so much Java
11:45 pdurbin I'm looking at this message from Dorothea but I'm not sure how to help: https://groups.google.com/d/msg/dataverse-community/m7hHmnj0CwI/Aw7j2tB-BgAJ
11:52 pdurbin In a little over 3 hours there will be a webinar about the Data Curation Tool: https://groups.google.com/d/msg/dataverse-community/KZ0Sgz12a2o/e6L8tJW5AQAJ
11:55 pdurbin Pretty exciting to see a new translation of Dataverse: https://github.com/GlobalDataverseCommunityConsortium/dataverse-language-packs/pull/57
11:57 juancorr joined #dataverse
12:14 poikilotherm What I don't understand - is that Weblate stuff coming or not? Looks like they are handling things manually for now...?
12:24 poikilotherm Uh oh... Looks like sth (me?) crashes Jenkins... Jobs fail with OOM
12:27 pdurbin Which job?
12:28 pdurbin poikilotherm: I think weblate is blocked on getting a valid cert on the box. I have the ssh key but it's running in Docker and I'm confused. I'd be happy to send you the key if you'd like to take a look.
12:29 poikilotherm Where is it running?
12:30 poikilotherm On IQSS AWS?
12:30 pdurbin Some other AWS I think. GDCC AWS?
12:32 poikilotherm Oh. OK
12:33 poikilotherm You shouldn't put a cert into the container. You should put it on a volume, mount that and load the cert from there
12:33 pdurbin That's way beyond me.
12:35 poikilotherm Is there any kind of config management for this?
12:35 poikilotherm You could also just follow this guide: https://docs.weblate.org/en/latest/admin/install/docker.html#docker-container-with-https-support
12:36 pdurbin I have no idea. My understanding is that Don spun up the box and Slava configured it.
12:37 pdurbin One of them (I forget) generated a CSR and I minted them a cert.
12:46 pdurbin Anyway, that's my understanding of why things are still manual right now.
12:55 poikilotherm OK
12:55 poikilotherm Maybe next week I have some time to help
12:56 poikilotherm Currently I will tag v4.19 in dataverse-kubernetes and hopefully have a full working deployment ready
12:57 pdurbin Nice! Oh, there's a weird bug in 4.19. Cosmetic. Do you have 4.19 running? Check this out: https://github.com/IQSS/dataverse/issues/6641
12:59 poikilotherm OH ok
12:59 poikilotherm Yeah, I just finished testing and other things that I needed for 4.19
13:00 poikilotherm Now I can deploy 4.19 with OIDC support (necessary for us) and nice autocomplete fields (spamming our logs)
13:00 pdurbin logs being spammed is unfortunate
13:00 pdurbin Is there an issue for that?
13:01 poikilotherm Aye. Lemme go fishing for that issue...
13:02 poikilotherm There you go https://github.com/IQSS/dataverse/issues/6360
13:03 pdurbin Thanks, I added "Type: Bug" to it, at least.
13:06 poikilotherm Yeah, label usage is not very common at IQSS, is it?
13:07 pdurbin Once upon a time we had a plan.
13:07 donsizemore joined #dataverse
13:08 pdurbin Here was the plan: https://github.com/IQSS/redmine2github/tree/57d50bcf10952bcdd87174ac208e8dfebccec8ae/scripts/label_updates#label-namecolor-table
13:10 pdurbin See also https://github.com/IQSS/dataverse/issues/2594#issuecomment-182929841
13:10 pdurbin But all that is very old and has changed.
13:10 donsizemore @pdurbin i respectfully request http://mkweb.bcgsc.ca/colorblind/
13:11 pdurbin donsizemore: good idea
13:12 donsizemore @pdurbin kasha always chooses attractive color schemes here, and there are always two that appear the same to me
13:12 pdurbin donsizemore: hey do you know the status of the weblate cert. poikilotherm had ideas. ^^ mounting volumes, fancy stuff
13:12 donsizemore @pdurbin it's on my list but i haven't looked at it. i honestly haven't done much more with that VM than stand it up
13:13 pdurbin Right. And all I've done is ssh in and ask why Apache isn't installed.
13:14 donsizemore @pdurbin i thought everything was in docker?
13:16 pkiraly joined #dataverse
13:18 pdurbin Right. I have been led to believe that some sort of web server is installed within a Docker image running on the machine.
13:20 poikilotherm pdurbin: the docs read as if Slava uses docker-compose, so there will be a container running the reverse proxy
13:21 pdurbin ok
13:22 pdurbin pkiraly: hi. I have a half-written blog post to share with you
13:27 pdurbin poikilotherm: he does seem to be a fan of docker-compose
13:28 poikilotherm pdurbin: yeah. unfortunately
13:28 poikilotherm docker-compose is great for local dev etc
13:28 poikilotherm But not for running your things in production
13:28 pdurbin we use docker-compose for docker-dcm and it seems to work fine but yeah, that's dev stuff
13:36 pkiraly pdurbin: please send me the URL
13:37 pkiraly poikilotherm: my problem with Slava's solution is that I cannot restart it, only rebuild it. If there were a way to restart it, I'd love it
13:39 pdurbin pkiraly: when I'm done. :) The short version is that I attended Lilly Winfree's talk at FOSDEM a week or so ago. This week she called in to the Dataverse community call. She's the product owner for Frictionless Data. She replied to us here back in June: https://github.com/IQSS/dataverse/issues/4747#issuecomment-499982746
13:41 pkiraly great! I am a big fan of the Frictionless Data concept. Coincidentally I checked their website, and found that they have moved forward since I last checked.
13:42 pdurbin Yeah, they seem busy. From the notes: "Phil) Maybe an external tool that allows people to examine a potentially unclean file in Goodtables, fix it up, and redeposit it into Dataverse." https://groups.google.com/d/msg/dataverse-community/jm9hMvJTChU/vYJRWtyvAwAJ
13:42 pkiraly I mean, coincidentally, I checked it yesterday. We have a paper reading group, and I suggested the paper to my colleagues
13:43 poikilotherm pkiraly: you can't restart the container? I'm puzzled, because that should be possible with docker-compose AFAIK.
13:48 pkiraly pdurbin: as far as I understand you can start it with "docker-compose up", but it does not simply start, it also does lots of other things, including installation
13:48 pkiraly poikilotherm: sorry, it was an answer to you
13:52 pdurbin pkiraly: have you played with Goodtables? Do you think it would make a nice external tool for Dataverse?
13:54 pkiraly pdurbin: I am not familiar with it. I can check and report next week.
13:55 poikilotherm Yeah, that is normal with containers. Most of the time the applications bootstrap when you fire them up for the first time.
13:56 poikilotherm As the state is held in volumes, the applications check on start if they are already setup
13:56 poikilotherm If so, they just start the services
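The start-up behaviour poikilotherm describes (bootstrap on first run, then just start the services once the volume already holds state) can be sketched roughly as below. This is a minimal illustration and not how the Weblate or Dataverse images actually do it: the marker file name `.initialized` and the `start` function are hypothetical, and a temporary directory stands in for the mounted volume.

```python
import tempfile
from pathlib import Path

MARKER_NAME = ".initialized"  # hypothetical marker file kept on the volume


def start(volume_dir: Path) -> str:
    """Bootstrap on the first run, otherwise only start the services."""
    marker = volume_dir / MARKER_NAME
    if marker.exists():
        # State from an earlier run is present on the volume: skip installation.
        return "already set up: starting services only"
    # First run: the one-time installation would happen here, then we record it.
    marker.write_text("done\n")
    return "first run: installed, then started services"


if __name__ == "__main__":
    # A temporary directory plays the role of the persistent volume.
    with tempfile.TemporaryDirectory() as tmp:
        print(start(Path(tmp)))  # first start bootstraps
        print(start(Path(tmp)))  # a restart finds the marker and skips it
```

If the volume is not persisted between runs, the marker is lost and every start looks like a first run, which would be one possible explanation for the reinstall-on-every-start behaviour pkiraly reports.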
13:58 pkiraly pdurbin: there is another thing I think would be useful in Dataverse: Karen Coyle, Tom Baker: "Design for simple application profiles" slides: http://swib.org/swib19/slides/04_coyle_design-for-simple-application.pdf, video: https://www.youtube.com/watch?v=5LRrlzvVNes&feature=youtu.be.
13:58 pkiraly They suggest a simple way to create "application profiles", which is very similar to the "custom metadata block" in Dataverse
14:00 pkiraly poikilotherm: could you suggest commands with which I can stop and restart all containers within the Dataverse docker realm (ones that do not install anything in the restart process, just start the containers)?
14:03 poikilotherm https://docs.weblate.org/en/latest/admin/install/docker.html#upgrading-the-docker-container
14:03 poikilotherm That should do the trick
14:04 poikilotherm Please test in an ephemeral context first
14:07 pdurbin pkiraly: thanks, do you know about the Data Curation Tool webinar in an hour? http://irclog.iq.harvard.edu/dataverse/2020-02-14#i_118713 . You might like it since you're so into dataset quality.
14:13 poikilotherm pdurbin: quick question?
14:13 pkiraly pdurbin: thanks!
14:14 pdurbin poikilotherm: hit me
14:14 poikilotherm Looking at http://guides.dataverse.org/en/latest/installation/config.html#id148
14:14 pkiraly poikilotherm: thanks! I am checking it
14:14 poikilotherm Unfortunately, in most cases, the text file will probably be too big to upload (>1024 characters) due to a bug
14:14 poikilotherm After some digging I found https://github.com/IQSS/dataverse/pull/5269/files
14:15 poikilotherm So is this still a problem or are the docs wrong?
14:16 pdurbin Good question. I see https://github.com/IQSS/dataverse/issues/4652 was closed.
14:17 poikilotherm I see https://github.com/IQSS/dataverse/issues/4652#issuecomment-435466678
14:17 pdurbin poikilotherm: my guess is that the docs are now wrong. If you have time, please test it and open a pull request to remove that line if you think it's no longer accurate.
14:18 poikilotherm Should I try and open an issue if it works, so docs get corrected?
14:18 pdurbin sure, an issue at least please
14:18 poikilotherm Aye Aye Sir :-)
14:18 pdurbin a pull request should be pretty easy... just deleting a line from the docs, right?
14:18 poikilotherm Well it's not a matter of time, but necessity - we need it :-D
14:19 poikilotherm Yeah ;-)
14:19 pdurbin sure, Venki too, lots of people :)
14:19 pdurbin pkiraly: while you wait for the webinar, you could watch this: https://fosdem.org/2020/schedule/event/open_research_frictionless_data/ :)
14:22 pdurbin The best slide from that talk: https://i.imgur.com/4ymFFaU.png
14:25 pkiraly +1
14:31 poikilotherm pdurbin: looking good. Works perfectly
14:32 pkiraly poikilotherm: docker-compose pull did not solve it. When I issue "docker-compose up" it starts by installing everything
14:37 pdurbin poikilotherm: great news! Thanks for testing!
14:48 poikilotherm I opened an issue
14:48 poikilotherm But gotta run now
14:48 poikilotherm Have a great weekend @all
14:49 pdurbin thanks! you too!
14:50 Benjamin_Peuch Enjoy, poikilotherm!
15:03 pdurbin The Data Curation Tool webinar just started: https://ocul.zoom.us/j/183591365
15:04 pdurbin I guess I'll take some notes here, if no one minds. :)
15:04 pdurbin built with Angular 7, Angular Material Theme
15:05 pdurbin 44 participants on the call, awesome
15:10 pdurbin 51 participants. demo time
15:15 pdurbin pkiraly: did you see my question? This: Can you please demo weighting or at least explain the use case or user story? I'm not very familiar with weighting.
15:20 pdurbin A good question: what permissions are required to use the Data Curation Tool.
15:22 pkiraly pdurbin: I saw your question
15:24 pdurbin donsizemore's boss just answered it :)
15:28 donsizemore @pdurbin sorry, i've been helping a workstudy here. but i am hearing jon's end of the conversation =)
15:30 pdurbin :)
15:40 pkiraly pdurbin: an afterthought - if the statistical calculation happens on the client side, the size of the file (or its representation) cannot be larger than the RAM of the client
15:40 pdurbin pkiraly: the stats are calculated in Dataverse. They're in the DDI.
15:41 pkiraly pdurbin: When?
15:41 pdurbin donsizemore: I just emailed Jon a link to https://www.stata.com/help.cgi?dta_119 because without detailed docs like this, it's hard to support proprietary formats (like SAS).
15:41 pdurbin pkiraly: it's part of ingest
15:42 pdurbin pkiraly: I believe they go in this table: http://phoenix.dataverse.org/schemaspy/latest/tables/summarystatistic.html
15:42 pkiraly pdurbin: is it part of the standard Dataverse? So if I upload a CSV, all these statistics are calculated automatically?
15:43 pdurbin yes, if the CSV is successfully ingested
15:44 pkiraly pdurbin: oh, great! I did not know about this feature.
15:45 pdurbin It looks like this in the HTML Codebook: Summary Statistics: StDev 58.5678744130351; Max. 385.0; Mean 21.463636363636343; Valid 110.0; Min. 0.0;
15:46 pdurbin if you go to https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/TJCLKP/3VSTKY&version=3.0 and click "Metadata" and then "Export Metadata" and then "DDI HTML Codebook"
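For readers unfamiliar with the fields in the codebook line quoted above, here is a rough sketch of how such per-variable summary statistics could be computed. It is not Dataverse's ingest code: the `summarize` function is hypothetical, missing values are assumed to be represented as `None`, and the sample standard deviation is used without knowing which definition Dataverse applies.

```python
import statistics


def summarize(values):
    """Return codebook-style summary statistics for one numeric variable."""
    valid = [v for v in values if v is not None]  # drop missing observations
    return {
        "Valid": float(len(valid)),           # number of non-missing values
        "Min.": min(valid),
        "Max.": max(valid),
        "Mean": statistics.mean(valid),
        "StDev": statistics.stdev(valid),     # sample stdev (an assumption)
    }


if __name__ == "__main__":
    stats = summarize([0.0, 10.0, None, 20.0])
    print("; ".join(f"{name} {value}" for name, value in stats.items()))
```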
15:48 pkiraly wonderful, I never clicked on that ;-(
15:49 pdurbin I'm thinking about giving a talk called "Hidden Features of Dataverse" :)
15:50 pkiraly pdurbin: count me as audience. One question: what does it mean: "Notes: UNF:6:RzKNHLnl6nX8wZ6r2lb7mg=="?
15:52 pdurbin UNF is a whole thing: http://guides.dataverse.org/en/4.19/developers/unf/index.html
15:54 pdurbin It's basically a checksum for tabular data.
15:55 pdurbin The idea is that it shouldn't matter if you upload a CSV or TSV. The UNF is the same even though the MD5s of those two files are (obviously) different.
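The idea of a format-independent fingerprint can be illustrated with a toy example. To be clear, this is not the UNF algorithm, which has its own normalization rules described in the guide linked above; the `content_fingerprint` function and the sample data are made up for illustration. The sketch only shows why hashing the parsed values, rather than the raw bytes, gives the same result for a CSV and a TSV with identical content.

```python
import csv
import hashlib
import io


def content_fingerprint(text: str, delimiter: str) -> str:
    """Hash the parsed cell values instead of the raw file bytes."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Canonical form: cells joined by NUL, rows joined by newline,
    # so the original delimiter no longer influences the hash.
    canonical = "\n".join("\x00".join(cell.strip() for cell in row) for row in rows)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


csv_text = "id,score\n1,0.5\n2,385.0\n"
tsv_text = "id\tscore\n1\t0.5\n2\t385.0\n"

# Checksums of the raw bytes differ because the delimiters differ.
md5_csv = hashlib.md5(csv_text.encode("utf-8")).hexdigest()
md5_tsv = hashlib.md5(tsv_text.encode("utf-8")).hexdigest()

if __name__ == "__main__":
    print("MD5 equal:", md5_csv == md5_tsv)
    print("content fingerprint equal:",
          content_fingerprint(csv_text, ",") == content_fingerprint(tsv_text, "\t"))
```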
15:58 pkiraly many thanks!
15:58 pdurbin sure, thanks for watching
15:58 pdurbin the video should be available
16:28 Benjamin_Peuch Have a nice weekend, everyone!
17:12 pdurbin donsizemore: for what it's worth, I'm having trouble with https://github.com/GlobalDataverseCommunityConsortium/dataverse-test-suite too.
18:43 donsizemore @pdurbin eh, I just followed the instructions.
18:48 pdurbin Tests:       10 failed, 1 passed, 11 total
18:48 pdurbin ✓ Logging in (5828ms)
18:48 pdurbin ✕ Basic search (17044ms)
18:48 pdurbin ✕ Using facets (17801ms)
18:48 pdurbin etc
18:49 pdurbin Slava's going to take another look.
18:49 pdurbin I wonder if it's possible to pick and choose which tests to run.
18:49 pdurbin If we can get just "logging in" to pass from Jenkins, that would be a win.
18:49 pdurbin And then we could troubleshoot the other ones down the road.
18:50 pdurbin Anyway, Slava really really wants us to try his stuff so I tried it. :)
18:53 donsizemore @pdurbin i merged your solr PR, re-running in vagrant and jenkins...
18:56 pdurbin I saw! Untested by me! Sorry! :)
18:57 pdurbin Kevin is out but maybe I'll push for merging https://github.com/IQSS/dataverse/pull/6651 so that we can see the test suite passes.
19:00 pdurbin merged
19:04 pdurbin ah maybe you already kicked off a build
19:04 pdurbin I see two running
19:04 pdurbin let's hope they both pass :)
19:18 pdurbin I'm looking at https://jenkins.dataverse.org/job/IQSS-dataverse-develop/349/consoleFull but I'm not sure why it failed.
19:23 donsizemore @pdurbin building the warfile failed
19:23 pdurbin hmm! that means I should go look at travis
19:25 pdurbin looks like building the war file just succeeded in job 350
19:25 * pdurbin munches popcorn
19:32 pdurbin Tests run: 116, Failures: 1, Errors: 0, Skipped: 4
19:32 pdurbin not bad!
19:32 pdurbin [ERROR]   FilesIT.test_008_ReplaceFileAlreadyDeleted:821 Expected status code <200> doesn't match actual status code <500>."
19:36 pdurbin huh, that 500 was from publishDatasetViaNativeApi
19:41 donsizemore API test suite is falling over in EC2 nowadays. lemme bump the instance size in group_vars on jenkins
19:41 donsizemore (the API test suite fails on my buildbox now as well; I ordered a new SSD to grease the wheels)
19:41 pdurbin if you want
19:41 pdurbin I'm seeing 39 of these in server.log which doesn't sound good: Internal Exception: org.postgresql.util.PSQLException: ERROR: deadlock detected
19:42 donsizemore yup
19:43 donsizemore i got them consistently in Docker way back when; we moved the testing stuff into EC2 to work around it
19:43 donsizemore re-running now in t2.xlarge
19:44 donsizemore @pdurbin you're supposed to be in Key West!
19:44 pdurbin I know I know the family just got home from school. Getting louder. :)
19:49 pdurbin The plan is to leave for the airport in two hours so maybe I should say goodbye now or at least soon.
19:49 donsizemore hey, Key West has wifi ;)
19:50 donsizemore and don't look now, but build #351 is chugging along nicely
19:50 pdurbin hey, for me it's a looooong weekend so please don't expect me back until monday feb 24
19:50 dataverse-user joined #dataverse
19:51 dataverse-user Can you restrict access to specific datasets in dataverse? In other words, provide access to some but not others
19:52 pdurbin dataverse-user: yes
20:04 pdurbin [WARNING] Tests run: 1121, Failures: 0, Errors: 0, Skipped: 6
20:05 pdurbin seems like bumping up to xlarge may have helped? but it's still yellow?
20:05 pdurbin anyway, I should head out
20:05 pdurbin again, I'll be back on mon feb 24... have a nice weekend everyone!
20:05 pdurbin left #dataverse
20:17 donsizemore @pdurbin API test suite still falling over in a t2.xlarge(!) [ERROR] Failures: ", "[ERROR]   MoveIT.testMoveLinkedDataset:240 Expected status code <200> doesn't match actual status code <500>.", "", "[INFO] ", "[ERROR] Tests run: 116, Failures: 1, Errors: 0, Skipped: 4",
20:23 poikilotherm pdurbin have nice vacation! Lucky you!
