
IRC log for #dataverse, 2020-02-05

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
07:41 juancorr joined #dataverse
08:01 jri joined #dataverse
10:13 Benjamin_Peuch joined #dataverse
10:59 pdurbin andrewSC Benjamin_Peuch bjonnh bricas jri juancorr pmauduit poikilotherm: if we got some Google Summer of Code students, what projects should we have them work on? Jim is asking for ideas by noon today: https://groups.google.com/d/msg/dataverse-community/WhbkdML6Jbs/fF2XCVHvEgAJ
11:09 pdurbin Six hours from now. :)
11:19 Benjamin_Peuch Issue 1753, of course!
11:19 Benjamin_Peuch https://github.com/IQSS/dataverse/issues/1753
11:20 Benjamin_Peuch (Even though I doubt Google will find that compelling)
11:21 Benjamin_Peuch Just watched your FOSDEM talk, pdurbin. Great presentation. Very well structured.
11:30 pdurbin Thanks! Much appreciated! Also, I just left a comment on 1753. :)
11:32 pdurbin My first thought for a GSOC student is browser-based automated tests using Cypress or Selenium.
11:38 pdurbin I don't think Google cares much about what we ask students to work on. :) From their perspective, all of it is probably in the weeds. :)
11:45 andrewSC GSoC is big this year
11:45 andrewSC Arch Linux is submitting as well
11:45 andrewSC I'm not sure tbqh
11:46 poikilotherm I forwarded it to my boss
11:47 pdurbin Bigger than usual?
11:53 andrewSC pdurbin: just interesting to see projects in my sphere talking about it when I don't recall it being mentioned in years prior?
11:54 andrewSC Mozilla iirc usually has something going on every year with GSoC too but that's a whole other thing lol
11:59 pdurbin Yeah, I think this is the first year we've considered it. And it was news to me, actually, when that email went out yesterday.
12:00 pdurbin I just created this "2020-02-05 Dataverse community board aggregation" spreadsheet: https://docs.google.com/spreadsheets/d/1ZpgtePPSkHxACYsW_60Ur2aQg2Q2qrJwWAHcVdKBztI/edit?usp=sharing
12:00 pdurbin Benjamin_Peuch poikilotherm: this is just an updated version of the one I showed you on Friday
12:27 Benjamin_Peuch pdurbin: I thought it looked familiar.
12:28 Benjamin_Peuch Thanks for the comment on 1753! I replied.
12:28 Benjamin_Peuch (Imma start learning issue IDs by heart, like poikilotherm.)
12:29 pdurbin Heh.
12:29 pdurbin Nice comment.
12:30 pdurbin I just posted all the issues that have 2 or 3 "votes" from an installation of Dataverse: https://groups.google.com/d/msg/dataverse-community/WhbkdML6Jbs/VeDdK5UfEwAJ
12:38 Benjamin_Peuch I appreciate that, especially since I try to vote every now and then.
12:38 Benjamin_Peuch But people don't do it much in general, it seems to me.
12:38 pdurbin Well, it's new.
12:39 pdurbin And I haven't been promoting these reports very much. And I haven't been making them regularly. Maybe poikilotherm can help me add them to Jenkins so they run nightly. :)
12:40 Benjamin_Peuch Let's see, Google, "how to make voting bot github"...
12:41 pdurbin Oh, have you not met https://github.com/dataversebot ? :)
12:54 Benjamin_Peuch A robot with a gender!
12:55 Benjamin_Peuch It's a bot account, like coveralls?
12:55 pdurbin poikilotherm: maybe it should be "Dr."
12:57 pdurbin Yeah, poikilotherm knows the details.
13:00 Benjamin_Peuch I second "Dr." Or "Master". Or "Supreme Intergalactic Ruler".
13:19 poikilotherm pdurbin Benjamin_Peuch : I'm back...
13:19 poikilotherm pdurbin: ad reports: sure. It's easy. Just setup a triggered job. LOL
13:19 poikilotherm pdurbin Benjamin_Peuch if you feel like it should go without gender, sure.
13:19 poikilotherm I was citing the series "Mr. Robot" ;-)
13:20 Benjamin_Peuch Domo arigato Mr Roboto!
13:20 Benjamin_Peuch I thought I'd be the local SJW.
13:23 poikilotherm Done. "Dataverse Robot Kitten"
13:23 poikilotherm Everyone loves Kittens
13:23 poikilotherm Maybe I should change the avater
13:23 Benjamin_Peuch <3
13:23 poikilotherm s/e/a/
13:28 jri joined #dataverse
13:32 jri_ joined #dataverse
13:38 pdurbin Here's the kitten my daughter and I drew: http://blog.greptilian.com/2020/01/03/learning-inkscape/
13:44 Benjamin_Peuch Awww
13:49 donsizemore joined #dataverse
13:51 donsizemore @pdurbin knock knock?
13:51 pdurbin donsizemore: talk to me
13:51 donsizemore (if IRC used avatars I'd need to make mine Beverly Archer at this point)
13:52 donsizemore if Thu-Mai uploaded >200GB of data yesterday, and we have two ~20GB files left to go, but the web proxy times out on gigabit switches even when i bump the apache timeout to 20 minutes
13:53 donsizemore am I safe having her create a 0-byte dummy file of the same name, which i can then manually overwrite, or should i upload the files and call curl directly to upload the files via glassfish?
13:53 donsizemore Matthew isn't in yet so I can't pester him
13:54 donsizemore helping her upload and ingest this particular dataset is part of why i couldn't help with dataverse-installations yesterday
13:55 pdurbin We use the zero byte trick from time to time. I just mentioned it to Peter yesterday actually. Don't forget to update the md5 in the database.
13:55 pdurbin And no worries about dataverse-installations. It's all up to date now, thanks to Sherry.
14:01 donsizemore @pdurbin i was looking at the datafileintegrity endpoint but it's looking like i'll need to update manually
14:33 pdurbin maybe you could run that endpoint before and after to see if it still works :)
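The "update the md5 in the database" step above needs the checksum of the real file that replaces the zero-byte placeholder. A minimal sketch of computing it for a file of roughly 20GB without loading it into memory, assuming Python's standard `hashlib`; the function name `file_md5` and the chunk size are illustrative and not part of Dataverse:

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in fixed-size chunks so that a multi-gigabyte
    file never has to fit in memory. `file_md5` is a hypothetical
    helper name, not a Dataverse function.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A zero-byte file always hashes to `d41d8cd98f00b204e9800998ecf8427e`, so the checksum recorded for the placeholder cannot match the real file once it has been overwritten on disk.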
14:54 pdurbin donsizemore: not sure if you saw my chatter earlier but I'm encouraging installations to "vote" if you will, on what Google Summer of Code projects we could propose. Jim is asking for ideas by noon (two hours from now for folks here in other time zones): https://groups.google.com/d/msg/dataverse-community/WhbkdML6Jbs/VeDdK5UfEwAJ
15:00 donsizemore @pdurbin i'm delighted to see some of my tickets among the list =)
15:06 poikilotherm Thanks, Benjamin_Peuch, for your comment on #1753
15:06 pkiraly joined #dataverse
15:06 poikilotherm I started to write a comment, too, but you finished earlier :-D
15:06 poikilotherm So my comment kinda overlaps ;-)
15:06 Benjamin_Peuch My pleasure, poikilotherm.
15:07 Benjamin_Peuch Feel free to add to it, though. I think this is really a crucial point.
15:07 Benjamin_Peuch Oh, you did!
15:07 poikilotherm I couldn't emphasize it more
15:08 poikilotherm Yeah
15:15 pkiraly pdurbin: I have a question. Yesterday you mentioned the MD5 hash, and Dataverse also stores an MD5 hash when a file is uploaded via the user interface. However, the documentation about the DCM mock (http://guides.dataverse.org/en/4.19/developers/big-data-support.html?highlight=big%20data#steps-to-set-up-a-dcm-via-docker-for-development) gives an example using shasum.
15:15 pkiraly Are they interchangeable, or is it a mistake?
15:15 poikilotherm MD5 is default
15:16 Benjamin_Peuch I think you raise good points, poikilotherm. A former jurist colleague of mine told me there was a very specific kind of branch of law for databases.
15:16 poikilotherm http://guides.dataverse.org/en/latest/installation/config.html?highlight=hash#filefixitychecksumalgorithm
15:17 Benjamin_Peuch IMHO a repository should ensure that their general terms of use/service state something along the lines of: "By depositing data in this repository, you agree to the following terms..."
15:17 Benjamin_Peuch Unless depositors must explicitly sign contracts.
15:19 Benjamin_Peuch10 joined #dataverse
15:19 Benjamin_Peuch10 Then the act of data deposit gains more weight juridically speaking.
15:20 pkiraly poikilotherm: thanks!
15:21 Benjamin_Peuch10 left #dataverse
15:22 Benjamin_Peuch joined #dataverse
15:24 Benjamin_Peuch That being said, I know that the "contracts are laws" principle is not universal.
15:24 Benjamin_Peuch It's present in Belgian law, but what of international law... Just like you, I'm not an expert.
15:25 Benjamin_Peuch But we have a legal expert in the SODA project team. She and her predecessor have worked and are still working on those questions.
15:35 pdurbin pkiraly: you have to switch from MD5 to SHA-1 to use DCM.
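On whether the two are interchangeable: MD5 and SHA-1 are different algorithms that produce different digests for the same bytes, so a checksum computed with one cannot be verified against the other. A small sketch using Python's standard `hashlib`; the helper name `checksum` is illustrative and says nothing about how Dataverse or the DCM computes checksums internally:

```python
import hashlib


def checksum(data: bytes, algorithm: str) -> str:
    """Return the hex digest of `data` under the named algorithm.

    `checksum` is a hypothetical helper; `algorithm` is any name that
    hashlib accepts, such as "md5" or "sha1".
    """
    return hashlib.new(algorithm, data).hexdigest()


sample = b"hello"
md5_digest = checksum(sample, "md5")    # 128-bit digest, 32 hex characters
sha1_digest = checksum(sample, "sha1")  # 160-bit digest, 40 hex characters
print(len(md5_digest), len(sha1_digest))
```

The digests differ even in length, so a tool that expects SHA-1 will reject an MD5 value outright.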
15:45 poikilotherm OK guys, gotta run...
15:45 poikilotherm Read you all tomorrow
15:47 pkiraly pdurbin: thanks! DCM doesn't work for me. I am following the hint you gave me yesterday and just found out that DCM suggests SHA.
15:48 Benjamin_Peuch Oh wow.
15:48 Benjamin_Peuch EUDAT is so sexy.
16:12 pdurbin pkiraly: not working? Should I summon pameyer?
16:14 pdurbin donsizemore: are we having our Jenkins meeting this afternoon?
16:25 juancorr joined #dataverse
16:44 juancorr Thanks pdurbin. My vote for 1753 too. I agree with Benjamin_Peuch, many of our researchers want attribution.
16:56 pkiraly pdurbin: I can send some more details next week about what exactly is not working, but right now I am busy with the workaround (which is working).
16:57 pdurbin pkiraly: work arounds work! :)
16:57 pkiraly pdurbin: pardon my English...
16:58 pdurbin juancorr: cool. Please feel free to put it on your board. :)
18:35 donsizemore @pdurbin gustavo wanted to meet tomorrow?
18:35 donsizemore @pdurbin also, the API happily accepted the first of my 20GB files on localhost =)
18:36 pdurbin I have a meeting at 3:30 tomorrow. So if we end early, I'm good.
18:37 pdurbin I forget what the API does. It tells you the checksum matches or something?
18:39 donsizemore it spits out this: {"status":"OK","data":{"files":[{"description":"nc_risk_200127_from165.bak","label":"nc_risk_200127_from165.bak","restricted":false,"directoryLabel":"RISK","version":1,"datasetVersionId":31275,"categories":["Data"],"dataFile":{"id":7508973,"persistentId":"","pidURL":"","filename":"nc_risk_200127_from165.bak","contentType":"application/octet-stream","filesize":20807671808,"description":"nc_risk_200127_from165.bak","sto
18:40 donsizemore which indeed tells me the checksums match =)
18:42 pdurbin phew
18:46 pdurbin donsizemore poikilotherm: here's something new: https://github.com/GlobalDataverseCommunityConsortium/dataverse-test-suite
18:51 donsizemore @pdurbin yis i'm supposed to pop that into Jenkins as it fleshes out
18:52 pdurbin great!
18:52 pdurbin sprint planning in a few minutes. any issues we should estimate?
20:42 pdurbin poikilotherm: https://github.com/IQSS/dataverse/pull/6488#issuecomment-582602218
21:46 pdurbin donsizemore: still there? SMTP question
22:08 pdurbin Nevermind. I'm heading out. :)
22:43 poikilotherm pdurbin how much do you want me to do that review? I could try to squeeze in tomorrow morning before America wakes up again...
23:23 pdurbin poikilotherm: if you feel like it, please go ahead. I just didn't want QA to be blocked. But QA triggered a stack trace so there's probably time. :)
