IRC log for #dataverse, 2016-11-22

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
02:24 djbrooke joined #dataverse
05:08 axfelix joined #dataverse
08:15 jri joined #dataverse
10:11 petrichor joined #dataverse
12:11 petrichor joined #dataverse
12:42 petrichor joined #dataverse
12:47 petrichor joined #dataverse
12:51 petrichor joined #dataverse
13:44 petrichor joined #dataverse
13:49 jri_ joined #dataverse
13:50 donsizemore joined #dataverse
13:53 petrichor joined #dataverse
14:05 djbrooke joined #dataverse
14:47 jri joined #dataverse
14:54 pdurbin donsizemore: congrats on "we got basic publishing working from Discovery Environment into Dataverse" http://irclog.iq.harvard.edu/dataverse/2016-11-18#i_45268
14:56 donsizemore @pdurbin Discovery Environment/iRODS => Condor => Docker => Dataverse =)
14:58 donsizemore @pdurbin it's at https://github.com/DFC-Incubator/dataverse-publisher/tree/demo-2016-11-21 - next they want to be able to say download all files in a given dataset directly into iRODS and a few other common tasks
14:58 pdurbin "This is an Java application that uploads (publishes) a local file to a pre-specified dataverse destination."
14:59 pdurbin donsizemore: interesting! Does it use https://github.com/IQSS/dataverse-client-java ?
15:00 pdurbin donsizemore: I forget if I ever told you that we use Condor at IQSS: http://projects.iq.harvard.edu/rce/book/batch-basics
15:01 donsizemore @pdurbin I know Mike and Akio have seen that project but they wrote the code; my contribution was cobbling it in via Condor/Docker and doing the test runs. (in essence, you want Akio or michael.c.conway@gmail.com)
15:03 donsizemore @pdurbin you didn't. is anyone related to Dataverse running that stack (and have they seen Discovery Environment?)?
15:06 pdurbin donsizemore: Condor is run by "Technology Services" on this org chart: http://www.iq.harvard.edu/files/iqss-harvard/files/iqss_org_chart_final_2.1.16.pdf
15:13 pameyer joined #dataverse
15:30 pameyer donsizemore: which condor?
15:32 pdurbin pameyer: do you use Condor too?
15:32 pameyer if you mean condor the job scheduler, sometimes
15:33 pdurbin yeah, that one
15:33 pameyer cool - I wasn't sure if there were multiple projects with the same name
15:38 pdurbin I haven't hacked on the Condor stuff in years. That was a past life. :)
15:53 pameyer hopefully I won't be giving you flashbacks - but I've been considering it as a possible "abstraction of computation" component
15:54 djbrooke joined #dataverse
15:56 pdurbin condor certainly does computation
15:56 djbrooke joined #dataverse
15:56 djbrooke joined #dataverse
15:56 pdurbin Historically, the data at IQSS has been so small that if you want to process it with Condor, you just download it in your R script or whatever.
15:57 pameyer Right - but in terms of how to describe workflows between related datasets, condor DAGs seem like they might have some potential
16:00 pdurbin Cool. I'm out of my depth, but cool.
16:01 donsizemore joined #dataverse
16:11 pdurbin pameyer: speaking of computation, I wonder if you'd be interested in swinging by for a talk by http://codeocean.com
16:12 pameyer pretty picture - what is it?
16:13 pdurbin it's hard to find any info... here's a bit: https://tech.cornell.edu/programs/startup-postdocs/runway-companies
16:25 pameyer pdurbin: to your original question, maybe.  when?
16:25 pdurbin pameyer: I'll forward the email to you. I'm sure you'd be quite welcome.
16:47 djbrooke joined #dataverse
16:53 djbrooke joined #dataverse
17:11 pdurbin community call has started: https://docs.google.com/document/d/16P0feohPzPflDOtv9GMA-bAe3i3QdHvQRrnc6uPjX3Y/edit?usp=sharing
18:03 djbrooke joined #dataverse
18:30 djbrooke joined #dataverse
18:40 djbrooke joined #dataverse
18:43 djbrooke joined #dataverse
19:00 donsizemore joined #dataverse
19:33 djbrooke joined #dataverse
20:31 djbrooke joined #dataverse
20:41 djbrooke pameyer: just had someone visit from HBS who wants to use the DCM
20:42 pameyer cool - in what context?
20:42 djbrooke she has "millions" of small text files - SEC filings from 1950 to present-ish
20:43 pameyer DCM by itself, or DCM+Dataverse?
20:43 djbrooke DCM+Dataverse - she saw the roadmap item and it made her think of it
20:43 pameyer means I need to bump the upper bound for my load testing on number of files then
20:44 pameyer the user isn't on windows, are they?
20:44 djbrooke Is the correct answer "no?"
20:44 pameyer hopefully ….
20:45 djbrooke I'll check... forgot about that restriction. But she's a librarian and can definitely get her hands on a Mac
20:45 pameyer we've had some discussions about going from generated transfer scripts to a more generic upload client; but if you can't run unix scripts then in its current form things aren't likely to work
21:01 pdurbin millions of SEC filings makes me think we should run them through Consilience to cluster them somehow :)
21:02 pameyer SEC filings have gotten a lot easier since they started requiring filings in machine-readable format
21:02 axfelix joined #dataverse
21:04 axfelix joined #dataverse
21:19 petrichor joined #dataverse
21:40 axfelix joined #dataverse
22:02 djbrooke joined #dataverse
