
IRC log for #dataverse, 2017-11-09

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
02:53 andrewSC joined #dataverse
03:11 axfelix joined #dataverse
03:14 dzho joined #dataverse
04:22 sivoais pdurbin: sure, I wouldn't mind being added to the spreadsheet. But I haven't done anything with Dataverse yet --- not even started on implementing the SWORD protocol for Perl like you mentioned once ;-)
06:00 jri joined #dataverse
06:16 iqlogbot joined #dataverse
06:16 Topic for #dataverse is now Dataverse is open source research data repository software: http://dataverse.org | IRC Logs: http://irclog.iq.harvard.edu/dataverse/today | Who's who: https://docs.google.com/spreadsheets/d/16h3jv24usMGq18495C-JA-yNcQCKiKDa65MTraNDd7k/edit?usp=sharing
08:13 jri joined #dataverse
08:24 jri joined #dataverse
09:45 bjonnh joined #dataverse
11:10 donsizemore joined #dataverse
12:45 rebecabarros joined #dataverse
14:01 rebecabarros pdurbin1, pameyer: I'm running out of ideas, guys. I saw that the '500 error status' could be related to folder permissions, so I set full access on my dcm folder. Again, nothing.
14:04 pdurbin1 rebecabarros: I think you should start hacking on ur.py itself
14:04 pdurbin does that make sense?
14:05 pdurbin or copy ur.py to hello.py and make it reply "hello world" when you hit it with curl
14:05 pdurbin Maybe this is a terrible idea but it's what I'd do. :)
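A minimal hello.py along the lines pdurbin describes might look like the sketch below; the shebang, the mod_cgi setup, and the URL are assumptions, not part of the DCM code.

    #!/usr/bin/env python
    # Throwaway CGI script: prove the web server can execute CGI at all.
    # CGI output is headers, a blank line, then the body.
    print "Content-Type: text/plain"
    print ""
    print "hello world"

Dropped next to ur.py and made executable (chmod +x hello.py), it should answer `curl http://<dcm-host>/hello.py` with "hello world"; if it doesn't, the problem is CGI execution rather than anything inside ur.py.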
14:07 rebecabarros It's an idea. Let me try that and see what happens. Thanks :)
14:09 donsizemore joined #dataverse
14:09 pdurbin rebecabarros: sure. Part of why I'm suggesting this is that pameyer mentioned yesterday that perhaps CGI scripts can't be executed.
14:10 pdurbin donsizemore: mornin'. You're a machine. Should we point you at a project other than slogging through a rewrite of the Perl installer into Python? Is there any other issue you're excited about?
14:11 donsizemore @pdurbin i'm excited about the translation stuff but the internationalization part will be more of a journey (and possibly part of a grant here)
14:11 pdurbin sivoais: I totally forgot about that. Added! What do you want your "org" to be?
14:11 pdurbin mmm, I can smell the grant money already
14:12 pdurbin the smell of sustainability
14:12 pdurbin donsizemore: your skills are probably wasted writing docs, right?
14:13 donsizemore @pdurbin i do like writing documentation. the syntax is much more forgiving than code
14:14 pdurbin yeah
14:14 donsizemore @pdurbin what in particular needs documenting?
14:14 pdurbin I wonder if I should change "Help Wanted: Code" to "Help Wanted: Java" and "Help Wanted: Python" and Javascript and Perl.
14:15 pdurbin Well, the API Guide is on my mind because of this thread that got kicked off yesterday: https://groups.google.com/d/msg/dataverse-community/4XsA0Px2H8Q/-iHLF-osDAAJ
14:15 pdurbin I'm sure a lot of people try using the API and give up. Not that I have any data on this.
14:17 donsizemore we've used the API for a few things but wound up doing most of the prep work in the GUI
14:17 pdurbin prep work?
14:17 pdurbin Here's a list of where documentation is lacking: https://github.com/IQSS/dataverse/labels/Help%20Wanted%3A%20Documentation
14:17 donsizemore pre-existing database, API token, dataset creation
14:18 pdurbin You take datasets from an old database and copy and paste the metadata into Dataverse using the GUI?
14:22 pdurbin If so, I don't blame you. Constructing an equivalent JSON document is a pain. I mention this in that thread on the Google Group.
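The native API call in question is roughly the one below; SERVER_URL, API_TOKEN, PARENT_ALIAS, and dataset.json are placeholders, not values from this conversation.

    curl -H "X-Dataverse-key: $API_TOKEN" -X POST \
      "$SERVER_URL/api/dataverses/$PARENT_ALIAS/datasets" \
      --upload-file dataset.json

The painful part is dataset.json itself, which has to spell out every metadata block and field by hand.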
14:48 rebecabarros pdurbin: you're right. It's something related to the ur.py file itself. When I changed it to a 'Hello World' example, I got no errors from dcm-test01.sh. Now I have to find out what
14:54 donsizemore @pdurbin it was for the publishing app we wrote for DE, which expects an existing account, dataverse, and dataset
15:03 pdurbin rebecabarros: sounds like progress. Do you need any more suggestions at this point?
15:04 pdurbin donsizemore: ok. And it sounds like it was a partially manual effort, especially populating the metadata fields.
15:04 donsizemore @pdurbin correct, but we're interested in Dataverse for a couple grants we're on, and would love to make more use of the API
15:05 pdurbin sounds like a good opportunity to help with the API Guide, if you're interested :)
15:06 pdurbin Maybe we should start with a user story.
15:35 Thalia_UM joined #dataverse
16:01 Thalia_UM Good morning! :)
16:37 pdurbin mornin
17:01 rebecabarros pdurbin: I do. I'm still a little lost about the whole flow of the DCM and how everything connects. Do you have any more suggestions regarding ur.py? I've checked that the directory pointed out in the file exists and that one has permission to write to it, and yes. I've checked that Redis is installed and up and running, and yes it is.
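A quick way to confirm the Redis part of that, assuming Redis is on the default local port:

    redis-cli ping    # prints PONG if the server is up and reachable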
17:08 sivoais pdurbin: hmm, maybe put down Project Renard? That's the closest to what Dataverse is doing (nevermind the fact that I started it... :-P)
17:10 pdurbin DCM uses Redis? Huh.
17:11 pdurbin sivoais: fixed. Thanks.
17:36 jri joined #dataverse
18:00 Thalia_UM joined #dataverse
18:10 rebecabarros pdurbin: do you know from where this dump is supposed to come? https://github.com/sbgrid/data-capture-module/blob/master/api/ur.py
18:12 pdurbin It says "dump to unique file"
18:13 pdurbin rebecabarros: I'm going to guess something like "/deposit/requests/1234.json" from a quick look at the code.
18:18 axfelix joined #dataverse
18:19 rebecabarros As far as I can understand, this file is created by this script, right? Without any content? Or where does the content come from? From my rsync request? I'm trying to run ur.py itself (which I think doesn't make much sense, because something prior has to call it). But when I debug the file itself, it seems like the file can't be opened, most likely because it does not exist.
18:21 rebecabarros Sorry if this doesn't make much sense.
18:21 pdurbin Huh, there is Redis in there. I forgot it was added to speed things up.
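The flow being pieced together here, a CGI endpoint that dumps each upload request to a unique JSON file under the requests directory and then notifies a worker through Redis, is roughly the sketch below. This is not the actual ur.py; the field name, queue name, and DATADIR value are guesses for illustration.

    import json, os
    import redis

    DATADIR = "/path/to/deposit/requests"   # placeholder; ur.py reads this from its config

    def handle_upload_request(body):
        req = json.loads(body)                                # request body posted by Dataverse
        path = os.path.join(DATADIR, "%s.json" % req["id"])   # "dump to unique file", e.g. 1234.json
        with open(path, "w") as f:
            json.dump(req, f)
        redis.StrictRedis().lpush("requests", path)           # hand the request off to the worker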
18:23 pdurbin rebecabarros: I'm calling for reinforcements. :)
18:24 Thalia_UM joined #dataverse
18:25 rebecabarros pdurbin: haha ok, thanks again!
18:26 pdurbin rebecabarros: do you know if Venki is still trying to set up a Data Capture Module or not? Remember that thread on the Google Group?
18:30 pdurbin Here's the thread: https://groups.google.com/d/msg/dataverse-community/mcji2ytn3QI/3qKoRkiYBAAJ
18:30 pdurbin rebecabarros: Do you want to reply on the thread with your latest status? Maybe Venki can help. Or maybe Pete will reply when he has time.
18:31 donsizemore joined #dataverse
18:33 rebecabarros pdurbin: I do remember. He sent me a private message asking if my 17 GB successfully uploaded file was a zip file. Seems like he was trying to upload double-zipped files. I said that mine was a CSV file and I suggested that he try to upload using the API just to see what would happen. But I haven't heard back from him.
18:34 pdurbin rebecabarros: hmm, ok. Do you think he's actively trying to set up all this DCM and rsync stuff?
18:36 rebecabarros pdurbin: I don't think so. But I could try to ask.
18:43 pdurbin rebecabarros: ok, you and Pete might be the only people trying to use a Data Capture Module. Do you see "Large Data Support and HTTP Upload Support" at https://dataverse.org/goals-roadmap-and-releases as something we're thinking about? I can explain what that means.
18:54 rebecabarros pdurbin: I would appreciate if you give me a general idea on how you are thinking to approach that.  :)
19:01 pdurbin rebecabarros: the key word is "and". Large Data Support AND HTTP (regular) upload at the same time. Right now you can only use one or the other. Does that make sense?
19:01 jri joined #dataverse
19:05 djbrooke joined #dataverse
19:06 djbrooke Hey rebecabarros -- pdurbin mentioned you had some roadmap questions... let me catch up on the chat
19:10 djbrooke joined #dataverse
19:11 rebecabarros pdurbin: that's good news, because I was worried about having to use only the DCM even for small datasets.
19:13 djbrooke so, rsync and http upload are currently either/or - you need to pick to have your installation transfer data via rsync or http
19:13 djbrooke If you try to switch between them or enable both, I don't think it will work as expected
19:14 djbrooke We chose the either/or path first because it makes it easier from a UI/UX standpoint and it meets the grant requirement of making this available in support of the large data sets of structural biologists
19:16 djbrooke But, in 2018 we'll be planning to make an installation able to have both rsync (for big transfers) and http (for smaller transfers) in the same installation
19:16 djbrooke The technical groundwork has been laid; now it's a matter of providing a good user experience for getting data in and out of Dataverse when both of these options are enabled
19:19 rebecabarros djbrooke: hi, thanks for the clarification. That's a good prospect. Over here we will have to deal mostly with large data sets but we also would like to be able to maintain http option for the smaller ones. I'm really excited for this.
19:21 pameyer joined #dataverse
19:22 pameyer rebecabarros: just to recap, you're seeing 500 errors both locally and from the DV server on ur.py calls; nothing in lighttpd error.log
19:22 pameyer anything I'm missing from skimming the logs?
19:23 djbrooke and we're excited to work on it! It's been a long time coming, so it's great to have the resources from this grant for something that will benefit the larger community
19:26 rebecabarros pameyer: correct. Although I think that error.log doesn't show much info because it is not set to do so. And I haven't found out how to make it more verbose.
19:26 pdurbin pameyer: I was encouraging rebecabarros to copy ur.py to hello.py and try to get a "hello world" output via curl just to make sure CGI is working.
19:27 pameyer looking that up now
19:27 pameyer but leaning towards guessing the web server doesn't have write permission to the request directory
19:29 pameyer ah - got a better candidate
19:29 pameyer could you let me know what the default python version on your dcm system is?
19:30 rebecabarros pameyer: it's Python 2.7.5
19:33 pameyer ok - could you add `server.breakagelog = "/var/log/lighttpd/breakage.log"` to your lighttpd.conf and restart
19:33 pameyer python2.7.5 should be fine
19:36 rebecabarros pameyer: let me try
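For anyone following along, that is one extra line in lighttpd.conf plus a restart; breakage.log captures the stderr of CGI scripts, and the restart command assumes a systemd host.

    # /etc/lighttpd/lighttpd.conf
    server.breakagelog = "/var/log/lighttpd/breakage.log"

    sudo systemctl restart lighttpd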
19:37 pameyer graphviz files in doc subdirectory were intended to be informative about information flow; but feedback has been that they don't do a great job conveying information
19:42 rebecabarros pameyer: So, I restarted lighttpd and tried to run dcm-test01 again. Here is the breakage.log -> https://pastebin.com/JVaEhHDR
19:42 rebecabarros I've changed the DATADIR value in ur.py; should I move back to the value that is in your code?
19:44 pameyer what do you have it set to?
19:44 donsizemore joined #dataverse
19:45 pameyer log makes it look like a problem using a relative path; if you switch to a full path it should at least move the error
19:48 rebecabarros the full path leads me to a permission error. I don't understand, because I already did chmod -R 777 on the main directory of the dcm
19:53 pameyer you have DATADIR inside the same directory as the dcm code?
19:53 rebecabarros pameyer: yes
19:55 pameyer does `sudo -u lighttpd touch $DATADIR/testfile` give the same permission error
19:56 pdurbin pameyer: oh! I had a thought. SELinux.
19:56 pameyer aka - $DATADIR for the full path to your dcm/deposit/requests/14186.json directory
19:56 pameyer pdurbin: good thought. `sestatus?`
19:56 pdurbin getenforce
19:57 pameyer rebecabarros: you have `/usr/local/dcmu` in some places, and `dcm/deposit/` in others.  is it possible that there's a mix up in directory names
20:01 rebecabarros pameyer: I put $DATADIR = $UPLOAD_DIRECTORY/requests (from main.yml); should that be okay? Meaning, can DATADIR be any directory, or what?
20:04 pameyer right: `$DATADIR = $UPLOAD_DIRECTORY/requests` is the way things are expecting to have it setup
20:07 rebecabarros `sudo -u lighttpd touch $DATADIR/testfile` works without error and the file is created.
20:07 pameyer ok - so it's not permissions
20:07 pameyer could you try that with the full path for $DATADIR?
20:11 pdurbin Can you both run `getenforce` and say what the output is?
20:12 rebecabarros pameyer: I've tried and same thing.
20:12 pameyer "same thing" == ("same failure as before" | file created)?
20:14 rebecabarros same failure as before
20:14 rebecabarros pdurbin: 'getenforce' gives me 'Enforcing'
20:16 pdurbin pameyer: does getenforce return 'Enforcing' for you too?
20:16 pameyer nope - permissive
20:16 pameyer so might be selinux breaking stuff again
20:17 pdurbin rebecabarros: you might want to try setting SELinux to permissive
20:18 rebecabarros pdurbin: guess I saw something related to that in the TwoRavens tutorial, right? I will try that.
20:19 pdurbin yep, `setenforce permissive` is in http://guides.dataverse.org/en/4.8.2/installation/r-rapache-tworavens.html
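Concretely, the check and the switch (permissive lasts until reboot; making it permanent means editing /etc/selinux/config):

    getenforce                  # currently prints "Enforcing"
    sudo setenforce permissive  # stop enforcing, keep logging denials
    getenforce                  # should now print "Permissive"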
20:19 rebecabarros Right now I will have to go. But first thing tomorrow morning is try this with SELinux, and I'll let you guys know. Thank you again for all the help.
20:19 rebecabarros left #dataverse
20:20 * pdurbin crosses fingers
20:29 pdurbin pameyer: I appreciate you jumping in. Any thoughts while this is all top of mind?
20:33 pameyer "dcm" vs "dcmu"; selinux are top of the list
20:35 pdurbin ok. we probably should have put non-mock through QA
20:36 pdurbin given enough time, that is :)
20:42 pameyer yup
21:23 pdurbin left #dataverse
22:26 jri joined #dataverse

