Time  S  Nick  Message
00:05
djbrooke joined #dataverse
06:18
andrewSC joined #dataverse
07:57
jri joined #dataverse
08:00
jri joined #dataverse
12:53
rebecabarros joined #dataverse
12:57
rebecabarros
good morning =). pameyer: when I try to upload through the GUI or API, I get code 404 and no file is created in $UPLOAD/requests
13:35
pdurbin
rebecabarros: good morning. Do you feel like opening an issue about SELinux at https://github.com/sbgrid/data-capture-module/issues ? The README should probably indicate if SELinux needs to be disabled.
13:41
rebecabarros
It should. I will open a new issue over there describing that.
13:43
pdurbin
Thanks!
13:56
pdurbin
rebecabarros: I *think* we should focus on continuing to hit "ur.py" with curl to get it working.
13:57
rebecabarros
pdurbin: what do you mean?
13:58
pdurbin
I mean I think you should continue trying to get this script to work (it calls ur.py with curl): https://github.com/sbgrid/data-capture-module/blob/master/ansible/roles/dcm/files/root/scripts/dcm-test01.sh
13:59
pdurbin
Does that make sense?
14:09
rebecabarros
Yes. Like, now I'm getting 'status:ok' from running dcm-test01, which, judging from ur.py, is supposed to mean that everything worked, right? But the JSON file is not created.
14:09
pdurbin
"status ok" sounds like good news to me :)
14:11
pdurbin
I'm looking at https://github.com/sbgrid/data-capture-module/blob/master/api/ur.py again
14:11
pdurbin
"dump to unique file"
14:11
pdurbin
rebecabarros: you're saying a file isn't being created?
14:12
rebecabarros
pdurbin: yes.
14:12
pdurbin
Is the file supposed to be created in /deposit/requests/ ?
14:13
rebecabarros
As I mentioned before, I've tried to curl http://$DCM_SERVER/ur.py. It gives me back 'status:ok' and a file is created in /requests.
14:13
pdurbin
oh! so the file is being created!
14:14
rebecabarros
But only if I do this directly. Running through dcm-test, for instance, doesn't create the file.
14:14
pdurbin
huh
14:18
pdurbin
curl -H "Content-Type: application/json" -X POST -d "{\"datasetId\":\"42\", \"userId\":\"42\",\"datasetIdentifier\":\"42\"}" http://localhost/ur.py
14:18
pdurbin
rebecabarros: what happens if you run that curl command above from your DCM server?
14:19
donsizemore joined #dataverse
14:21
rebecabarros
pdurbin: 'status:ok' but no file created
14:21
pdurbin
hmm
14:23
pdurbin
but if you do `curl http://localhost/ur.py ` a file is created?
14:25
rebecabarros
pdurbin: that's correct
14:26
pdurbin
What is the content of the file that's created?
14:27
rebecabarros
It's an empty JSON file
14:29
pdurbin
rebecabarros: ok. Thanks. How are you feeling about all this? pameyer says he should have time to help later today. Over at http://irclog.iq.harvard.edu/dataverse/2017-11-09#i_59973 you and djbrooke talked about the roadmap for this rsync feature.
14:41
andrewSC joined #dataverse
14:45
pdurbin
donsizemore: mornin. Lots of interest in your Ansible playbook!
14:46
donsizemore
@pdurbin i see that. i wish i had lots of time to work on it! ;)
14:47
pdurbin
seems like a higher priority than rewriting the installer :)
14:47
pdurbin
Is there anything I can do to help? I don't really know Ansible.
14:49
rebecabarros
The flow, as far as I could understand, is: using upload in DVN should make a call to ur.py, which should be responsible for creating a JSON file in the /requests directory. Then this JSON file will be used by sr.py to allow the upload itself. I will wait for pameyer so he can explain what the JSON file in /requests has to look like.
14:52
rebecabarros
pdurbin: you mean, what do I think about your plans for the rsync feature? I'm excited that the plan is to allow both options to work side by side. That way Dataverse will be able to cover all possible scenarios with small and large files.
14:52
donsizemore
@pdurbin i think the root of his problems is a) ansible assumes a clean install, as dataverse's installation isn't idempotent. i can stick some semaphores in there to make the playbook idempotent, but it will likely lead to screwy glassfish states
14:53
pdurbin
rebecabarros: right. Except we don't call it "DVN" anymore. Now we call it "Dataverse". :) I mean, I think that's how it works. From the Dataverse perspective, Dataverse calls "ur.py" to make an "upload request" and then immediately calls "sr.py" for a "script request". sr.py returns a Bash script with rsync commands in it. Dataverse presents this script to the user in the Dataverse GUI.
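The two-call flow described above can be sketched as follows. This is a hypothetical illustration: the payload field names come from the curl command quoted earlier in this log, and DCM_SERVER is a placeholder, not an actual Dataverse configuration key.

```python
import json

# Placeholder for the DCM host (assumption for illustration).
DCM_SERVER = "http://localhost"

def upload_request_payload(dataset_id, user_id, dataset_identifier):
    """Build the JSON body Dataverse would POST to ur.py (the 'upload request').
    Field names match the curl example in this log."""
    return json.dumps({
        "datasetId": dataset_id,
        "userId": user_id,
        "datasetIdentifier": dataset_identifier,
    })

# Step 1: POST this body to f"{DCM_SERVER}/ur.py" (upload request).
# Step 2: call f"{DCM_SERVER}/sr.py" (script request); the response is a
#         Bash script with rsync commands that Dataverse shows the user.
body = upload_request_payload("42", "42", "42")
print(body)
```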
14:53
donsizemore
and b) i never coded it for Ubuntu/Debian. the Readme.md says CentOS 7 and means it
14:53
pdurbin
donsizemore: sorry, one sec
14:55
pdurbin
rebecabarros: let me try to be a little more clear about the current state of the rsync feature. The reason why it's documented in the Developer Guide rather than the Installation Guide is that this feature is highly experimental: http://guides.dataverse.org/en/4.8.2/developers/big-data-support.html
14:56
pdurbin
That is to say, I'm not surprised that the rsync feature doesn't "just work" for you, because you are only the second person to try to get it working. The first to get it working was pameyer, who is the author of the rsync (Data Capture Module) code.
14:58
pdurbin
rebecabarros: I'm extremely impressed by your tenacity, by how hard you are working on trying to get the rsync feature to work. But I'm wondering if you should write up your notes so far into an issue at https://github.com/IQSS/dataverse/issues (main Dataverse repo) and ask for more documentation (Installation Guide rather than Developer Guide).
14:59
pdurbin
This would (someday) mean that someone other than the author of the Data Capture Module would install it and independently verify that it's working as expected. It would go through QA, basically.
14:59
pdurbin
As part of the process the documentation would be improved.
15:00
djbrooke joined #dataverse
15:00
pdurbin
Making it easier for a customer like yourself to follow the documentation and have success setting up all the necessary components to enable "big data support" (rsync).
15:01
pdurbin
Does that make sense?
15:01
pdurbin
I don't mean to discourage you from continuing to try if that's how you'd like to spend your time.
15:01
pdurbin
I think you have a lot to contribute in terms of opening issues to explain the problems you've had.
15:01
pdurbin
Once we know what the problems are, we can fix them or document workarounds.
15:02
pdurbin
I hope this is making sense. I think I'm done. :)
15:02
pdurbin
rebecabarros: what do you think?
15:19
pdurbin
djbrooke: mornin. I'm sort of trying to talk rebecabarros out of trying to get a Data Capture Module working until we've put it through QA. We only tested the mock DCM.
15:20
djbrooke
I'd defer that question to pameyer who said he would be on later today
15:21
djbrooke
and mornin
15:21
donsizemore joined #dataverse
15:22
pdurbin
That's fine. Without more documentation, the Data Capture Module is obviously very difficult to support.
15:23
rebecabarros
pdurbin: Don't worry. I understand that it is still an experimental feature and I really appreciate how accessible and helpful you guys are at any time. And I agree with you; I was already thinking about summarizing in a doc how everything went so far and the problems I've faced, with the purpose of helping you know how to improve the documentation and such.
15:23
rebecabarros
The reason why I "insist" on trying to get this done is because we really want to use Dataverse, but we are really going to need support for large files; it's our main scenario. Meanwhile, I'm already thinking about options. For instance, I'm about to test how Dataverse will behave if I split a 100GB zip file and upload 10 smaller ones of 10GB each, although this would not be ideal.
15:24
rebecabarros
But I do understand your concerns and I understand that this takes time and that you have a lot of other features to worry about right now.
15:26
pdurbin
rebecabarros: you and pameyer have the same needs. His primary use case is large files, which is why he helped us develop this new experimental feature. Someone like you coming along to try to get the feature working is exactly what I wanted. I'm just frustrated that I can't help more. I don't know enough about how the DCM code works.
15:35
pdurbin
rebecabarros: I see "big data" at https://www.bahia.fiocruz.br/cidacs/ when I run that page through Google Translate. :)
16:04
pdurbin
or even when I don't :)
16:04
pdurbin
'grandes bases de dados (“big data”)'
16:12
rebecabarros
pdurbin: Again, don't worry. You've been really helpful to me since the beginning, answered all sorts of questions, and were always patient :) haha. I appreciate that. I wish I had more programming skills to help you guys out on the development side of things, but I do not, so...
16:12
rebecabarros
pdurbin: yes, that's us!
16:13
pdurbin
rebecabarros: you are helping a lot by testing things. It's extremely valuable.
16:13
pdurbin
"We welcome contributions of ideas, bug reports, usability research/feedback, documentation, code, and more!" https://github.com/IQSS/dataverse/blob/develop/CONTRIBUTING.md
16:15
pdurbin
rebecabarros: did you say you might summarize in a doc? What kind of doc? A Google doc? An attachment on a GitHub issue?
16:15
djbrooke joined #dataverse
16:15
pdurbin
A Google doc might be nice if you enable comments.
16:21
rebecabarros
What do you think that is the best way?
16:23
pdurbin
If you don't mind creating a Google Doc, I think that would be best.
16:28
rebecabarros
Ok then. I will do that and send the link later.
17:04
djbrooke joined #dataverse
17:04
djbrooke joined #dataverse
17:11
Thalia_UM joined #dataverse
17:11
Thalia_UM
Good morning! :)
17:13
pdurbin
hi Thalia_UM. Good morning. :)
17:13
Thalia_UM
A question, Philip
17:15
Thalia_UM
I want to consult some open web services. How can I do that with Dataverse already installed, for example by modifying some XHTML file or something similar? They told me that they will investigate implementing JSON and AJAX to consult the web services.
17:16
Thalia_UM
I don't know how to implement it so that the interface consults those web services when I create a dataset.
17:17
pdurbin
What do the web services do?
17:17
Thalia_UM
any ideas
17:18
pdurbin
Can these web services be used by any installation of Dataverse?
17:18
Thalia_UM
it is only to consult names of people, institutions, data types (xml, pdf, docx, etc)
17:18
Thalia_UM
http://catalogs.repositorionacionalcti.mx/webresources/idioma/0/2
17:19
Thalia_UM
For example like that
17:19
pdurbin
What are some example user stories?
17:19
Thalia_UM
That is my question
17:20
Thalia_UM
That link is about language
17:20
pdurbin
"As a user, I want to create a dataset and pick from a list of authors." ... Something like that?
17:20
Thalia_UM
Yes
17:20
Thalia_UM
Like that
17:21
pdurbin
Are there any other user stories?
17:24
pdurbin
"As a user, I want to..."
17:28
Thalia_UM
I don't understand. What does "user stories" mean?
17:29
Thalia_UM
There are five web services that we are going to consult, but we want that to be dynamic with Dataverse.
17:31
pdurbin
A user story begins with "As a user, I want to..."
17:33
pdurbin
Thalia_UM: can you please create an issue for the first user story we just talked about? At https://github.com/IQSS/dataverse/issues
17:35
Thalia_UM
Oooh
17:35
Thalia_UM
yes
17:35
Thalia_UM
Sure
17:35
pdurbin
Thanks!
17:44
Thalia_UM
https://github.com/IQSS/dataverse/issues/4282
17:45
Thalia_UM
Do you have any idea how I can do that?
17:49
pdurbin
Thalia_UM: please see the comment I just left. Thanks for opening an issue!
17:50
pdurbin
djbrooke: Thalia_UM could probably use some help breaking her ideas down into user stories
17:52
Thalia_UM
djbrooke?
17:53
jri joined #dataverse
17:57
djbrooke
Hey Thalia_UM - Mike Cohn has written a few books about user stories and is my go-to source. A short read is here: https://www.mountaingoatsoftware.com/agile/user-stories
17:58
jgautier joined #dataverse
17:58
djbrooke
When we develop a feature or new capability, we want to recognize the user's goal in their words (in a consistent format)
17:59
djbrooke
This helps us as we develop because we can always point back to user's desired outcome, and it gives some flexibility about how we implement a solution to that outcome
17:59
dataverse-user joined #dataverse
18:00
djbrooke
So, for the example in 4282: As a user, I want to create a dataset and pick from a list of authors or language or type of publication, etc.
18:01
djbrooke
It's good! The only thing missing is the end piece - the "why" - what value would this provide to you or your user community?
18:05
Thalia_UM
We have to consult the web services and then add them to the "Add Dataset" form, so that when the web services are consulted via "GET", the fields are filled with the content from the web services.
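The idea above (a controlled-vocabulary web service feeding a form field) can be sketched like this. The response shape is an assumption for illustration only; the actual payload returned by the catalogs.repositorionacionalcti.mx endpoints may differ, and in the page itself the fetch would happen via AJAX rather than Python.

```python
import json

# Assumed (hypothetical) shape of a response from a vocabulary endpoint
# such as http://catalogs.repositorionacionalcti.mx/webresources/idioma/0/2
sample_response = json.dumps([
    {"id": 1, "nombre": "Español"},
    {"id": 2, "nombre": "Inglés"},
])

def to_select_options(raw):
    """Turn the service response into (value, label) pairs that could
    populate a dropdown on the "Add Dataset" form."""
    return [(item["id"], item["nombre"]) for item in json.loads(raw)]

print(to_select_options(sample_response))
```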
18:16
djbrooke joined #dataverse
18:17
djbrooke joined #dataverse
18:23
Thalia_UM
Another one of my questions is whether I can do this without modifying the Dataverse code and without having to uninstall it.
18:38
djbrooke joined #dataverse
19:00
djbrooke joined #dataverse
19:02
djbrooke joined #dataverse
19:33
djbrooke joined #dataverse
19:50
djbrooke joined #dataverse
20:02
djbrooke joined #dataverse
21:01
djbrooke joined #dataverse
21:07
djbrooke joined #dataverse
21:29
Thalia_UM left #dataverse
21:39
djbrooke joined #dataverse
21:52
djbrooke joined #dataverse
21:54
pameyer joined #dataverse
22:08
jri joined #dataverse
22:10
pameyer
rebecabarros: the information flow should roughly be: request to ur.py (from curl, test script or Dataverse) -> JSON file in /deposit/requests -> rq worker reads JSON file, creates transfer account and script, moves JSON file to /deposit/processed (and renames JSON file from PID to dataset_id); request to sr.py returns script (or 404 if the script is not generated)
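The move-and-rename step in that flow can be simulated end to end. This is an illustrative sketch only: the directory names follow the message above, but the worker logic here stands in for the real rq worker (account and script creation are omitted), and the PID-style filename is invented.

```python
import json
import pathlib
import tempfile

# Sandbox standing in for the DCM filesystem layout described above.
root = pathlib.Path(tempfile.mkdtemp())
requests_dir = root / "deposit" / "requests"
processed_dir = root / "deposit" / "processed"
requests_dir.mkdir(parents=True)
processed_dir.mkdir(parents=True)

# ur.py's job: dump the POSTed JSON to a file in deposit/requests.
request = {"datasetId": "42", "userId": "42",
           "datasetIdentifier": "doi-10.5072-FK2-ABCDEF"}  # hypothetical PID
pid_file = requests_dir / (request["datasetIdentifier"] + ".json")
pid_file.write_text(json.dumps(request))

# The rq worker's job (sketched): read each request, create the transfer
# account and script (omitted), then move the file to deposit/processed,
# renamed from the PID to the dataset id.
for f in requests_dir.glob("*.json"):
    data = json.loads(f.read_text())
    f.rename(processed_dir / (data["datasetId"] + ".json"))

print(sorted(p.name for p in processed_dir.iterdir()))
```

Only after this hand-off completes does a request to sr.py have a script to return; before that, 404 is the expected answer.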
22:14
pameyer
"status:ok" from ur.py should only be returned if the request has been processed by the request queue
22:15
pameyer
^ typo'd ; "status:ok" is upstream of the request queue
22:21
pameyer
if there's an empty JSON file resulting from calls to ur.py, then this is probably because the parameters aren't being passed correctly
22:25
pameyer
should be JSON encoded text in the POST body
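That would explain the empty JSON file seen earlier: if the handler writes out whatever it parsed from the POST body, then a request with no body (or a mis-encoded one) produces an empty document. The parsing function below is an illustrative stand-in, not the actual ur.py code.

```python
import json

def parse_body(raw):
    """Stand-in for ur.py's body handling (assumption, not the real code):
    an empty body parses to an empty dict, which would then be dumped to
    disk as an empty JSON file."""
    if not raw:
        return {}
    return json.loads(raw)

# POST with a proper JSON body vs. a bodiless request (e.g. a plain GET).
good = parse_body(b'{"datasetId": "42", "userId": "42", "datasetIdentifier": "42"}')
empty = parse_body(b"")
print(good, empty)
```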
22:25
djbrooke joined #dataverse
22:51
djbrooke joined #dataverse
22:58
pameyer joined #dataverse
23:01
djbrooke joined #dataverse
23:02
dataverse-user joined #dataverse
23:05
djbrooke joined #dataverse
23:11
djbrooke joined #dataverse
23:14
djbrooke joined #dataverse
23:15
djbrooke_ joined #dataverse
23:16
djbrooke joined #dataverse