IRC log for #dataverse, 2017-05-16

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
00:02 djbrooke joined #dataverse
02:34 djbrooke joined #dataverse
03:57 axfelix joined #dataverse
09:32 cdsp-rmo joined #dataverse
11:56 cdsp-rmo joined #dataverse
12:19 donsizemore joined #dataverse
12:30 andrewSC joined #dataverse
12:39 djbrooke joined #dataverse
12:41 pdurbin donsizemore: mornin', want to see something weird with Shibboleth DiscoFeed?
12:43 pdurbin welcome back, cdsp-rmo
12:44 djbrooke joined #dataverse
12:46 donsizemore @pdurbin absolutely =) maybe i can pick your brain on something afterward?
12:46 cdsp-rmo hello !
12:48 pdurbin donsizemore: sure, I'll trade you. :) I was seeing "SyntaxError: JSON.parse: unexpected non-whitespace character after JSON data at line 9179 column 1 of the JSON data" in Firefox (but not Chrome) at https://dataverse.harvard.edu/Shibboleth.sso/DiscoFeed a few minutes ago. Now the error is gone.
12:48 pdurbin I'm seeing "SyntaxError: JSON.parse: expected ',' or ']' after array element at line 3647 column 1 of the JSON data" at https://dataverse.unc.edu/Shibboleth.sso/DiscoFeed
12:49 pdurbin weird stuff, potentially concerning because the Dataverse application parses that JSON data
12:49 pdurbin I mean, I'm not going to lose sleep about it.
12:51 donsizemore @pdurbin this is the png line of Haute École Spécialisée de Suisse occidentale?
12:51 pdurbin dunno
12:52 pdurbin cdsp-rmo: ah, I'm happy to see this: https://github.com/IQSS/dataverse/issues/3809#issuecomment-301721442 :)
12:54 cdsp-rmo my answer or my pull ?
12:54 cdsp-rmo :D
12:55 pdurbin the answer that will lead to a pull :)
12:55 cdsp-rmo ^
12:56 pdurbin oh! there it is! https://github.com/IQSS/dataverse/pull/3833
12:56 pdurbin Travis is still running. I guess we'll see how good our tests are. :)
12:57 donsizemore @pdurbin jsonlint likes each of them atm
12:57 pdurbin donsizemore: maybe Firefox is being picky
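One way to check whether a feed is strictly valid JSON, independent of any browser, is to run it through a strict parser that reports the line and column of the first problem. The sketch below is only an illustration: the function name is made up, and the sample strings are short hypothetical stand-ins, not real DiscoFeed output.

```python
import json


def find_json_error(text):
    """Return None if text is valid JSON, else (line, column, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return (err.lineno, err.colno, err.msg)
    return None


# Hypothetical, shortened stand-in for a DiscoFeed response.
good = '[{"entityID": "https://idp.example.org/shibboleth"}]'
# Same document with a stray bracket after the JSON data, on a new line.
bad = good + "\n]"

print(find_json_error(good))
print(find_json_error(bad))
```

A stray character after the end of the data is the same class of problem as the "unexpected non-whitespace character after JSON data" message quoted above.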
12:57 cdsp-rmo hum
12:58 cdsp-rmo some troubles it seems :(
12:58 cdsp-rmo oh god
12:58 cdsp-rmo I forgot some ; it seems
12:58 cdsp-rmo from python to java is hard sometimes :o
12:59 pdurbin in javascript you can use or not use the semicolon depending on your mood :)
12:59 djbrooke joined #dataverse
13:01 cdsp-rmo I left some useless )
13:01 cdsp-rmo I may misread the error
13:04 donsizemore @pdurbin so, my off-the-top-of-your-head question? i'm sure you all have discussed it before
13:06 donsizemore thu-mai prepares to publish a dataset with 224 files in it. gears whir, white-gloved hands with knives start spinning. asynchronously, backgrounded. in the browser, thu-mai gets the spinning wheel of work
13:07 donsizemore she waits a while, tries to publish again, gets an error. i'm at lunch. ingest and indexing can take a while with large datasets, but any way to get good/meaningful feedback to the user until it completes? publishing-in-progress semaphore or some such?
13:08 pdurbin woof, that sounds like a terrible user experience
13:09 pdurbin donsizemore: can you please read through this issue about workflows? https://github.com/IQSS/dataverse/issues/3561
13:09 donsizemore well, either apache timed out on her the first time, or maybe the wheel was still spinning. come to think of it, the ingest-in-progress bit would've displayed after upload. so it's really while we're waiting on solr
13:10 pdurbin In short, we are preparing for when datasets will take a long time to publish due to large amounts of data needing to be shuffled around on disk. But we want this "workflows" concept to apply to other situations.
13:12 donsizemore just trying to find some helpful way to respond. will send #3561 to akio and jon.
13:13 donsizemore p.s. i snagged the last room at the Friendly Inn. Thu-Mai and Jon are staying at Irving House. food and drink recommendations welcome
13:13 cdsp-rmo it seems travis failed, but because my change works
13:14 cdsp-rmo :)
13:14 cdsp-rmo Failed tests:   testDatasetJson2dublincore(edu.harvard.iq.dataverse.export.dublincore.DublinCoreExportUtilTest): expected:<...<dcterms:identifier>[doi:]10.5072/FK2/PCA2E3</...> but was:<...<dcterms:identifier>[http://dx.doi.org/]10.5072/FK2/PCA2E3</...>
13:16 pdurbin donsizemore: no need to respond. I just wanted to put that issue on your radar. There's even code ready to review. Please see the comment I just left at https://github.com/IQSS/dataverse/issues/3561#issuecomment-301778552
13:17 pdurbin cdsp-rmo: I guess you should change the test to make it pass.
13:17 cdsp-rmo yup
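The two identifier forms in the failed assertion differ only in their prefix. As a rough illustration (in Python, not the Java code in the pull request, and with made-up function and constant names), the conversion from one form to the other might look like this:

```python
DOI_PREFIX = "doi:"
RESOLVER_PREFIX = "http://dx.doi.org/"


def to_resolver_url(identifier):
    """Turn 'doi:10.x/y' into 'http://dx.doi.org/10.x/y'; leave anything else alone."""
    if identifier.startswith(DOI_PREFIX):
        return RESOLVER_PREFIX + identifier[len(DOI_PREFIX):]
    return identifier


# The identifier from the test output above.
print(to_resolver_url("doi:10.5072/FK2/PCA2E3"))
```

A test that still expects the "doi:" form would fail against output like this, which matches what Travis reported.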
13:30 cdsp-rmo everything seems fine
13:30 cdsp-rmo I go back to work, still available if needed ;)
13:40 pdurbin cdsp-rmo: thanks! I just moved https://github.com/IQSS/dataverse/issues/3809 to code review at https://waffle.io/IQSS/dataverse
13:50 djbrooke joined #dataverse
14:19 pameyer joined #dataverse
14:20 andrewSC joined #dataverse
14:24 pameyer donsizemore: that sounds more to me like a post-ingest slowdown; maybe post-solr as well
14:24 pdurbin yeah
14:25 pameyer we've seen some slowdowns in publishing with large numbers of files (during development/prototyping); seemed like these had a significant postgres bottleneck
14:26 pameyer if I'm remembering right (always questionable); time complexity for publishing scales roughly with (number of files) * (number of dataset users)
14:26 pdurbin hmm, I wonder if it's the indexing of permissions per user. would need to profile it
14:28 pameyer I *think* it was ~ for each file, notify each user "owning" the file
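If the recollection above is right, the work grows as (number of files) * (number of users). A toy sketch of that nested loop follows; it is not Dataverse code, and the user count is a hypothetical number chosen only for illustration.

```python
def count_notifications(num_files, num_users):
    """Count one notification per (file, user) pair, as a nested loop would."""
    count = 0
    for _file in range(num_files):
        for _user in range(num_users):
            count += 1  # stands in for one notification or database write
    return count


# 224 files as in the dataset discussed above; 5 users is an assumption.
print(count_notifications(224, 5))
```

Doubling either the file count or the user count doubles the total, which is why a dataset with many files and several owners could be noticeably slower to publish.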
14:28 pdurbin pameyer: "Out of curiosity, what's the mechanism OSF uses for uploading datasets that are 1.5 TB in size?" -- me at https://groups.google.com/d/msg/openscienceframework/VgHCeUecSuc/5glX8iuzAgAJ
14:29 pameyer interesting - not something I know the answer to
14:29 pdurbin maybe it's easy
14:30 andrewSC joined #dataverse
14:31 djbrooke joined #dataverse
14:35 donsizemore joined #dataverse
14:37 donsizemore @pdurbin @pameyer here's the dataset in question, in case you'd like to do your own (smarter) testing: https://dataverse.unc.edu/dataset.xhtml?persistentId=doi%3A10.15139%2FS3%2F0CS5ZC
14:37 donsizemore @pdurbin @pameyer there are no unpublished files
14:37 pameyer @donsizemore: thanks
14:59 djbrooke joined #dataverse
15:01 djbrooke_ joined #dataverse
15:01 djbrooke_ joined #dataverse
15:02 djbrooke joined #dataverse
15:32 donsizemore joined #dataverse
15:33 djbrooke joined #dataverse
15:33 cdsp-rmo hum
15:34 cdsp-rmo @pdurbin ?
15:34 cdsp-rmo I'm trying to figure out if it is an intended design pattern or not, so I am asking you
15:34 cdsp-rmo https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/29604
15:34 cdsp-rmo for example, on this dataset, we can see .tab files
15:34 cdsp-rmo when you select the "Download" button, there is a drop-down menu available
15:35 cdsp-rmo the thing is, the menu is IN the files div, and is extending the lower part of the files list, doing some strange moves when you want to select the right option
15:36 cdsp-rmo (dunno if I'm clear, I am having troubles myself to express it)
15:39 cdsp-rmo file:///home/rmo/Pictures/Screenshot%20from%202017-05-16%2017-38-39.png
15:39 cdsp-rmo woops, bad move
15:40 djbrooke joined #dataverse
15:40 cdsp-rmo http://hpics.li/c38a49c
15:40 cdsp-rmo better
15:50 djbrooke joined #dataverse
15:59 pdurbin cdsp-rmo: strange moves?
15:59 cdsp-rmo go on the given dataset
15:59 cdsp-rmo I linked
15:59 cdsp-rmo then do the same thing I am doing
15:59 cdsp-rmo download => Datafile citation
15:59 cdsp-rmo (the last option)
16:00 cdsp-rmo don't you feel something weird ?
16:01 pdurbin cdsp-rmo: it looks ok to me: http://i.imgur.com/f0Un8Y5.png
16:01 cdsp-rmo and if you do it with the last element ?
16:01 pdurbin cdsp-rmo: oh! Yes, the last element is a hot mess.
16:01 cdsp-rmo the menu is inside this div, instead of going on top (to me)
16:02 cdsp-rmo just a graphic detail
16:08 pdurbin cdsp-rmo: it's a usability issue. I created an issue. Please let me know what you think: https://github.com/IQSS/dataverse/issues/3835
16:09 cdsp-rmo yup, that's it
16:09 cdsp-rmo perfect
16:10 pdurbin cdsp-rmo: if you really need that file you should be able to get it from the API
16:10 pdurbin in various formats
16:10 pdurbin with the "format" parameter: http://guides.dataverse.org/en/4.6.1/api/dataaccess.html
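A minimal sketch of building such a request URL follows. The /api/access/datafile/ path and the "original" value are taken from memory of the Data Access API guide linked above and should be checked against it; the file ID 12345 and the helper name are hypothetical.

```python
from urllib.parse import urlencode


def datafile_url(base_url, file_id, fmt=None):
    """Build a Data Access API download URL, optionally with a format parameter."""
    url = f"{base_url.rstrip('/')}/api/access/datafile/{file_id}"
    if fmt is not None:
        url += "?" + urlencode({"format": fmt})
    return url


# 12345 is a placeholder file ID, not a real file in the dataset above.
print(datafile_url("https://dataverse.harvard.edu", 12345, "original"))
```

The resulting URL can be fetched with curl or a browser, which sidesteps the drop-down menu entirely.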
16:11 cdsp-rmo no, I don't need it
16:11 cdsp-rmo actually, we encountered the thing in our dataverse
16:11 cdsp-rmo but given that the dataset wasn't published, I couldn't show it for the example
16:11 cdsp-rmo so I picked a random of yours
16:13 pdurbin thanks, makes sense
16:16 pdurbin pameyer: I got a private reply to the 1.5 TB upload thing
16:17 pdurbin pameyer: basically, it's "use S3"
17:04 pameyer it's an answer
17:04 pdurbin donsizemore: so you don't have a workaround, right? You can't publish that dataset?
17:10 donsizemore @pdurbin it published; it just took a long time and the "publish" button didn't disappear (or reappeared when it shouldn't have(?))
17:11 donsizemore akio has it down to a session timeout, which is strange because the timeout occurred 45 minutes in, Dataverse defaults to 8 hours, and Shib defaults to... greater than 45 minutes.
17:12 pdurbin donsizemore: ok, if things are not right, please email support@dataverse.org
17:12 donsizemore @pdurbin i think we're fine, i'm just filing this as a snafu
17:13 pdurbin phew
17:26 axfelix joined #dataverse
17:45 djbrooke joined #dataverse
18:15 djbrooke joined #dataverse
18:26 axfelix joined #dataverse
18:46 axfelix joined #dataverse
18:58 djbrooke joined #dataverse
19:15 djbrooke joined #dataverse
19:22 djbrooke joined #dataverse
19:39 pameyer joined #dataverse
20:03 pameyer joined #dataverse
20:13 djbrooke joined #dataverse
20:31 djbrooke joined #dataverse
20:42 pameyer joined #dataverse
20:43 axfelix joined #dataverse
21:06 djbrooke joined #dataverse
21:15 pameyer joined #dataverse
21:18 djbrooke joined #dataverse
21:27 djbrooke joined #dataverse
21:57 axfelix joined #dataverse
22:03 djbrooke joined #dataverse