IRC log for #dataverse, 2018-05-17

Connect to discuss Dataverse (an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
01:06 jri joined #dataverse
04:06 jri joined #dataverse
06:57 jri joined #dataverse
12:03 jri joined #dataverse
12:18 jri joined #dataverse
12:27 donsizemore joined #dataverse
13:28 andrewSC joined #dataverse
14:14 pameyer joined #dataverse
14:19 donsizemore @pameyer morning sir =)
14:19 Venki joined #dataverse
14:19 pameyer @donsizemore morning
14:19 pameyer @donsizemore how are file descriptors treating you?
14:20 donsizemore SUCCESS; 54 harvested, 0 deleted, 0 failed.  [dls@irss-dvn4test ~]$ sudo ls -al /proc/13441/fd |grep -c export_ 54
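
The check pasted above (listing /proc/13441/fd and counting entries that mention "export_") could be sketched in Python roughly as follows. This is only an illustration, assuming a Linux-style /proc layout; the function name and the `proc_root` parameter are hypothetical and not part of Dataverse or Glassfish.

```python
import os


def count_matching_fds(pid, needle, proc_root="/proc"):
    """Count open file descriptors of `pid` whose link target contains `needle`.

    `proc_root` is a hypothetical parameter added so the function can be
    pointed at a directory other than the real /proc.
    """
    fd_dir = os.path.join(proc_root, str(pid), "fd")
    count = 0
    for name in os.listdir(fd_dir):
        try:
            # Each entry in fd/ is a symlink to the open file, socket, or pipe.
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if needle in target:
            count += 1
    return count
```

Calling `count_matching_fds(13441, "export_")` as a user allowed to read that process's fd directory should give the same kind of number as the `ls | grep -c` pipeline, which makes it easy to poll the count over time while waiting to see whether GC releases the descriptors.
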
14:20 Venki @pameyer and @pdurbin good morning
14:20 donsizemore @pameyer we're on akio's doubly-patched warfile; GC hasn't cleared them yet
14:20 pameyer @Venki - good morning
14:21 pameyer @donsizemore this is after a recentish reboot, right?
14:21 Venki I need help on the metadata block upload... I did upload a metadata block with chinese characters and also some controlled vocabulary
14:21 donsizemore @pameyer well, after i stupidly stopped glassfish to apply a patch which didn't require that i stop glassfish
14:21 Venki In the UI when I click the dropdown box for the controlled vocabulary I get only the first value
14:22 pameyer @donsizemore - yeah, we had some high-priority updates to handle too.  I think kcondon may be taking a look at the 2nd patch (not sure single-patched or double-patch) today
14:23 donsizemore @pameyer i'm not worried about it, just would be nice to track down
14:23 pameyer @Venki - were these chinese characters ones that required changing the character set?
14:23 pameyer @donsizemore I agree about tracking it down.  I was hoping I'd get the time to do a global cleanup of all the ~89 "resources to close"; but with other stuff going on that hasn't happened yet
14:24 donsizemore @pameyer i suspect sonarqube sounded some false alarms... but akio was pretty confident about this second one
14:24 Venki Sorry you mean the dropdown box value?
14:25 pameyer @donsizemore I know that some of the sonarqube alarms are definitely false - but I also suspect memory leaks in the app under various conditions
14:25 pameyer @Venki I'd meant the text encoding for the TSV file for the custom metadata block with the chinese characters
14:28 Venki We converted the excel file using Google Sheets to TSV file with encoding because Office saved them as garbage
14:28 Venki Did I answer your question...if not I am sorry
14:29 Venki Anyway to share the file or screenshot...
14:30 pameyer @Venki - if you could post a link to that TSV, that might be helpful.  I'd been wondering about UTF-8 vs UTF-16 vs unicode, and if that could be causing problems
14:30 pameyer it's not really my area of expertise
14:31 pdurbin Venki: hi! A screenshot would be great! Could you please open a GitHub issue?
14:31 Venki Hi Phil
14:32 Venki I am not sure whether I am doing the right thing; I would like to confirm before I open a GitHub issue
14:33 pdurbin Venki: no problem. I often upload screenshots to so you could try that.
14:34 pdurbin pameyer: did you see ? Seems like this has potential to affect you.
14:34 pameyer pdurbin: I did see it, but thanks for pointing it out
14:34 pdurbin donsizemore: thanks to you and Akio for all the work on file descriptors.
14:35 pameyer I thought that emails leaking was something that was in the ITs; but maybe not all possible places
14:35 Venki
14:37 Venki
14:37 Venki
14:38 Venki The screenshot along with the links to the excel file and the TSV file
14:40 pdurbin Venki: thanks! I'm looking. So for "attributionFunction" you're expecting to see 17 values in the drop down but you're only seeing two. And one of them is a question mark rather than a character.
14:41 Venki yes
14:41 Venki That's right...
14:41 Venki That's the same for all the dropdown boxes (there are 5)
14:41 pdurbin Venki: can you please email this screenshot and the excel file to
14:41 Venki Only first value is shown
14:41 Venki Ok will do now
14:44 Venki Done. Thanks for the help @pameyer and @pdurbin
14:44 pdurbin Venki: thanks! Did you know that Sonia and I were in your part of the world last week? :)
14:45 Venki Yeah I saw the photos posted in chat yesterday... thought would catch up with you when I come for User group meeting...
14:46 Venki Hope you had good time... Next time come over to Singapore...
14:46 pdurbin Venki: oh, it's great that you're coming!
14:46 pameyer Venki: no problem.  btw - tresorit wasn't happy with my browser.  but the question marks in the screenshot make me think that text encoding might be the cause after all
14:47 pameyer the default metadata blocks are ASCII or UTF8, and encoding is a place I've had things go wrong before
14:47 pdurbin pameyer: I had the same thought. I wonder if it would work with ASCII values. It would be a bug if it doesn't work with the characters Venki is using.
14:48 Venki @pameyer google spreadsheet exports CSV file as UTF8 encoded but I am not sure about tab delimited.. I am checking google now
14:48 Venki @pameyer the curl output seems to be ok..
14:50 pameyer @Venki - it's always possible my suspicions are incorrect (aka - curl output looking correct)
14:52 Venki @pameyer please check out the screenshot
14:52 Venki
14:55 pameyer @Venki - that does look like what I'd expect.  and from a quick double-check, dataverse dataset pages are using UTF-8 encoding to browsers
14:56 Venki @pameyer so how can I convert the excel file to UTF-8 TSV file
14:56 Venki Any suggestions please
14:58 pameyer I'm not sure about excel.  but one way you could check (from a unix system) would be to run `file Daoist-Changed.tsv`; that should report the encoding
14:58 pameyer if that's correct, the problem is definitely elsewhere
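
For anyone without the `file` command to hand, a rough equivalent of the encoding check suggested above can be sketched in Python. This is a heuristic under stated assumptions, not what `file` actually does: it only looks for a byte order mark and then tries a strict UTF-8 decode. The function name is hypothetical.

```python
import codecs


def describe_encoding(data: bytes) -> str:
    """Give a coarse description of how `data` appears to be encoded."""
    # A leading byte order mark is the cheapest signal to check first.
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8 (with BOM)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16 (with BOM)"
    try:
        # Strict decoding fails on byte sequences that are not valid UTF-8.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    # Pure ASCII is also valid UTF-8, so report it separately.
    return "ASCII" if text.isascii() else "UTF-8"
```

Reading the TSV with `open("Daoist-Changed.tsv", "rb").read()` and passing the bytes to `describe_encoding` would show whether the file at least decodes cleanly as UTF-8 before it is loaded as a metadata block.
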
15:00 pdurbin Venki: for me it looks different. Rather than two in the drop down I get one and it says "_(Compiled". Also, I see that you're changing the title in the citation block. I didn't think this was possible from a custom metadata block:
15:01 pdurbin pameyer: if it helps, this is what I see when I run `file venki.tsv`, which I exported from Excel: venki.tsv: Non-ISO extended-ASCII English text, with CR line terminators
15:02 pameyer pdurbin: that does help
15:05 pameyer I'm not sure of a good way to get it to UTF-8 (maybe UTF-16?), but it's consistent with the idea of that being the source of the problem
15:05 Venki @pameyer this is what I get when I run the command in Linux Daoist-Changed.tsv: assembler source, UTF-8 Unicode text, with CRLF line terminators
15:05 pdurbin I don't know. I don't see any export options in Excel.
15:06 pameyer @Venki - great, I'm wrong
15:06 Venki My suspicion is - something to do with the controlled vocabulary... Why does it stop with the first value which is 0
15:07 pameyer *might* be worth double-checking the postgres encoding config, but that's less likely (assuming the API call is returning values from the db)
15:07 pameyer stopping at the first value would suggest something being interpreted as a end of line character
15:07 pameyer ah - CRLF line terminators
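
The line terminator theory raised here is only a guess in this conversation, but it is cheap to test. The sketch below counts each terminator style in a file and rewrites everything to plain LF; it assumes the file is ASCII or UTF-8 (the byte-level replace would not be safe for UTF-16), and both function names are hypothetical.

```python
def count_line_terminators(data: bytes) -> dict:
    """Count CRLF, bare CR, and bare LF line terminators in `data`."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        # Every CRLF pair contains one CR and one LF, so subtract the pairs.
        "CR": data.count(b"\r") - crlf,
        "LF": data.count(b"\n") - crlf,
    }


def normalize_to_lf(data: bytes) -> bytes:
    """Rewrite CRLF and bare CR terminators as LF."""
    # Replace CRLF first so the CR half of a pair is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Writing the result of `normalize_to_lf` to a new file and reloading that copy of the metadata block would show whether the terminators matter at all for the controlled vocabulary dropdowns.
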
15:08 Venki @pdurbin sorry I didn't understand your question about the title field
15:08 Venki we are in fact using a new field called textTitle
15:08 pdurbin Venki: it would be best if you don't modify the citation metadata block. I see you added "Full title in Pinyin Romanization"
15:08 pameyer @Venki - good catch
15:09 pdurbin but the surprising thing is that you're able to modify the citation metadata block without touching citation.tsv. I feel like this is a bug. It shouldn't be allowed.
15:10 pdurbin don't give pameyer any ideas :)
15:12 Venki @pdurbin interestingly when I did a test with another file I use author as one of the fields in my custom metadata block and it showed up in the new metadata block
15:12 pameyer pdurbin: I've been down that path, and learned my lesson
15:14 Venki
15:15 Venki Now the author field is not shown in the citation metadata block in the dataverse page
15:15 Venki And it shows only in the custom metadata block that I created....
15:16 Venki So I think I have messed up the citation metadata
15:16 Venki I need to use the citation.tsv file to replace it....
15:16 Venki Will that help?
15:17 Venki Ok Guys...thank you for the help...its 11:20PM here and I am gonna hit my bed...catch up with you all tomorrow. Have a great day.
15:48 pdurbin pameyer: good
16:49 dataverse-user joined #dataverse
16:52 jri joined #dataverse
16:58 jri joined #dataverse
17:36 Thalia_UM joined #dataverse
17:36 Thalia_UM I need your help
17:36 Thalia_UM Hi Philip
17:41 pdurbin Thalia_UM: hi! What's up?
17:43 Thalia_UM Right now I had a meeting about the repository we are using in dataverse and they changed the metadata schema request from Dublin Core to Datacite.
17:44 Thalia_UM On the platform you can export dublin core, ddi and JSON. How can I attach the other metadata schema?
17:53 pdurbin Thalia_UM: is what you want?
17:54 Thalia_UM yes
17:58 pdurbin Thalia_UM: can you please leave a comment on that issue?
18:00 Thalia_UM Do you know how to implement it?
18:01 jri_ joined #dataverse
18:01 pdurbin Thalia_UM: I think it has been at least partially implemented in this recent pull request:
18:02 pdurbin It looks like the dropdown would show "DataCite OpenAIRE".
18:05 Thalia_UM Yes
18:06 Thalia_UM That is what the institution is asking us
18:07 Thalia_UM but how much does one get involved, do I have to modify it from the database or the project?
18:07 pdurbin Thalia_UM: you could do code review of that pull request. That would be a great way to get involved.
18:08 Thalia_UM thank you philip
18:08 Thalia_UM Really thank you
18:08 Thalia_UM :D
18:21 pdurbin you're welcome :)
19:00 donsizemore joined #dataverse
20:07 dataverse-user joined #dataverse
20:29 pdurbin pameyer: `conf/docker-aio/ http://localhost:8080` worked great. Thanks. See
20:30 pameyer pdurbin: great! glad to hear it
20:31 pameyer also glad to hear about testing on more branches
20:31 pdurbin yeah
20:31 pdurbin I almost brought up your docker all in one
20:31 pdurbin but I think it's better to have centralized reporting in Jenkins
20:33 pameyer if everybody's always looking in one place, that does make more sense
20:34 pameyer btw - landreev pointed out to me that docker-aio had an issue with the jhove schema in setupIT.  Since I borrowed that from your phoenix setup, it might be a problem there too
20:35 pdurbin pameyer: by the way, the ticket Venki opened about his custom metadata block is . I know you can't see it but I thought I'd drop it here so it appears in the logs for completeness in case someone finds their way here from a google search or whatever.
20:36 pameyer pdurbin: huh - I'd thought he figured it out
20:36 pdurbin pameyer: yes, I saw your jhove commit at . Interesting. I'm not sure why phoenix seems ok without it. This may have been something I did manually on the phoenix server.
20:37 pdurbin no, Venki is still blocked
20:37 pameyer "CRLF line terminators"
20:37 pameyer the jhove schema didn't seem to break anything, but it was filling up the glassfish logs, unhappy with being unable to find it
20:38 pdurbin gotcha
20:39 pameyer I might be wrong about CRLF (I was wrong about UTF8 ;)
20:39 pdurbin What we need is a more sane way to get these custom metadata blocks into Dataverse.
20:40 pdurbin People are getting braver and actually trying these days. :)
20:40 pameyer last time refactoring came up, I put in an omnibus "let's redo the metadata system" issue
20:41 pdurbin ah, you want harmony:
20:42 pameyer using table oriented data structures for hierarchical data always seemed unnatural to me
20:43 pameyer when I was taking a quick look at 4683, I figured out how to hook a debugger up to docker-aio
20:43 pdurbin cool
20:43 pdurbin maybe you could update the readme
20:44 pameyer yup
20:44 pameyer but if anybody shows up with a burning interest in windows dev, that *might* be the last piece you need
20:45 pdurbin I'm not holding my breath, but I did see a lot of Windows in Indonesia. Oh, I posted this a few hours ago if anyone is interested:
20:46 pdurbin I wrote it quickly so please let me know if you spot any typos. :)
20:47 pameyer very cool - no typos jumped out
20:48 pdurbin phew
20:48 pdurbin thanks for looking
20:48 pameyer did I miss the slideshow?
20:48 pameyer or has that not happened yet?
20:48 pdurbin no, no. hasn't happened yet. maybe next week (not monday)
21:42 pameyer left #dataverse
21:45 Venki joined #dataverse
22:07 pdurbin Venki: you're back! But now I'm bringing my kids to the park for a picnic dinner. :)
22:13 Venki pdurbin: Ha Ha have fun
23:23 pdurbin joined #dataverse
