Time
S
Nick
Message
08:40
jri joined #dataverse
10:22
sivoais joined #dataverse
10:33
stefankasberger joined #dataverse
11:12
Youssef_Ouahalou joined #dataverse
11:15
Youssef_Ouahalou
Hello everyone, when we add a new metadata, how do we make it appear in ddi output?
11:51
bro joined #dataverse
11:56
pdurbin
Youssef_Ouahalou: well, it should appear in the native JSON output. Does it?
11:58
Youssef_Ouahalou
Yes it appears in json
12:01
pdurbin
Good. The native JSON output is flexible in that all the values are shown. All the other formats, including DDI, are hard coded. Does the field you're adding map to DDI cleanly? Is there a place for it in DDI?
12:04
Youssef_Ouahalou
Can I add it manually?
12:07
pdurbin
You would have to fork the code and edit some Java. Could this field be used by other installations of Dataverse? Or is it somehow specific to your institution? I'm wondering if making a pull request makes sense, to bring the feature to everyone.
12:08
MrK joined #dataverse
12:10
Youssef_Ouahalou
I think it is specific to our institution, it would be interesting if it were more flexible
12:13
Guest33048
Hi everyone! I have a question about the controlled vocabularies and their translations. I've managed to translate some of my controlled vocabularies by doing as said in the doc (http://guides.dataverse.org/en/latest/admin/metadatacustomization.html?highlight=properties%20underscore) but I couldn't manage to make it work for values that have numbers or special characters. So I wondered if I could use the identifier from the controlled vocabulary part
12:15
pdurbin
Guest33048: that's... strange. Numbers and special characters should work, I think. Can you share your TSV file?
12:15
pdurbin
Youssef_Ouahalou: are you saying it would be interesting if DDI is more flexible?
12:16
Youssef_Ouahalou
To enter new ddi fields
12:20
Guest33048
pdurbin: oh, then I guess I did something wrong (I've made many tries for different problems so I think I must have made a mistake). I'll try again. The only things to convert are spaces and uppercase, right? So, for example, "Word 1: word2" should become "word_1:_word2"?
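A minimal sketch of the conversion Guest33048 describes, assuming the only transformations are lowercasing and replacing spaces with underscores, as stated in the question. Both function names are hypothetical. The second helper reflects a general property of Java `.properties` files (an unescaped `:` or `=` ends the key); whether Dataverse expects the colon in the key at all is not confirmed in this conversation.

```python
def to_property_suffix(value: str) -> str:
    """Lowercase the vocabulary value and replace each space with an underscore."""
    return value.lower().replace(" ", "_")


def escape_properties_key(key: str) -> str:
    """Backslash-escape characters that would otherwise end a .properties key."""
    # Escape the backslash first so the escapes added below are not doubled.
    for ch in ("\\", ":", "="):
        key = key.replace(ch, "\\" + ch)
    return key


suffix = to_property_suffix("Word 1: word2")
print(suffix)                         # word_1:_word2
print(escape_properties_key(suffix))  # word_1\:_word2
```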
12:20
pdurbin
Right but DDI is a standard (unlike Dataverse's native JSON format). So you have to map your institution-specific field to the standard somehow, right? You can't just invent a DDI field, I think. That said, there was a recent thread on the mailing list about some upcoming flexibility in DDI. Let me go find it, Youssef_Ouahalou
12:22
pdurbin
Guest33048: I'm sorry but without seeing your TSV I'm having a little trouble picturing in my head what it looks like. But it's also early and I just had my first sip of coffee. :)
12:25
pdurbin
Guest33048: you'd be very welcome to create a GitHub issue and attach your TSV to it. You might need to add .txt to it in order to attach it: https://github.com/IQSS/dataverse/issues
12:26
Guest33048
pdurbin: I've put an extra underscore in my property file. Sorry for the inconvenience! However can you confirm that the identifier in the TSV cannot be used as a property name?
12:27
pdurbin
Youssef_Ouahalou: this is the thread I'm thinking about but it has more to do with supporting more scientific disciplines than supporting fields that are specific to an institution: https://groups.google.com/d/msg/dataverse-community/uKretKox_io/VTZsb49MAQAJ ("custom metadata blocks now easier to spin up and evaluate") ... Guest33048 you may be interested in that thread too.
12:29
pdurbin
Guest33048: when I'm at my desk it would be easier to confirm. Can you please email support@dataverse.org so that I or someone else can get back to you about this?
12:31
Guest33048
pdurbin: I'll do that. Thank you for everything, and for the thread that I bookmarked :)
12:31
Youssef_Ouahalou
Ok, thank you very much, I will read it. Thank you for your answers
12:32
pdurbin
Guest33048: thanks! If you have any questions about that thread, please let me know! There's a lot to unpack. :)
12:32
pdurbin
Youssef_Ouahalou: perfect, thanks. Also, I think your boss emailed me. I'm reading it now. :)
12:34
Youssef_Ouahalou
Hahhaha yes I think ☺
12:40
pdurbin
Youssef_Ouahalou: I just found this: https://fosdem.org/2020/schedule/track/hpc_big_data_and_data_science/
12:44
Youssef_Ouahalou
Is this about your presentation?
12:50
pdurbin
Well, the lightning talks still haven't been announced. So I still don't know if my talk has been accepted. I'm just poking around the FOSDEM site trying to figure out where it might be good to spend some time. There's also a room for Java and a room for PostgreSQL.
12:52
Youssef_Ouahalou
yes I also saw that, it could be interesting. Hoping that your speech will be accepted
12:52
poikilotherm joined #dataverse
12:55
poikilotherm
Mornin' guys. What's up?
12:56
MrK
Hi
12:56
Youssef_Ouahalou
Morning nice and you
13:01
Benjamin_Peuch joined #dataverse
13:03
Benjamin_Peuch joined #dataverse
13:04
Benjamin_Peuch
boop
13:04
Benjamin_Peuch
Oh this works now. Hello everybody, this is Ben from the State Archives of Belgium
13:05
Benjamin_Peuch
Hello Philip. I'm Youssef's coworker :)
13:06
poikilotherm
Welcome Benjamin_Peuch
13:09
Benjamin_Peuch
Thanks, poikilotherm. We are glad that we finally got down to setting up a Dataverse and seeing how far we can adapt it to our needs
13:10
Benjamin_Peuch
We are planning to launch it in a couple of months. Youssef has worked hard to this end, and I'm told he got a lot of support from you all
13:20
pdurbin
Benjamin_Peuch: hi! I got your email!
13:21
pdurbin
Yes, a lot of us in here are helping Youssef_Ouahalou and others get their installations of Dataverse launched. :)
13:22
pdurbin
stefankasberger: hey, did you get my "mysterious message from Vienna" email? :)
13:23
Benjamin_Peuch
We greatly appreciate the support, pdurbin. :)
13:24
Benjamin_Peuch
Youssef told me about the exchange you just had about adding extra fields in Dataverse.
13:24
Benjamin_Peuch
We do plan to do this properly, so the output in DDI would absolutely be compliant with DDI-Codebook 2.5.
13:25
Benjamin_Peuch
I must say that some of those extra fields we want to add are meant to transfer to another XML language, Encoded Archival Description (EAD), for archival purposes.
13:26
Benjamin_Peuch
Still, we have identified which DDI elements we can use as intermediary vessels to this end. They would simply be <notes> elements in <stdyDscr>'s <citation>.
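To illustrate the placement Benjamin_Peuch describes, here is a rough sketch that builds a <citation> under <stdyDscr> and appends <notes> elements to it. It assumes Python's standard `xml.etree.ElementTree`, omits the DDI namespace, and does no schema validation. The function name, the `type` value, and the note text are hypothetical placeholders, not the State Archives' actual mapping.

```python
import xml.etree.ElementTree as ET
from typing import Mapping


def build_citation_with_notes(title: str, notes: Mapping[str, str]) -> ET.Element:
    """Return a bare codeBook tree with one <notes> per entry inside <citation>."""
    code_book = ET.Element("codeBook")
    stdy_dscr = ET.SubElement(code_book, "stdyDscr")
    citation = ET.SubElement(stdy_dscr, "citation")
    titl_stmt = ET.SubElement(citation, "titlStmt")
    ET.SubElement(titl_stmt, "titl").text = title
    # Each institution-specific value travels in its own <notes> element,
    # labelled through the "type" attribute (hypothetical labels here).
    for note_type, text in notes.items():
        note = ET.SubElement(citation, "notes", {"type": note_type})
        note.text = text
    return code_book


doc = build_citation_with_notes(
    "Example study",
    {"custodialHistory": "Hypothetical archival note destined for EAD"},
)
print(ET.tostring(doc, encoding="unicode"))
```

A downstream crosswalk to EAD could then select the notes by their `type` value.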
13:26
pdurbin
Benjamin_Peuch: I'm flying back at 6am on that Monday. Friday *might* work. I'm arriving from Lisbon/PIDapalooza at 6:15am on that Friday. I wonder if it would make sense to host a "fringe" event about Dataverse at your institution: https://fosdem.org/2020/fringe/ . I need to spend more time looking at my calendar though. :)
13:26
Benjamin_Peuch
That would be wonderful! We would make sure to prepare a lot of coffee. :)
13:27
Benjamin_Peuch
We know of Dataverse users in France, Austria (hello stefankasberger!) and the Netherlands. They might be interested, especially if they plan to attend FOSDEM'20?
13:27
pdurbin
My hotel is only a 29 minute walk from your institution (though I'd probably take public transportation). :)
13:27
Benjamin_Peuch
Handy!
13:28
pdurbin
I've been waiting to start a thread on the dataverse-community list about FOSDEM until I know if my lightning talk has been accepted or not. A lot of the rest of the schedule is up already. Longer talks, dev rooms, etc.
13:28
Benjamin_Peuch
I should also mention we took good note of the advice in DV's manual, and we do plan to publish our metadata input.
13:29
pdurbin
MrK: so this is another potential meetup in Europe. Perhaps. If we can pull it together. :)
13:29
Benjamin_Peuch
Yes... Fingers crossed. I really hope you got to make this presentation.
13:30
Benjamin_Peuch
*get
13:30
pdurbin
Benjamin_Peuch: the thing I'm trying to figure out about your custom metadata fields is this... Can they be used by other installations of Dataverse? If so, we should work toward a pull request so everyone can benefit.
13:30
Benjamin_Peuch
That is also a thing indeed, pdurbin.
13:31
pdurbin
Otherwise, I fear you'd be forced to run a fork.
13:31
Benjamin_Peuch
We had quite a few modifications for Dataverse in the gears until we realized that it had not been developed so as to be very heavily customized, especially regarding the core metadata (Citation and Terms essentially, I would say).
13:32
Benjamin_Peuch
I must admit we interpreted the notion of open source a bit too freely. Indeed that was also my conclusion after you voiced concerns about departing too much from the original software (and our head of IT developments was of the same mind): it would pretty much amount to forking.
13:32
pdurbin
Ok. Lots of people have forked Dataverse. You aren't alone. But ideally, we would merge changes into upstream, if we can.
13:34
Benjamin_Peuch
Yes, that was also our plan. We thought we would take another approach and publish information about our needs (as archivists, data specialists) and that of our users (we only have a few at the moment, but they're qualified social scientists) so that it's open for discussion, and we can see from here what can be integrated in DV in the short, medium or long term.
13:34
Benjamin_Peuch
I know Youssef already opened a few issues on your GitHub and I don't want to spam you, so I'm going to look into it and synthesize our points.
13:35
stefankasberger
@pdurbin: Yes. It was Lars Kaczmirek, CEO of AUSSDA (my boss).
13:35
pdurbin
stefankasberger: oh! I never got a follow up email from him. Thanks!
13:36
pdurbin
Benjamin_Peuch: honestly, it's often easier to get quick answers here than wading through old GitHub issues. But you're welcome to go wading. :)
13:37
Benjamin_Peuch
Oh okay. Does that include saying "Hey, we want this! Implement that!"? :p
13:38
pdurbin
Benjamin_Peuch: something else I'd like to put on your and Youssef_Ouahalou's radar is that DANS (I think) is transforming Dataverse's native JSON to various XML using a crosswalk outside of Dataverse. It's on GitHub somewhere.
13:38
stefankasberger
Yes, cause I will get in touch with you, once i have time. Would it be possible to have a short call tomorrow to talk about some things?
13:38
poikilotherm
pdurbin will most likely say: please open an issue for that @Benjamin_Peuch
13:38
pdurbin
stefankasberger: tomorrow is looking quite open for me. After that I'm on holiday until Jan 2.
13:39
Benjamin_Peuch
Oh that sounds very interesting. Thanks pdurbin. I assume they might be doing this with the DataverseEU project.
13:39
pdurbin
Oh, and we have Zoom at Harvard now, which is nice. I hosted my first Zoom with the new maintainer of dataverse-client-r on Tuesday.
13:39
stefankasberger
Me the same. I will get in touch with you, okay?
13:41
pdurbin
stefankasberger: sure, let's pick a time.
13:41
pdurbin
Benjamin_Peuch: I'm having trouble finding the crosswalk repo :(
13:41
stefankasberger
11/12h CET?
13:42
donsizemore joined #dataverse
13:42
pdurbin
If memory serves, Slava was transforming Dataverse's native JSON to an XML format used in their in house system called EASY.
13:42
Benjamin_Peuch
pdurbin: That's okay. I'm in touch with Marion and Slava. I can ask them directly. Thanks. :)
13:42
pdurbin
Benjamin_Peuch: great. If you find the repo, please open an issue so we can add it to the guides.
13:50
Benjamin_Peuch
Will do.
13:58
Benjamin_Peuch left #dataverse
14:02
donsizemore
@pdurbin may i pester you about the python3 installer when you have a minute?
14:10
poikilotherm
donsizemore: could you please just use ansible for that and bundle it?
14:12
MrK
pdurbin: what kind of meeting :P?
14:12
poikilotherm
donsizemore and pdurbin: or maybe use sth like https://github.com/jordansissel/fpm to create real packages, with ansible behind the scenes?
14:13
poikilotherm
That's how gitlab does it: they provide packages with a bundled chef installer doing all the hard work for them
14:31
stefankasberger joined #dataverse
14:50
pdurbin
Wow, -5F (-21C) with wind chill. Glad I wore snow pants to bike in. I hear in Australia they have record heat.
14:51
pdurbin
donsizemore: please pester away
14:51
pdurbin
MrK: nothing concrete yet. :)
14:59
donsizemore
@pdurbin so, one of the long-running criticisms of the ansible role is that it's not idempotent, because it's a wrapper for the dataverse installer, which itself is not idempotent
15:00
donsizemore
@pdurbin i can remove some portions of the ansible role to allow the installer to do it all (fine) but that's a step backwards from an idempotence perspective
15:01
pdurbin
yikes
15:01
pdurbin
So how do we move forward instead of backward? :)
15:02
donsizemore
regarding the installer, one first step might be the addition of a --no-db flag (in addition to the --db-only flag)
15:03
donsizemore
under perl (and now python) the installer insists on creating the database as an admin user and creating the tables up front
15:04
pdurbin
Well, aren't the tables created by the deployment of the war file? I'm agreeing with you but trying to get more specific.
15:04
donsizemore
this would begin to pave the way for normalizing the upgrade process
15:04
donsizemore
the perl installer crabs "a database exists! i'll only install onto a squeaky-clean postgres" or something similar
15:05
donsizemore
so a --no-db flag would relieve the first such blocker to repeated runs. (or i can just let ansible call the script directly, but thought i'd ask)
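A sketch of how the proposed flag could sit next to --db-only, assuming an `argparse`-based front end. This is not the actual installer code: the function names and step descriptions are hypothetical, and skipping database creation when a database already exists is one possible way to allow repeated runs, not behavior confirmed in this discussion.

```python
import argparse


def parse_args(argv):
    """Parse the two database-related flags; they cannot be combined."""
    parser = argparse.ArgumentParser(description="Hypothetical installer front end")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--db-only", action="store_true",
                       help="only set up the database")
    group.add_argument("--no-db", action="store_true",
                       help="skip database setup entirely (proposed flag)")
    return parser.parse_args(argv)


def plan_steps(args, database_exists):
    """Return the list of steps a run would perform, without executing them."""
    steps = []
    if not args.no_db:
        if database_exists:
            # Assumption: tolerate an existing database instead of aborting.
            steps.append("skip database creation (already present)")
        else:
            steps.append("create database")
    if not args.db_only:
        steps.append("configure application server")
        steps.append("deploy war file")
    return steps


print(plan_steps(parse_args(["--no-db"]), database_exists=True))
```

Planning the steps separately from running them keeps a second run predictable, which is the property the Ansible role needs.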
15:06
pdurbin
donsizemore: should we get on a Zoom with Leonid?
15:07
donsizemore
that'd be fine with me, just thinking about repeated installs, ("using Ansible to manage upgrades" as has been long requested), and direction/design
15:09
pdurbin
Sure. And he might want to pick your brain about the rewrite. He started from your work. Also, I recently heard about a POP concept I'd like your take on. Let me go find it.
15:10
MrK
pdurbin: Installer is now in python?
15:10
pdurbin
MrK: in development. Do you prefer Perl?
15:10
MrK
pdurbin: No, I'm happy it's in Python :p
15:11
pdurbin
donsizemore: here's where I heard of POP: Making Complex Software Fun And Flexible With Plugin Oriented Programming - Episode 240 - https://www.pythonpodcast.com/plugin-oriented-programming-episode-240/
15:12
pdurbin
My question is if you (and others here) think it's overkill to consider a POP architecture for the installer.
15:20
donsizemore
i mean, this is proper design =) but as you say, the installer is just an installer.
15:21
pdurbin
yeah
15:21
pdurbin
Let me walk down the hall and see if Leonid is around. And if anybody made coffee.
15:26
poikilotherm joined #dataverse
15:30
pdurbin
donsizemore: good news. There's coffee and Leonid is happy to do a video call tomorrow any time after standup. Will you be around? Should we invite poikilotherm and MrK and anyone else who has strong opinions about the installer? :)
15:30
poikilotherm
:-D
15:31
poikilotherm
The strongest opinions are about those pesky resource creations inside glassfish
15:31
pdurbin
MrK poikilotherm: Mozilla just announced they're moving from IRC to Matrix: https://discourse.mozilla.org/t/synchronous-messaging-at-mozilla-the-decision/50620
15:31
poikilotherm
Everything is very likely not to tangle me for K8s ;-)
15:33
poikilotherm
+else
15:34
donsizemore
@pdurbin i have a lunch close to noon but otherwise i'll be here
15:35
pdurbin
donsizemore: ok, should we squeeze it in before your lunch?
15:35
donsizemore
@pdurbin fine by me. it's just a design question, really (and i like to pick a path before picking up the rake and lopping shears)
15:39
pdurbin
donsizemore: cool. I just started a doc with talking points. Can you please add some bullets? https://docs.google.com/document/d/1CSPTkFT8i38a0rZShKdoHL4UchiD_njc44-cqxVFc7k/edit?usp=sharing
15:40
donsizemore
@pdurbin that's my basic question =) and it's not really my question (I pretty much use Ansible as a one-shot installer) but a bunch of community members have asked for idempotence
15:42
pdurbin
Ok. Well, I already put a bug in Leonid's ear about idempotence. And he mentioned there are some flags already that might help. If we think of other talking points, we can add them. If not, we can just use that doc for notes.
15:46
poikilotherm
pdurbin: it was too tempting to add some points ;-)
15:48
poikilotherm
pdurbin: I don't think I can make it - 1130 Boston = 1730 over here...
15:51
pdurbin
poikilotherm: that's why I created the doc. Thanks! Can you put your name next to your questions?
15:51
poikilotherm
Done
15:52
pdurbin
thanks!
15:52
poikilotherm
Shall I do a quick browsing through the installers to nail down some more questions?
15:53
pdurbin
If you mean the new branch, sure, please take a look: https://github.com/IQSS/dataverse/compare/3937-rewrite-installer-in-python-again
15:53
poikilotherm
No I meant the old installer and its pain points
15:53
pdurbin
You're looking for pain in that old Perl script?
15:53
poikilotherm
Nope
15:54
poikilotherm
Things like setup-all.sh etc
15:54
poikilotherm
Those things have to DIE, IMHO
15:54
Benjamin_Peuch joined #dataverse
15:56
pdurbin
Well, at least it's possible to install Dataverse from the command line. Better than forcing people to use a GUI. :)
15:56
poikilotherm
:-D
15:56
pdurbin
But yeah, let's make the installation process simple and awesome.
15:57
Benjamin_Peuch
:thumbup:
15:57
pdurbin
Are we inspired by any software that's simple to install?
15:57
poikilotherm
It would be awesome to have a cli tool called "dataverse-adm"
15:57
poikilotherm
Which does bundle simple things for you
15:57
poikilotherm
Like those curls to block api endpoint, set unblock key etc
15:59
poikilotherm
Or activate FAKE provider
15:59
poikilotherm
Hmm wondering if this might be a good fit into pyDataverse
16:01
pdurbin
sort of like asadmin?
16:01
poikilotherm
Yeah
16:01
pdurbin
And it calls into Dataverse APIs?
16:01
poikilotherm
Don't make people lookup curls, give them small helper tools
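A sketch of what one such helper might look like, using only Python's standard `urllib`. "dataverse-adm" is poikilotherm's proposed name, not an existing tool, and the function name is hypothetical. The settings path and the `:BlockedApiEndpoints` key are assumptions based on the admin settings API described in the Dataverse guides and should be checked against the installed version. The code only builds the request and does not send it.

```python
import urllib.request


def block_endpoints_request(base_url, endpoints):
    """Build (but do not send) a PUT that sets the blocked API endpoints."""
    # Assumed settings path and key; verify against the guides for your version.
    url = f"{base_url}/api/admin/settings/:BlockedApiEndpoints"
    body = ",".join(endpoints).encode("utf-8")
    return urllib.request.Request(url, data=body, method="PUT")


req = block_endpoints_request("http://localhost:8080", ["admin", "builtin-users"])
print(req.get_method(), req.full_url)
# A real tool would pass req to urllib.request.urlopen() and check the response.
```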
16:02
poikilotherm
This is all heavily inspired by Gitlab...
16:02
pdurbin
great idea
16:03
poikilotherm
Lots of installer stuff could be done in such a tool
16:03
poikilotherm
The new installer is another HUGE beast, which is not easy to maintain
16:03
pdurbin
What does GitLab write their adm tool in?
16:05
pdurbin
poikilotherm: did you see my post above about POP? Would that help make it less of a HUGE beast? Or should we use some other language besides Python? How can we make the new installer awesome?
16:06
poikilotherm
Gitlab seem to have a wrapper around rake tasks, so this is Ruby
16:06
pdurbin
bleh, ruby ;)
16:07
pdurbin
So you have to gem install it?
16:07
poikilotherm
I'm not saying this should be done in ruby
16:07
poikilotherm
Nope, in most cases (if not using K8s or similar) you will use packages (rpm/deb) and this is a simple binary shipped in the packages
16:08
poikilotherm
It's just a wrapper, you could use rake from a bundle install (as a gem)
16:10
donsizemore
@pdurbin did you want? can we push it back until afternoon EST?
16:11
pdurbin
donsizemore: sure, afternoon is fine. It's too late for poikilotherm anyway. What time works for you? Not too late on a Friday before a more-than-a-week holiday, please. :)
16:12
pdurbin
poikilotherm: ok, just a yum install or an apt-get install away. Good.
16:14
poikilotherm
pdurbin: I added a note about my idea to the doc
16:15
* pdurbin
looks
16:17
donsizemore
@pdurbin any time after say 1300 should be fine?
16:23
poikilotherm
Read you guys tomorrow
16:43
pdurbin
donsizemore: at standup just now Gustavo reminded us that you're on Slack. I just started a direct message thingy with you, me, Leonid, Kevin, and Gustavo.
19:17
pdurbin
Looks like Leonid just left a comment here: https://github.com/IQSS/dataverse/issues/3937#issuecomment-567605666
19:44
bricas joined #dataverse
19:56
dataverse-user joined #dataverse
19:56
dataverse-user
hi all
19:56
pdurbin
how's it going, dataverse-user? :)
19:59
dataverse-user
fine thanks, I have a question about Harvard Dataverse. If I want to store the research data of my project in the Harvard Dataverse repository, how much is it?
20:00
pdurbin
Harvard Dataverse offers free data hosting if the data is under a certain size.
20:03
pdurbin
Harvard Dataverse is launching a new support/about/reference website soon that will answer questions like this. It's not quite ready but here's the issue where the work is being tracked: https://github.com/IQSS/dataverse.harvard.edu/issues/26
20:04
pdurbin
dataverse-user: does that help?
20:32
dataverse-user
@pdurbin: what happens if the data is a different size, or if a new institution that is not in the consortium requires Dataverse storage?
20:37
pdurbin
I think the quota is 1 TB right now. Your institution does not need to be part of any consortium. Individuals can even upload data.
20:44
pdurbin
donsizemore: this is new to me, might be of interest: https://github.com/jggautier/dataverse-metadata-scripts
21:02
pdurbin
dataverse-user: I assume you're talking about http://dataversecommunity.global . Not sure.