Time
S
Nick
Message
01:21
jri joined #dataverse
03:42
jri joined #dataverse
06:59
poikilotherm1
nightowl313 you should also have a backup of your index... A complete reindex of a larger installation takes ages.
07:00
poikilotherm1
And yes, as long as your S3 bucket is not called the same name, you'll need to edit the storage identifiers
07:01
poikilotherm1
But why would you lose your S3 bucket? If that is broken, you are in deeper trouble...
07:02
poikilotherm1
And it should be entirely possible to restore a bucket to the same name it had before if you were to accidentally delete it
07:15
jri joined #dataverse
07:19
jri joined #dataverse
09:47
pkiraly joined #dataverse
09:48
pkiraly
poikilotherm1, Hi Oliver, do you have some minutes?
09:53
poikilotherm1
Hi pkiraly
09:53
poikilotherm1
Just working on my slides for DCM
09:53
poikilotherm1
Hit me
09:57
pkiraly
It is not that urgent. I would like to talk to you a bit on the status of Payara and on the management of custom metadata blocks
09:58
pkiraly
but it is perfectly fine if we do that after DCM
10:03
poikilotherm1
Hit me
10:09
pkiraly
Payara: I had a Dataverse tutorial this morning, and somebody asked about Glassfish, and asked if there is an alternative. I know that you work on Payara for a while. Do you have an estimation when it will be production ready to install Dataverse on it?
10:11
poikilotherm1
The official statement about that: it will be done when Dataverse 5 is released.
10:12
poikilotherm1
It will still pretty much depend on the Payara ecosystem inheriting from Glassfish. Porting it to other app servers like Wildfly, Liberty etc. should be possible, but will require lots of work and extensive testing
10:12
poikilotherm1
But it will be a lot easier than coming from good ol GF 4.1 :-D
10:13
pkiraly
Great!
10:13
poikilotherm1
Although pdurbin will kill me for this: I don't see Dataverse transformed to a Quarkus or Spring application. The task is too big for a scientific open source project
10:16
pkiraly
I do not have enough experience to judge it
10:17
poikilotherm1
Maybe one day there will be a Dataverse-NG building on what the crazy folks at Warsaw did...
10:17
pkiraly
Custom metadata blocks: in the next weeks I will work on an old idea of mine: instead of modifying the Solr schema file, calling the Solr Schema API. It affects your previous work in this area, as the script you wrote will no longer be needed (if my approach works). I have a question, however: do you know if there are others using xinclude in the Solr schema?
10:18
poikilotherm1
;-) Just wanted to paint a more complete picture about where we are in the Java ecosystem...
10:20
pkiraly
The Dataverse source code is quite mixed and reflects different coding paradigms and styles. I cannot estimate how much work it would be (in person-months) to unify the codebase under technology X.
10:21
pkiraly
And it definitely cannot be done with baby steps, only in one giant step...
10:22
poikilotherm1
Well, one of those giant steps was getting off GF 4.1
10:23
poikilotherm1
The code base has some very dusty and even ugly places. But fixing all of 'em is quite hard because that would be too much workload at IQSS IMHO
10:24
poikilotherm1
It's not so much about technology, more about the human, political and change management side of things
10:25
poikilotherm1
Some people's lives are dedicated to the well-being of this project, so we need to be careful ;-)
10:26
poikilotherm1
Hey it's been 2 years for me of constant talk and people like pdurbin spreading the word and acting as amplifiers to get things moving. I'm so glad we all are together on this way in this amazing project.
10:29
pkiraly
I agree
10:35
pkiraly
What about the metadata blocks? Do you have any objection to my plan, or things I should take care of in the implementation?
10:48
poikilotherm1
Could you give me a quick overview of your latest plans on this?
10:50
pkiraly
Sure. In Solr there is a Schema API, which lets the admin add, modify or remove fields programmatically via REST calls.
10:52
pkiraly
To turn it on in the Solr config file, we have to change the schema management mode, and schema.xml should be renamed to managed-schema.
10:53
pkiraly
Once this step is done, we can write the Solr field manipulation function within Dataverse, and we do not need any extra step to fetch the list of fields and add them to Solr.
10:53
pkiraly
We can trigger the whole process with a single Dataverse API call
10:54
pkiraly
That's the essence. I would like to make this process kind of transactional, so that it takes effect only if no error occurred along the way, ruling out partial success.
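A minimal sketch of the idea pkiraly describes, in Python rather than inside Dataverse: validate every field definition first, then send all of them to Solr's Schema API in one request. The field names (`exampleTitle`, `exampleKeyword`), the field types, and both function names are illustrative assumptions and not part of Dataverse. The Solr Reference Guide describes commands sent in a single Schema API call as succeeding or failing together, which is worth verifying for the Solr version in use.

```python
import json
import urllib.request


def build_add_field_commands(fields):
    """Validate all field definitions, then build one Schema API payload.

    Raising before anything is sent keeps the client side all-or-nothing.
    """
    required = {"name", "type"}
    for field in fields:
        missing = required - set(field)
        if missing:
            raise ValueError(f"field {field!r} is missing {sorted(missing)}")
    # Solr accepts a list of definitions under a single "add-field" command.
    return {"add-field": [dict(field) for field in fields]}


def post_schema(solr_url, collection, payload):
    """POST the payload to <solr_url>/solr/<collection>/schema.

    Defined for completeness; it is not called in this sketch.
    """
    request = urllib.request.Request(
        f"{solr_url}/solr/{collection}/schema",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical fields, as if derived from a custom metadata block.
payload = build_add_field_commands([
    {"name": "exampleTitle", "type": "text_en", "stored": True, "indexed": True},
    {"name": "exampleKeyword", "type": "string", "stored": True, "multiValued": True},
])
print(json.dumps(payload, indent=2))
```

Sending the whole batch in one request, instead of one request per field, is what leaves no window for a half-updated schema on the client side.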
11:08
poikilotherm1
Would it make sense to use a migration tool like Flyway for this?
11:08
poikilotherm1
There are other ideas floating around, too. Jim and I were talking about adding metadata to a NoSQL datastore because of the issues with Solr schemas
11:08
poikilotherm1
(Thus thinking in the direction of becoming schemaless)
11:09
poikilotherm1
How would you like to handle metadata schema changes?
11:10
poikilotherm1
I see a lot of open questions on this, what Dataverse could do for us and where to draw the line
11:11
poikilotherm1
Oh that NoSQL stuff was an idea for using MongoDB to store additional metadata, which does not fit in available schemas, maybe because there is no standard for a community etc
11:42
donsizemore joined #dataverse
12:23
pkiraly
Making it schemaless in Solr is also a possibility; I use that elsewhere. In the Solr schema you can use * as a wildcard and treat *_ss as a string type field, so author_ss, title_ss etc. will be indexed.
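The wildcard rule above can be illustrated with a small Python sketch that imitates how a dynamic field pattern such as `*_ss` picks a type from the field name. This is not Solr's own resolution logic; the pattern table and the `resolve_type` helper are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# Illustrative pattern table; only "*_ss" comes from the discussion above.
DYNAMIC_FIELDS = [
    ("*_ss", "string"),
]


def resolve_type(field_name, dynamic_fields=DYNAMIC_FIELDS):
    """Return the type of the first pattern matching field_name, else None."""
    for pattern, field_type in dynamic_fields:
        if fnmatchcase(field_name, pattern):
            return field_type
    return None


for name in ("author_ss", "title_ss", "author"):
    print(name, "->", resolve_type(name))
```

With this table, `author_ss` and `title_ss` resolve to `string`, while a bare `author` matches no pattern and would need an explicit field definition.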
12:25
pkiraly
MongoDB could not provide the search functionality, I think
12:26
pkiraly
Schema change: it depends. As I mentioned, the Solr Schema API is able to manage some changes. Other changes require reindexing.
12:27
pkiraly
But the same is true for the current situation
13:27
pdurbin joined #dataverse
14:18
JonathanNeal joined #dataverse
14:19
pdurbin
poikilotherm1 pkiraly: you should go look at the DVN 3 code, which was even worse. :)
14:19
pdurbin
From my perspective, things are slowly getting better. :)
14:26
pkiraly
pdurbin, Hi, I did not say that it is good or bad, just that it is in a mixed style, so it would take some effort to turn it into a unified style.
14:27
pkiraly
pdurbin, Today I just heard that a new Dataverse instance is on the horizon in Hungary.
14:35
dataverse-user joined #dataverse
14:35
pdurbin
Yes, definitely mixed.
14:36
pdurbin
And yes, some folks from that installation have been in this chat room, I believe.
14:40
pdurbin
pkiraly poikilotherm1: What do you think about the new tiny Java microservice in https://github.com/IQSS/dataverse/pull/6986 ? Is this a chance to use Quarkus or whatever?
14:45
poikilotherm1
I'm crying :'-(
14:45
pdurbin
Why?
14:49
poikilotherm1
I might be a fan of over-engineering, but hacks in Dataverse seem to me to have a long history of becoming a de facto standard.
14:50
pdurbin
Oh, sure, plenty of hacks.
14:50
pdurbin
But which hack do you mean? :)
14:51
poikilotherm1
The very PR you mentioned
14:52
poikilotherm1
My heart is bleeding :-(
14:52
donsizemore
@poikilotherm1 well clean it up, you'll ruin the carpet.
14:54
pdurbin
Or buy darker carpet.
15:20
dataverse-user joined #dataverse
15:43
poikilotherm1
Guys if you are interested, there is a first draft of my DCM slides. I appreciate any feedback. It's a 10 minute lightning talk. http://talks.bertuch.name/dcm2020/#/
15:46
* pdurbin
clicks
15:46
pdurbin
poikilotherm1: by the way, your pull request is in QA now. For email groups.
15:49
poikilotherm1
Oh I didn't notice :-D
15:50
pdurbin
poikilotherm1: the diagram is cool but "dataverse from cmdline" is hard to read. Orange on dark grey, almost black.
15:54
pdurbin
Yes to this: Provide a Java-based Operator and reuse existing
15:54
pdurbin
poikilotherm1: oh... actually. Is that a typo? Existing what?
15:55
poikilotherm1
Existing operators for Solr and Postgres
15:55
pdurbin
ok, it's cut off for me
15:55
poikilotherm1
No need to write those ourselves ;-)
15:56
pdurbin
oh, you mean for it to be cut off
15:56
pdurbin
might help to add a period at least
15:57
pdurbin
also, typo on the last slide: "dailyp"
15:57
pdurbin
otherwise, looks great!
15:57
pdurbin
Thank you for dragging us into the present and future!
15:58
poikilotherm1
Anything missing?
15:58
poikilotherm1
Too much detail?
15:59
poikilotherm1
Too long for 10 minutes?
15:59
pdurbin
I guess what I want to know is how we can enable what you're doing. You wanted us to get off Glassfish 4. Done (in develop). Will you be talking about other stuff like this?
16:04
poikilotherm1
Nope. Slava requested a talk about the project and what it can do now. Do you feel like I should put a bit more emphasis on the "Make Dataverse cloud-native!" part?
16:05
pdurbin
Well, it's fine to focus on what Slava wants. But you understand what I want. Maybe that could be a future talk.
16:06
pdurbin
Also, I'd be happy to trade practice talks if you'd like. I'm giving a 10 minute talk to introduce external tools.
16:07
poikilotherm1
Maybe :-D
16:07
poikilotherm1
BTW pdurbin. I had an idea just this morning about deployment times... Wanna hear?
16:08
pdurbin
sure!
16:10
poikilotherm1
Remember https://github.com/IQSS/dataverse/issues/5871 ?
16:10
poikilotherm1
:-)
16:10
poikilotherm1
My idea is as follows: every time Dataverse is deployed, the complete code is scanned for entities and these are applied to the database. I wonder if it might give a speedup if we don't do that. And it would reduce warnings....
16:11
poikilotherm1
It should be much faster to look up the Flyway migration status in the database instead of scanning and applying with failures
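A rough sketch of what such a status lookup could look like, using an in-memory SQLite database as a stand-in for the real PostgreSQL database. Flyway records applied migrations in a `flyway_schema_history` table; the reduced column set, the sample rows, and the `pending_migrations` helper are assumptions made for illustration.

```python
import sqlite3


def pending_migrations(conn, available_versions):
    """Return the versions not yet recorded as successfully applied."""
    rows = conn.execute(
        "SELECT version FROM flyway_schema_history WHERE success = 1"
    ).fetchall()
    applied = {row[0] for row in rows}
    return [v for v in available_versions if v not in applied]


# Stand-in database with a reduced version of Flyway's history table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flyway_schema_history ("
    "installed_rank INTEGER PRIMARY KEY, version TEXT, script TEXT, success INTEGER)"
)
conn.executemany(
    "INSERT INTO flyway_schema_history VALUES (?, ?, ?, ?)",
    [
        (1, "1.0", "V1.0__init.sql", 1),
        (2, "1.1", "V1.1__example.sql", 1),
    ],
)

pending = pending_migrations(conn, ["1.0", "1.1", "1.2"])
print(pending)
```

The point of the sketch is only that the check is a single indexed query; whether skipping the entity scan actually shortens deployment is, as stated in the chat, an unverified theory.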
16:13
pdurbin
So... I'm not opposed to doing whatever we can to speed up deployment. But isn't deployment slow because our war file is over 200 MB ?
16:13
poikilotherm1
This is just a theory. Dunno if this is a real issue.
16:14
poikilotherm1
And I have no idea how to measure before doing a fix to verify
16:14
poikilotherm1
I don't believe deployment times come down to a single issue. Technical debt is high in the codebase...
16:16
pdurbin
Sure. Multiple issues. Agreed.
16:17
poikilotherm1
OK guys, gotta go now... Construction site calling.
16:18
poikilotherm1
Read you later. Send me a ping if you need my attention :-D
16:18
poikilotherm1
(Looking at the co-chair @donsizemore here...)
16:18
pdurbin
o/
16:31
jri joined #dataverse
17:09
donsizemore
@poikilotherm1 looks good to me
19:04
jri joined #dataverse
20:12
poikilotherm1
donsizemore U still around?
21:25
jri joined #dataverse
21:47
jri joined #dataverse
22:06
JonathanNeal joined #dataverse
22:39
pmauduit joined #dataverse
22:39
yoh joined #dataverse
22:39
JonathanNeal joined #dataverse
22:39
pdurbin joined #dataverse
22:39
icarito[m] joined #dataverse
22:39
sivoais joined #dataverse
22:39
larsks joined #dataverse
22:39
bjonnh joined #dataverse
22:42
poikilotherm1 joined #dataverse
22:46
sivoais joined #dataverse
23:47
jri joined #dataverse