IRC log for #dataverse, 2018-11-19

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.

All times shown according to UTC.

Time S Nick Message
09:21 juancorr joined #dataverse
09:24 jri joined #dataverse
09:52 jri joined #dataverse
10:26 poikilotherm joined #dataverse
12:13 pdurbin joined #dataverse
12:14 pdurbin poikilotherm: thanks for looking into the NoSuchMethodError stuff at https://github.com/IQSS/dataverse/issues/5274
12:24 MrK joined #dataverse
12:25 poikilotherm Morning pdurbin
12:25 MrK Hi, are any of you using dataverse-docker? It seems that it's not working
12:26 pdurbin MrK: I don't think anyone here is but if something isn't working, can you please open a GitHub issue?
12:28 MrK pdurbin: Sure.
12:29 pdurbin MrK: thanks! When you're done, please drop a link in here. I'm curious about it.
12:36 poikilotherm pdurbin: I am unsure how to continue.
12:36 poikilotherm IMHO it doesn't make sense to patch this stuff to have a workaround for 4.1
12:36 poikilotherm This takes a lot of time
12:36 poikilotherm (Already looked into it and it is going into deep waters)
12:37 poikilotherm On the other hand I don't like to break your branch
12:37 pdurbin We don't want anyone to drown, especially not alone.
12:38 poikilotherm So as this won't be merged till sth like 5 is ready, I am stuck in circular deps
12:38 poikilotherm As far as I can see, the only way out with a reasonable investment of time and resources is a bigger step towards 5
12:39 poikilotherm I know you guys don't like that... Any ideas welcome
12:39 pdurbin bummer, I prefer small chunks but I hear what you're saying
12:40 poikilotherm Of course we could change the Jackson JARs manually in Glassfish like we do for WELD and stuff
12:40 poikilotherm But that is really going to be risky
12:40 poikilotherm I don't know what other places might blow up from that
12:41 poikilotherm And as there are not many integration or end-to-end tests around, there is a good chance of introducing big regressions with this
12:41 pdurbin Patching Glassfish like that for Weld was a last resort. I want less required patching of Glassfish, not more.
12:42 poikilotherm And I have to admit that I cannot give a profound opinion on this, as I am no Glassfish/Payara expert... Maybe upstream Payara support can help with this, but we don't have a license
12:42 pdurbin Neither do we.
12:42 poikilotherm I could look at Payara 4.x
12:43 poikilotherm If this has more updated stuff included, this might be a workaround that is not so time consuming
12:43 pdurbin I like that Payara 4.x is patched regularly. Any news from Gustavo and Matthew?
12:43 poikilotherm None yet
12:44 poikilotherm Payara 4.x is only updated when you have a license
12:44 poikilotherm Otherwise you are stuck there, too.
12:44 poikilotherm But it could be a workaround to get #5274 merged
12:44 pdurbin No, you can always download the latest Payara 4.x release for free.
12:46 MrK pdurbin: https://github.com/IQSS/dataverse-docker/issues/10
12:46 pdurbin What I was trying to say is that Payara has regular releases: https://github.com/IQSS/dataverse/issues/4172#issuecomment-340774938
12:49 pdurbin MrK: thanks! Postgres problem. I'm not sure what's going on. You might want to note which commit you are on. In case someone doesn't get to it right away.
12:49 poikilotherm Yes, but please remember https://blog.payara.fish/payara-5-and-payara-4-development-changes - I don't know if you can download the updated stuff or if this is only usable for people with a subscription
12:50 pdurbin poikilotherm: huh, I don't think I've seen this.
12:50 poikilotherm "As previously announced, the Payara 4 Community Stream will stop with Payara Server 4.181 being the terminal release of the Payara 4 Community Stream. For Payara support customers, the Payara 4 Features Stream and the Payara 4 Stability Stream will both continue throughout 2018 and into 2019. "
12:50 poikilotherm This is just a workaround on a road to 5 IMHO!
12:51 poikilotherm (And only if the provided JARs can give us what Glassfish 4.1 cannot give us)
12:54 pdurbin MrK: I told tjanek at https://github.com/IQSS/dataverse/issues/5280#issuecomment-437335538 that I can suggest some non-controversial issues for you two to work on. Do you need any suggestions? Any thoughts on bugs you might want to fix or features you might want to add?
12:54 pdurbin poikilotherm: still reading. Meanwhile, check this out please: https://gitter.im/eclipse/microprofile-config/archives/2018/11/19
12:59 pdurbin (sounds like there's an eclipse microprofile meeting in 6 hours, unless I have the time wrong)
13:01 pdurbin poikilotherm: you're right. I didn't know about all these streams and that patches go into the paid features stream and the paid stability stream but not the free community stream. I'll probably ask people in #glassfish for opinions but not until they wake up (California time).
13:02 pdurbin MrK: looks like Slava is helping already. Good.
13:06 MrK pdurbin: Ah yeah, pretty fast response. About your question, I would have to bring it up at our stand-up, but I think if we have some ready feature which we needed, we can make an issue so you would also benefit.
13:06 pdurbin MrK: that would be perfect. Please do go ahead and make issues for features you want.
13:13 poikilotherm pdurbin: I have neither Hangout nor a time slot available in the evening...
13:14 poikilotherm Ah ok, Ondrej just stated it starts in about 50 minutes
13:22 pdurbin ah, thanks
13:31 poikilotherm Ha! Payara 4.1.2 has Jackson 2.9.4 onboard
13:33 poikilotherm pdurbin: do you think this is an option to include in a Dataverse 4.x release that people need to get Payara 4 in place?
13:33 poikilotherm (And maybe update to 5 shortly after)
13:34 poikilotherm (Or hopefully... ;-) )
14:10 pdurbin poikilotherm: now that you've informed me about their new release policy, I'm a little nervous about Payara. Are security patches held back from the community stream?
14:11 poikilotherm For the 4.x releases I think so
14:11 poikilotherm For 5.x this is no big deal
14:12 pdurbin I think it's a big deal if you have to pay for security patches. That's what PrimeFaces does and it drives me crazy.
14:13 poikilotherm 5.x will have the community stream till 2023
14:13 pdurbin yes but "We will continue development in the Payara 5 Community Stream with quarterly releases as before."
14:13 poikilotherm Sure
14:13 pdurbin Do you see what I'm saying? Maybe not.
14:13 poikilotherm I hear you :-)
14:13 poikilotherm You want sec fixes ASAP
14:13 pdurbin When Solr has a security issue, they release a new version to the community right away.
14:15 poikilotherm From the docs at https://docs.payara.fish/v/181/security/security.html
14:15 poikilotherm Download Security Fixes
14:15 poikilotherm Reported security vulnerabilities by the community or Payara Services Limited’s costumers are patched and released either in specific patch releases (for paid support customers exclusively) or quarterly releases. In some cases, we might release special hotfixes to the community to patch serious vulnerabilities that cannot wait for a quarterly release to be fixed.
14:15 poikilotherm To get the specific fix for a reported vulnerability, please download the specific release that fixed that vulnerability.
14:16 pdurbin wow, terrible
14:16 pdurbin I mean, Payara needs to make money.
14:16 pdurbin so does PrimeFaces
14:16 poikilotherm ;-)
14:16 pdurbin but this is abhorrent to me
14:17 donsizemore joined #dataverse
14:17 poikilotherm I don't have the expertise to know if this has been a major issue in the past. I can't tell whether there has been a major security bug left open for 2-3 months
14:18 pdurbin I don't know either. I'm just talking about the principle.
14:18 pdurbin In mob movies you have to pay for protection.
14:19 poikilotherm Yeah, maybe we should tell Kingpin  ;-)
14:21 pdurbin poikilotherm: can you please connect to https://eclipse.zoom.us/j/949859967 and tell me if the audio is choppy for you too?
14:21 pdurbin Or even better, leave a note in their Zoom chat like I did.
14:21 poikilotherm Gnrf... Need to install a client first
14:22 pdurbin poikilotherm: that reminds me. I talked to Danny about your trouble hearing in the Dataverse community calls and it sounds like the reason we have the system we do is that everyone has a phone.
14:22 poikilotherm :-D
14:22 poikilotherm WebRTC for the win
14:23 pdurbin Well, the audio is so bad for me that I can barely tell what's going on, but at least there are notes at https://docs.google.com/document/d/1X6Q9K28VNHhVRkVBFlbNa-ZMusxjCfmoA7--uJFHWx0
14:26 poikilotherm The audio is pretty good over hear and the video too
14:27 poikilotherm -hear +here
14:28 poikilotherm Ondro's audio is not as good as Emily's
14:28 poikilotherm But this is most certainly his setup
14:29 poikilotherm He seems to have no good headset
14:29 pdurbin Hmm, maybe I should switch from wifi to wired. Thanks.
14:30 poikilotherm But this worked pretty good
14:30 poikilotherm And I like video conferences :-D
14:31 isullivan joined #dataverse
14:32 pdurbin I was telling Danny that my preference would be something like Google Hangouts On Air (if that's still a thing) so that the call could be recorded.
14:37 pdurbin poikilotherm: the audio is fine from wired. Thanks for testing for me.
14:37 pdurbin I don't think I have any microprofile config questions but it's good to know that they have these calls.
14:40 pdurbin poikilotherm: at standup what should I say (if anything) about https://github.com/IQSS/dataverse/issues/5274 ?
14:41 poikilotherm joined #dataverse
14:47 poikilotherm joined #dataverse
14:48 poikilotherm Connection losses over here twice now... Reposting:
14:49 poikilotherm [15:43] <poikilotherm> Sry, had a connection loss over here. Just read in the logs that you wrote sth.
14:49 poikilotherm [15:43] <poikilotherm> Yes, please mention 5274 at standup.
14:49 poikilotherm [15:44] <poikilotherm> Payara 4.x is a workaround, but we should go for 5. I am unsure if changing the app server in a rather short time twice would be a good product strategy...
14:49 poikilotherm [15:45] <poikilotherm> But that's not for me to decide
14:49 poikilotherm [15:45] <poikilotherm> Once MicroProfile Config is part of Jakarta EE and in Glassfish 5.x, we can switch back.
14:49 poikilotherm [15:46] <poikilotherm> Oh and of course it is always an option to move completely away from Payara... ;-)
14:49 poikilotherm [15:46] <poikilotherm> OpenLiberty and Thorntail/Wildfly are available alternatives
14:49 poikilotherm [15:46] <poikilotherm> (And most certainly with other policies about releasing updates)
14:51 pdurbin poikilotherm: thanks, please feel free to add another comment about the Payara 4.x workaround. I dragged it into code review because you sound pretty blocked on getting a decision from us.
14:51 poikilotherm Yes, I am ;-)
14:54 pdurbin poikilotherm: great. Also, do you feel like splitting off the graphviz work into a separate issue and pull request? I was thinking you could make a little diagram for http://guides.dataverse.org/en/4.9.4/developers/intro.html#core-technologies (Glassfish pointing to PostgreSQL and Solr) or something. Just a suggestion. Could be a diagram for anything.
14:57 poikilotherm https://github.com/IQSS/dataverse/issues/5274#issuecomment-439920657
14:59 poikilotherm pdurbin: isn't that already part of http://guides.dataverse.org/en/latest/installation/prep.html#id5
15:00 pdurbin yeah, it is
15:00 pdurbin I was trying to think of a place where a diagram might be nice.
15:00 poikilotherm You really like the nice and shiny new stuff, don't you?
15:01 pdurbin http://guides.dataverse.org/en/latest/installation/config.html#network-ports is a wall of text
15:01 pdurbin poikilotherm: well, I spent time getting graphviz installed on Derek's Windows laptop.
15:01 poikilotherm :-D
15:01 poikilotherm Whooops ,-)
15:01 poikilotherm My bad
15:01 pdurbin And I worked with our Ops team to get graphviz installed on the server we use to build the guides.
15:01 pdurbin And Danny signed off on adding it.
15:02 pdurbin So since there's some momentum it might be nice to get something merged.
15:02 pdurbin But I don't have time to push it through myself.
15:02 poikilotherm Sure
15:02 pdurbin Well
15:02 pdurbin it's not even a time thing
15:02 pdurbin more of a process/policy thing
15:04 pdurbin and thanks for the new comment on the pom.xml issue
15:04 poikilotherm :-)
15:05 poikilotherm pdurbin: did I miss something? Shouldn't http://guides.dataverse.org/en/latest/installation/config.html#id48 contain some more text I added? Wrong release?
15:06 donsizemore joined #dataverse
15:06 pdurbin In which release of Dataverse was your doc change added?
15:07 poikilotherm I thought 4.9.4
15:07 poikilotherm But that might be wrong
15:08 poikilotherm Let me look at those milestones... :-D
15:08 pdurbin I was just looking at the one for 4.9.4: https://github.com/IQSS/dataverse/milestone/75?closed=1
15:09 poikilotherm Ah it is not released yet!
15:09 poikilotherm Maybe you could add it to 4.10?
15:10 pdurbin Is 4.10 going to be the next release? Or will it be 4.9.5?
15:11 poikilotherm Dunno
15:11 poikilotherm There is a 4.10 milestone
15:12 poikilotherm I thought you guys work in sprints?
15:13 pameyer joined #dataverse
15:13 poikilotherm Anyway. Should I rip out the essential stuff from 5274 and put it in a separate PR?
15:13 poikilotherm I could revert to the custom ZIP mangling but introduce the <depMan> plus docs anyway
15:14 pdurbin If our dev process is unclear, I'd suggest opening an issue so that we can add more detail to the dev guide. I'm sure it's unclear. :)
15:15 poikilotherm Oh dear, you must think of me as a moron that barks a lot...
15:15 poikilotherm (if I open more issues like that...)
15:15 pdurbin poikilotherm: the only think I have a strong opinion about is that it would be nice to have the graphviz addition in its own issue and pull request. I would defer to Gustavo and Matthew for the pom.xml changes.
15:15 pdurbin thing*
15:15 poikilotherm Oh maybe you can mention that before or after standup to them? I am really eager to get things moving...
15:16 poikilotherm But I suppose I need their clearance
15:17 pdurbin the "essential stuff" idea? sure, I can mention this at standup
15:18 pdurbin poikilotherm: please check out my "the next version number can change at any time" at https://github.com/IQSS/dataverse/pull/5285/files ... it's hard for me to predict what the next release will be
15:19 poikilotherm pdurbin: I meant the response to my mail :-D
15:20 poikilotherm Sry, multiple things in parallel - I have not been clear...
15:20 * pdurbin scratches out "essential stuff" and writes down "reply to email"
15:20 poikilotherm Thx
15:20 pdurbin sure
15:20 poikilotherm -clear +precise
15:20 pdurbin heh
15:20 pdurbin and I hope you see what I mean about versions
15:21 pdurbin This is the query I use to figure out what will be in the next release: https://github.com/IQSS/dataverse/pulls?utf8=%E2%9C%93&q=is%3Apr+is%3Aclosed+is%3Amerged+no%3Amilestone ... 67 merged pull requests so far
15:24 * poikilotherm reads 5285
15:24 poikilotherm Sounds good
15:25 poikilotherm I don't know if I should create a PR with a release note for the changes of #4690
15:25 poikilotherm To support your experiment :-D
15:26 poikilotherm Have you thought of a content structure? Or is this just an empty file?
15:27 poikilotherm Oh and about the release number: there are a lot of projects out there that increment the version in their pom.xml after a release to distinguish from the current stable release
15:27 poikilotherm And they do so right after a release has been made
15:29 pdurbin poikilotherm: see https://github.com/IQSS/dataverse/pull/5267/files for an example of a release notes file that has already been merged please.
15:30 poikilotherm So you would be happy if I add a PR for this?
15:31 pdurbin poikilotherm: yes, please!
15:31 poikilotherm I could combine it with a separate issue about graphviz :-D
15:31 poikilotherm And a PR for it...
15:31 pdurbin smaller chunks are better but it's a free country
15:31 poikilotherm Ok then I will just create a PR
15:31 poikilotherm No issue for it
15:31 poikilotherm Doesn't make sense...
15:32 pdurbin I never said our process makes sense. :)
15:36 poikilotherm I will just connect it to 4690
15:36 poikilotherm Which is closed, but that's alright
15:36 pdurbin sounds fine, it's an add on
15:41 poikilotherm https://github.com/IQSS/dataverse/pull/5326/files
15:44 pdurbin thanks, approved and dragged to QA
15:45 pdurbin thanks for participating in the new release notes process
15:45 pdurbin better than nothing, I hope :)
15:45 poikilotherm You're welcome :-)
15:48 poikilotherm Alright, that's it for me for now. Maybe I join again later, when the kids are asleep to read up what standup brings to light
15:48 pdurbin thanks again!
16:52 pameyer pdurbin: is 5327 the no-op PID provider issue?
16:59 pdurbin pameyer: yep
17:00 pdurbin pameyer: see also https://github.com/IQSS/dataverse/issues/5024#issuecomment-422445243
17:30 jri joined #dataverse
18:01 pdurbin This is new: https://github.com/gubi/dataverse-php-library
18:24 jri joined #dataverse
19:00 donsizemore joined #dataverse
19:13 poikilotherm joined #dataverse
19:13 poikilotherm Good afternoon guys
19:14 pdurbin welcome back poikilotherm :)
19:25 poikilotherm :-)
19:25 poikilotherm Any news from standup?
19:25 pdurbin buh, let me think
19:26 pdurbin I said you need guidance on if we want to switch to Payara 4.x or not.
19:26 * poikilotherm offers a drink to empower thinking
19:26 pdurbin Matthew says he's going to meet with Gustavo in about an hour to talk about this stuff.
19:27 poikilotherm That doesn't sound too bad
19:28 * poikilotherm crosses fingers they are open for my ideas...
19:28 pameyer reading through the circular dependencies stuff got me thinking about a suggestion a while back of "rewrite everything in django"
19:28 poikilotherm Iiiiiiiih
19:28 poikilotherm That bloody thing again! Take it away!
19:29 pameyer rewriting the world, or django? ;)
19:30 poikilotherm Django is actually not that bad - I just dont like such complex software to be written in Python
19:30 mdunlap joined #dataverse
19:30 * pdurbin summons mdunlap
19:30 * mdunlap is summoned
19:31 pdurbin poikilotherm: mdunlap and I were just talking about Eclipse Glassfish 5.1 (release candidate)
19:31 poikilotherm The daemon is alive...
19:32 poikilotherm Ok, back to serious. About what specific point did you talk?
19:32 mdunlap Generally upgrading off Glassfish 4.1.1
19:33 poikilotherm That sounds like a pretty good idea... ;-)
19:33 mdunlap Totally. We've gone back and forth about Payara and newer Glassfish and sat on the tech debt for a long while
19:35 mdunlap My current opinion is that we should try upgrading to Glassfish 5.1 when it's out of RC and have that be our standard if it works out well. If we don't trust Eclipse we are going to be in trouble on a lot of fronts other than Glassfish
19:35 mdunlap That being said, Payara isn't that different and has its own merits for sure
19:36 poikilotherm Alright. RC sounds fine for dev purposes anyway. The only "but" I can come up with is the MicroProfile Config API support
19:36 mdunlap I think it's likely that after moving to 5.1, getting Dataverse working with Payara shouldn't be that wild
19:36 poikilotherm That does not exist with Eclipse Glassfish 5.1
19:37 mdunlap Do you know if it's on a roadmap for future Glassfish?
19:37 mdunlap Future Eclipse Glassifh
19:37 poikilotherm They said on Gitter it will be included once MicroProfile is a proper Jakarta EE standard
19:38 mdunlap s/Glassifh/Glassfish
19:38 poikilotherm That could take a few years..
19:39 mdunlap We are already pretty unhappy with the un-open nature of PrimeFaces, and while we don't see getting off that happening shortly, adding more projects with that sort of model goes against a main purpose of our project
19:39 poikilotherm Sure.
19:39 poikilotherm There are alternatives like OpenLiberty, Thorntail/Wildfly
19:40 poikilotherm Are there any real specialties from Glassfish used in the code?
19:40 pdurbin I know we have src/main/webapp/WEB-INF/glassfish-web.xml
19:40 mdunlap If anything our installer is pretty heavily based upon the glassfish model
19:40 poikilotherm Using the MicroProfile APIs would mean stick to (future) standards and not implement against a specific distribution/server flavor
19:41 poikilotherm Yes. AFAIK the installer is actually doing a lot of stuff about configuring the domain, right?
19:41 poikilotherm That would be more or less obsolete once the Config API is in place
19:42 poikilotherm glassfish-web.xml seems to be nothing seriously:
19:42 poikilotherm <?xml version="1.0" encoding="UTF-8"?>
19:42 poikilotherm <!DOCTYPE glassfish-web-app PUBLIC "-//GlassFish.org//DTD GlassFish Application Server 3.1 Servlet 3.0//EN" "http://glassfish.org/dtds/glassfish-web-app_3_0-1.dtd">
19:42 poikilotherm <glassfish-web-app error-url="">
19:42 poikilotherm <context-root>/</context-root>
19:42 poikilotherm <class-loader delegate="true"/>
19:42 poikilotherm <jsp-config>
19:42 poikilotherm <property name="keepgenerated" value="true">
19:42 poikilotherm <description>Keep a copy of the generated servlet class' java code.</description>
19:42 poikilotherm </property>
19:42 poikilotherm </jsp-config>
19:42 poikilotherm <property name="alternatedocroot_1" value="from=/guides/* dir=./docroot"/>
19:42 poikilotherm <property name="alternatedocroot_2" value="from=/dataexplore/* dir=./docroot"/>
19:42 poikilotherm <property name="alternatedocroot_logos" value="from=/logos/* dir=./docroot"/>
19:42 poikilotherm <property name="alternatedocroot_sitemap" value="from=/sitemap/* dir=./docroot"/>
19:42 poikilotherm <parameter-encoding default-charset="UTF-8"/>
19:42 poikilotherm </glassfish-web-app>
19:43 poikilotherm wow, that was BS I wrote above...
19:44 poikilotherm glassfish-web.xml seems not to contain any serious stuff that cannot be done in other app servers
19:44 pameyer the installer glassfish bits (at least if I'm remembering right) also aren't anything that can't be done w\ other app servers - but all the syntax is different
19:44 pdurbin poikilotherm: does Eclipse Glassfish have all the Docker images you want to play with? Payara provides what you want, I thought.
19:45 poikilotherm pdurbin: I don't know yet. But Dockerfiles should be easy.
19:46 pameyer dockerfiles that dataverse will install into and work with are harder :(
19:46 poikilotherm pameyer: actually Config API should make it easy to use simple env vars or other options INSTEAD of using a domain config syntax
19:46 poikilotherm pameyer: this is true for the DB setup too - part of Java EE 7 (or 8?) is an annotation to get around a domain config for the database
19:47 pameyer poikilotherm: I've been kicking around the idea of switching some of the JVM options to env vars
19:47 poikilotherm pameyer: this annotation can be combined with Config API stuff
19:47 poikilotherm pameyer: dont reinvent the wheel - go for Config API :-D Hierarchy for sources included ;-)
19:48 poikilotherm (thus sys props, env vars and many more in parallel is possible)
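The "hierarchy for sources" mentioned above can be illustrated with a minimal plain-Java sketch. It does not use the MicroProfile Config API itself; the class `ConfigSourceSketch`, its methods, and the example values are hypothetical and exist only for illustration. The ordinals 400, 300, and 100 mirror the default priorities the MicroProfile Config specification assigns to system properties, environment variables, and the bundled properties file, with the highest ordinal winning.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration class; not part of MicroProfile Config or Dataverse.
public class ConfigSourceSketch {

    /** One named source of settings; a higher ordinal means higher priority. */
    static final class Source {
        final String name;
        final int ordinal;
        final Map<String, String> values;

        Source(String name, int ordinal, Map<String, String> values) {
            this.name = name;
            this.ordinal = ordinal;
            this.values = values;
        }
    }

    private final List<Source> sources = new ArrayList<>();

    /** Registers a source and keeps the list sorted from highest to lowest ordinal. */
    public ConfigSourceSketch add(String name, int ordinal, Map<String, String> values) {
        sources.add(new Source(name, ordinal, values));
        sources.sort(Comparator.comparingInt((Source s) -> s.ordinal).reversed());
        return this;
    }

    /** Returns the value from the highest-priority source that defines the key. */
    public Optional<String> get(String key) {
        for (Source source : sources) {
            String value = source.values.get(key);
            if (value != null) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Example values are made up; fixed maps stand in for the real
        // environment and system properties to keep the sketch deterministic.
        Map<String, String> fileDefaults = new HashMap<>();
        fileDefaults.put("dataverse.fqdn", "localhost");
        Map<String, String> envVars = new HashMap<>();
        envVars.put("dataverse.fqdn", "data.example.org");

        ConfigSourceSketch config = new ConfigSourceSketch()
                .add("properties file", 100, fileDefaults)
                .add("environment variables", 300, envVars)
                .add("system properties", 400, new HashMap<>());

        System.out.println(config.get("dataverse.fqdn").orElse("(unset)"));
    }
}
```

Here the environment value overrides the file default because no system property is set, so the same application image could be configured per container without editing a domain config. A real Config API implementation adds type conversion and injection on top of this lookup order.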
19:48 mdunlap In the end, we have a lot of admins who are used to the Glassfish way of things so I still think the best next step is to try upgrading to the latest Glassfish and then move towards being more open to a variety of app servers
19:48 pameyer yeah - assuming that the app server supports config api.
19:49 mdunlap There is risk tho with it being the first Eclipse Glassfish release
19:50 poikilotherm Yes, that's a risk, too.
19:50 poikilotherm Please keep in mind that we want to run Dataverse in Kubernetes plus use more Docker stuff for testing at FZJ. We (and others using that) would have a very great benefit of MicroProfile support
19:51 poikilotherm (Others are e.g. DANS/DataverseEU/...)
19:51 poikilotherm (And anything heading for OpenShift)
19:52 poikilotherm Running the installer in Docker images to get things up and running is far from optimal
19:52 poikilotherm IMHO it makes the installer even less maintainable - currently there are already a lot of if statements to detect container usage
19:53 pameyer @poikilotherm running the installer in the container is something that can be avoided with smallish redesign
19:53 mdunlap Are there any "truly" free application servers we could look to that support MicroProfile?
19:53 pameyer doesn't sort out the config ugliness though
19:55 poikilotherm I heard many good things about Thorntail/Wildfly and OpenLiberty
19:55 poikilotherm OpenLiberty seems to be interesting with its modular concept
19:56 poikilotherm Willing to evaluate on this
19:56 mdunlap I'm taking a look now myself
19:57 pameyer two silly questions
19:58 pameyer will getting off gf 4.1.1 get dv "unstuck" from particular app servers/app server versions?
19:59 pameyer is there a step in the right direction that won't break everything?
20:00 pdurbin It's technically Oracle Glassfish 4.1 that we're stuck on, by the way. Not 4.1.1. Sorry to split hairs.
20:00 poikilotherm shortly AFK
20:00 pameyer pdurbin: splitting hairs here is good, especially if I'm putting incorrect details in ircbot's memory
20:01 mdunlap I did the same :P
20:01 pdurbin :)
20:02 pameyer even better, possible explanation for why oracle glassfish 4.1 -> payara 4.1.2 wasn't happy
20:02 mdunlap @pameyer I feel that GF 5.1 is a step in the right direction, will get us away from some of the manual fixes and library hairiness we have. But its arguably a small and somewhat pointless step.
20:02 pdurbin pameyer: I think getting off 4.1 is at least somewhat orthogonal to supporting various app servers. And yes, let's not break things. :)
20:03 pameyer upgrading infrastructure components isn't usually something I want to have be exciting
20:04 mdunlap When I looked into upgrading to Glassfish 5.0 it didn't seem like that big of an effort either, so I'm hopeful 5.1 also won't be too painful
20:08 pameyer in an ideal world, the app servers would be converging towards supporting the same standard.  so there's some probability that gf 5.1 -> payara 5.1 (w\ the microprofile config bits that poikilotherm's looking for) would be workable
20:09 pdurbin right, and that standard is whatever comes after Java EE 8
20:09 pameyer ... and that standard isn't finalized?
20:09 pdurbin I don't know if that standard has a name yet, but I assume it will be Jakarta EE 9 or something.
20:11 pdurbin You can see the uncertainty of the name as "When will there be a Jakarta EE [ 9 | 1.0 | 2019.MM.DD ] release" for example from https://jakarta.ee/about/faq/
20:12 pameyer hopefully the backwards compatible parts of that covering ee8 would be closer
20:12 pdurbin not sure what you mean
20:13 pameyer that whatever ee9 is called, it'll be backwards compatible with previous ee8, ee7, etc standards.
20:13 pameyer assuming that they'll have backwards compat, based on how jvm seems to handle it
20:14 pameyer am I remembering correctly that fixing things for gf5 / gf5.1 would be likely to break things for gf 4.1?
20:16 pdurbin Oh, yes Java EE 8 is backward compatible with Java EE 7 and 6. I assume this won't change. I can't think of any Java EE APIs that have been removed.
20:18 mdunlap @pameyer likely yes. We could probably leave things for 4.1 compatibility but I don't know if that's a thing we should do
20:18 mdunlap leave/expand
20:19 pameyer thanks - I'll go back to trying to being quiet and let you and poikilotherm get back to it
20:20 poikilotherm Re
20:22 poikilotherm Well, Glassfish 4.1 will not be compatible with anything I fear...
20:22 beddari joined #dataverse
20:23 poikilotherm That is a really old distribution of things, starting from Jackson 2.3 over Jersey 2.10, no Java EE 8 support etc.
20:23 poikilotherm All those old deps are hurting us right now
20:23 poikilotherm E.g. AWS want Jackson 2.6 at least
20:24 poikilotherm You need that manual patching etc
20:24 poikilotherm The only up to date release of Glassfish 4.1 exists in the drop-in replacement Payara 4.1, which in turn is not updated if you have no subscription
20:25 poikilotherm Staying compatible with Glassfish 4 is IMHO not possible
20:25 poikilotherm And IMHO it is not even necessary
20:25 poikilotherm Call it Dataverse 5 and you are good to go, making an incompatible change
20:25 poikilotherm That's just life, software changes and gets old
20:26 poikilotherm There of course needs to be a migration path etc etc etc
20:26 mdunlap Yea that's my feel too
20:27 pdurbin poikilotherm: are you blocked from going into production until all this is sorted out?
20:27 poikilotherm I understand that you guys don't feel well with going for a subscription based app server like Payara
20:27 poikilotherm I can't tell if this really is such a big risk as we are discussing here
20:28 poikilotherm And of course what could be the worst things that might happen for services used for RDM (as long as there is no sensitive data in it)
20:29 poikilotherm pdurbin: more or less, yes, that's a blocker
20:29 poikilotherm We consider setting up a temporary showcase based on current stuff, but that shouldn't be our production instance
20:29 poikilotherm Torsten and I agreed that we really should have better microservices at hand before going in production
20:30 pdurbin poikilotherm: ok, I'm just thinking that all these changes might take a while
20:31 poikilotherm IMHO this should be about 1-2 months once everyone is on the same boat
20:31 pdurbin Do you think we could get a grant for it?
20:32 poikilotherm I am not very experienced in grants, but that definitely is worth a try.
20:32 poikilotherm It's a major overhaul of the infrastructure and makes Dataverse future ready
20:32 poikilotherm It even will pave the path to more robustness
20:32 poikilotherm Easier testing, automation etc
20:33 poikilotherm You know my vision about integration testing, right? ;-)
20:33 pdurbin yep
20:33 poikilotherm And kcondon seemed to be interested in Selenium E2E tests...
20:34 poikilotherm This will definitely need something else than docker-aio... The current Jackson problem would not have occurred with that, you need all stuff in place for proper E2E testing
20:35 poikilotherm This will of course not be present within a month or two
20:35 pdurbin I suspect that the REST Assured tests may have caught the Jackson bugs, but we run them after the code is merged. :(
20:35 poikilotherm Nope. The exception was biting us when he clicked in the GUI and the S3 file backend kicked in
20:35 poikilotherm S3 is not part of docker-aio...
20:36 pdurbin ah, I missed that it was a GUI only bug, not exercisable via API
20:36 pameyer s3 is also not part of iqss jenkins
20:36 poikilotherm MAYBE the API could trigger such an exception too (unsure about the code), but definitely not without S3 configured and used
20:37 poikilotherm Sure - but firing up a Docker container with Minio in it could solve that easily, couldn't it?
20:37 mdunlap joined #dataverse
20:38 poikilotherm And for me IQSS Jenkins is no authority... :-D (I know, easy to say for me...) But there are options to change or combine things, too
20:39 poikilotherm But that really is the far future
20:39 beddari hi, Jan from Safespring here, we're a compute/ceph S3 storage service provider for various academic institutions in the Nordics. Been browsing around your design work around S3 for a while ...
20:40 poikilotherm First things first - and for now our greatest defect is the ability to have small and proper Docker containers, easily configurable
20:40 pdurbin hey there beddari
20:40 poikilotherm (our = FZJ)
20:40 pdurbin poikilotherm: right, that's why I brought up the Docker images from Payara. Will you get what you need from Eclipse Glassfish?
20:41 * poikilotherm goes fishing at Docker Hub
20:41 pdurbin heh
20:41 beddari :P
20:41 pdurbin beddari: how can we help you? :)
20:44 beddari well, you can't really, I'm trying to determine how mature Dataverse is versus a few use cases we got from customers. The 'closest' dataverse usage to us is probably https://dataverse.no/. I'm CTO at Safespring and well my guess would be to hold off recommending (less funded) people start trying to use our S3 service with this ... yet.
20:44 beddari Really interesting project though =)
20:46 pdurbin beddari: you're in luck that poikilotherm is here because he recently made it so that Dataverse works not only with real Amazon S3 but also non-Amazon S3 clones. That's what your S3 service is, right? A clone? Compatible with Amazon S3?
20:46 beddari yeah, we saw that PR on github, which caught our attention
20:46 pdurbin cool
20:46 * poikilotherm grins from ear to ear
20:47 pdurbin beddari: maybe you could add "should work with Dataverse" to your website :)
20:48 poikilotherm Or even better: I started adding a "known working" section in the docs for Dataverse. Maybe test it with your service and report in the docs?
20:48 poikilotherm Currently I only tested it with Minio
20:48 pameyer with the dcm-s3 work, we considered using minio for the integration tests for the dcm.  that ended up getting dropped; partly because we're not using any object stores
20:48 beddari sure, ok. We're on Ceph radosgw, like many companies/edu networks in Europe
20:48 poikilotherm We have plans to use it with CEPH RADOS GW, too, but this is not ready yet at FZJ.
20:49 beddari for ceph radosgw CI testing you should use https://github.com/ceph/cn
20:49 poikilotherm Sounds good :-D
20:50 poikilotherm Thanks for pointing that out, I only knew of Rook
20:50 beddari I was hoping though ... around the design, that the right people would understand the need to separate the data storage layer from the application. E.g point at different S3 stores etc.
20:51 pdurbin I *think* there's enough separation. I hope so.
20:51 beddari we are working with a portal that enables researchers to have access to a large set of separate S3 services (access key, secret key, buckets)
20:51 beddari I hope so too :)
20:52 pdurbin If there isn't enough separation, please open a GitHub issue. :)
20:52 beddari I'd be interested to learn more about the design somehow wrt how to talk about it to my customer organizations. As they might be able to fund dev work. But I haven't found many design docs =) (I know how this is hehe)
20:52 beddari I saw the discussions in a google doc around S3 large file downloads etc
20:53 poikilotherm pdurbin: there is no public Docker stuff for Eclipse Glassfish 5.1 right now
20:53 beddari and I've started to learn about the data import component
20:53 pdurbin poikilotherm: bah. Sounds like a blocker.
20:53 poikilotherm pdurbin: and I'll say it again. We. Want. Config API.... It makes it a lot less painful to dockerize
20:54 pdurbin poikilotherm: I can hear Gustavo and Matthew talking down the hall. I hope they know this. :)
20:56 poikilotherm Is mdunlap tapped? (you summoned him...)
20:56 pdurbin beddari: so you would say something like "We're pretty sure our product is compatible with data repository software called Dataverse"? It would be good to test it first. We can help you install Dataverse. Funding dev work is always appreciated. :)
20:57 pdurbin poikilotherm: he has his laptop. Maybe he's reading this.
20:58 beddari pdurbin: we're not in a hurry, but yes, great plan. Our largest current customer that is interested in this is SUNET, the Swedish NREN. They haven't designed their goals yet, but I've pointed them at Dataverse ...
20:59 pameyer @beddari there's been some effort to have the storage level configurable, because different installations are using (or planning to use) different storage
20:59 pameyer but prior to poikilotherm's work, s3 has been considered as a single one of those
20:59 beddari pameyer: so my main concern with what I've seen (so far) vs S3 would be if it scales ...
20:59 pdurbin beddari: great! Thanks! Please tell them they're welcome to chat with us here or join our every other week community calls: https://dataverse.org/community-calls
21:01 pameyer @beddari do you have ballpark estimates for what level of scaling you're looking for?
21:01 beddari pdurbin: good idea with the community calls, thanks
21:02 pdurbin beddari: the mailing list is a good place to ask questions too: https://groups.google.com/forum/#!forum/dataverse-community
21:02 beddari pameyer: well, hard to say, I just know they produce _a lot_ of data. We sell on premise Ceph S3 storage starting from 2PB clusters ...
21:03 beddari pameyer: how much data they want to store/publish in a solution like this I really don't know yet.
21:04 beddari pameyer: my worry is more like that I know they are used to "thinking in filesystems" and well ... S3 / object storage in general is quite different after all.
21:05 pdurbin beddari: ah, you might like the new "big data" mailing list: https://groups.google.com/forum/#!forum/dataverse-big-data
21:05 pameyer beddari: yeah, I've run into that impedance mismatch too.
21:05 beddari for us as an S3 operator e.g. interesting questions are ... what the object/file sizes produced by dataverse would generally be
21:06 beddari all the time hehe
21:06 pameyer that, and folks having "big" data with no numbers for storage and number of files
21:07 beddari hehe yeah they know it is big all right, but not much else is known :P
21:08 pameyer :)
21:09 beddari Agenda
21:09 beddari * Review efforts underway by the Harvard IQSS Team
21:09 beddari * rsync
21:09 beddari * Large number of files, large files, direct access to compute
21:09 beddari * POSIX storage, not object
21:09 beddari someone took my notes
21:09 beddari ;)
21:09 beddari but thanks all, I have some more info to digest/reflect/direct
21:10 poikilotherm You're welcome :-)
21:11 beddari ah one question while I remember it (and fade into the background here), what is the design decision (seemingly one that was made?) of not allowing direct recursive download off S3?
21:11 beddari e.g by using rclone or any other S3 capable tool?
21:12 pdurbin beddari: isn't there an S3 command called "sync" or something?
21:13 pdurbin It looks like I wrote about it here: https://github.com/IQSS/dataverse/issues/4949#issuecomment-422479106
21:14 beddari sync is just a wrapper for "the CRUD operations needed to compare this with that and move the required bits to be in sync"
21:15 pdurbin oh
21:15 pdurbin I don't know S3 very well. I've barely used it.
21:15 beddari https://rclone.org/ <- rsync for the cloud hehe
21:16 pdurbin cool, bookmarked. thanks
21:16 beddari pdurbin: S3 really is "simple storage service", it is mostly PUTs and GETs of blobs, and ... that is basically it.
21:17 beddari of course there's more :P but you could assume most all of it, and that would be correct hehe
21:17 pdurbin :)
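beddari's two points above (S3 as mostly PUTs and GETs of blobs, and "sync" as a wrapper around the CRUD operations needed to bring one side in line with the other) can be sketched roughly as follows. This is only an illustration under stated assumptions: `BlobStore`, `SyncPlan`, `plan_sync` and `sync` are hypothetical names, the store is an in-memory dict rather than a real S3 client, and objects are compared by content, whereas real tools typically compare metadata such as size, modification time, or checksums.

```python
from typing import Dict, List, NamedTuple


class BlobStore:
    """Toy in-memory stand-in for a bucket: keys map to opaque bytes."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Create or overwrite the blob stored under `key`.
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        # Return the blob stored under `key` (KeyError if absent).
        return self._objects[key]

    def delete(self, key: str) -> None:
        del self._objects[key]

    def list_keys(self, prefix: str = "") -> List[str]:
        # "Folders" are only key prefixes; there is no real hierarchy.
        return sorted(k for k in self._objects if k.startswith(prefix))


class SyncPlan(NamedTuple):
    to_copy: List[str]    # keys missing or different in the destination
    to_delete: List[str]  # keys present only in the destination


def plan_sync(source: BlobStore, dest: BlobStore) -> SyncPlan:
    """Compare two stores and list the operations needed to align them."""
    src_keys = set(source.list_keys())
    dst_keys = set(dest.list_keys())
    to_copy = sorted(
        k for k in src_keys
        if k not in dst_keys or source.get(k) != dest.get(k)
    )
    to_delete = sorted(dst_keys - src_keys)
    return SyncPlan(to_copy, to_delete)


def sync(source: BlobStore, dest: BlobStore) -> SyncPlan:
    """Apply the plan using nothing but get, put and delete."""
    plan = plan_sync(source, dest)
    for key in plan.to_copy:
        dest.put(key, source.get(key))
    for key in plan.to_delete:
        dest.delete(key)
    return plan
```

Deleting extra keys in the destination is a choice made for this sketch; some tools make that step optional.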
21:19 pameyer beddari: I'd assume that it was because rclone probably can't interoperate with the dataverse permission system
21:19 pameyer but it might also have been because it wasn't something that anybody thought of ;)
21:19 beddari unauthed "sync" of S3 is indeed possible and normal
21:20 beddari anonymous access, or "public S3 buckets" as it is mostly called I guess
21:20 pameyer yup - that part's relatively easy
21:20 pameyer but dataverse supports "restricted" files too, which usually need some level of auth
21:21 beddari I see :) and we got some legacy folder thoughts ;) lurking hehe
21:21 beddari (my guess)
21:21 beddari "With our current implementation we'd also have to separate the unpublished and restricted files from the s3 bucket."
21:22 beddari this is where a lot of "easy ports" to S3 get caught up yes
21:22 beddari because ... working with per path access/auth etc quickly becomes tedious
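The separation beddari quotes (unpublished and restricted files kept apart from what an anonymous "sync" may see) could be sketched like this. It is a hedged illustration only: `FileRecord`, its fields and `split_by_access` are hypothetical names, not Dataverse's actual data model or permission logic.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class FileRecord:
    key: str          # hypothetical object key in the bucket
    published: bool   # assumed flag: the file belongs to a published dataset
    restricted: bool  # assumed flag: the file requires some level of auth


def split_by_access(files: Iterable[FileRecord]) -> Tuple[List[str], List[str]]:
    """Return (public_keys, protected_keys) for a set of file records.

    Only files that are both published and unrestricted are treated as
    safe for anonymous access; everything else is kept protected.
    """
    public: List[str] = []
    protected: List[str] = []
    for record in files:
        if record.published and not record.restricted:
            public.append(record.key)
        else:
            protected.append(record.key)
    return public, protected
```

With a split like this, only the public keys would be candidates for a public bucket or prefix, while the rest would still need to go through an authenticated path.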
21:24 xarthisius joined #dataverse
21:28 pdurbin beddari: there's some talk of Ceph and ACLs at https://github.com/IQSS/dataverse/issues/4071 but I can't really follow that issue very well. This may just be noise. :)
21:29 beddari hmm I think wrapping data in tar/zip is a mistake ... but that is just my gut feeling as an engineer, I'm not in any way educated wrt publishing ...
21:29 beddari thanks I'll try to learn more from that pdurbin  :)
21:30 pameyer beddari: I'm not much of a fan of the tar/zip bit either - but that's mainly because my first reaction is thinking about compute access
21:30 pdurbin beddari: tar/zip is a workaround because Dataverse doesn't support file hierarchy. Except for pameyer's rsync stuff. It's complicated. More at https://github.com/IQSS/dataverse/issues/2249
21:30 pameyer and most of the folks in the conversations are usually not too worried about it
21:31 beddari well it forces more on the consumer end than what is needed :) imo  (e.g you can't directly reference files as they are packaged)
21:32 beddari if they don't worry I'm thinking it is because they don't know the potential hehe, but I'm going to ask some of the SUNET metadata schooled people about that
21:33 pameyer compute pipelines are usually complex enough without introducing more indirection in the io
21:34 beddari absolutely
21:34 poikilotherm pdurbin: did you ever feel lucky and fire up Dataverse in let's say - Wildfly or similar?
21:34 pameyer and starting off with "download and decompress this zip" works fine up to a certain size - then it gets unhappy
21:35 beddari pameyer: indeed, related to my worry earlier
21:36 beddari pameyer: I would be the conveyor of that unhappiness when our provided drives, compute nodes and storage choke :P Must avoid.
21:38 beddari again, really appreciate this 'tour of related issues', saving me tons of time. now afk, sleeping soon :)
21:40 pdurbin poikilotherm: I don't know how to configure a database for Wildfly. So no. Hello world seems to work fine on wildfly, which I tested at https://github.com/pdurbin/javaee-docker :)
21:49 poikilotherm :-D
21:49 poikilotherm I'm thinking of just giving this a try...
21:49 poikilotherm To gain some knowledge of how much pain could be involved
22:20 pdurbin poikilotherm: sounds like an interesting experiment
22:20 poikilotherm pdurbin: I just became aware of the release policies of WildFly and OpenLiberty...
22:20 poikilotherm OpenLiberty does not have one, no word about sec updates
22:20 poikilotherm And Wildfly is based on quarterly releases like Payara
22:21 poikilotherm I could not find anything about Eclipse Glassfish and their plans for releasing
22:22 poikilotherm And those guys seem not to have a special gitter room
22:25 poikilotherm But as the people behind this project are from Red Hat (wildfly), Payara and Oracle I somehow doubt there will be faster releases
22:27 poikilotherm Why would they do so - they work for those who offer paid subscriptions for faster releases
22:30 pameyer well, patches are usually relatively quick to make their way to centos
22:32 poikilotherm Sure. Sry, I don't get it - how does CentOS and its inheritance from RHEL correlate to Java EE App releases?
22:33 pameyer_ joined #dataverse
22:33 poikilotherm pameyer: Sure. Sry, I don't get it - how does CentOS and its inheritance from RHEL correlate to Java EE App releases?
22:34 poikilotherm Oh +Server = App Server
22:34 pameyer_ same company as wildfly, right?
22:34 poikilotherm (sry, it's late over here...)
22:34 pameyer_ ... good reason to call it a night, right?
22:34 poikilotherm Yes, but that's an entirely different story...
22:35 pameyer_ true - not sure how much correlation there'd be between centos/wildfly releases
22:35 poikilotherm CentOS has been a community project and it is based on the need of Red Hat to publish GPL and other OSS stuff right away
22:35 pameyer_ and I tend to worry more about patches than releases
22:35 pameyer_ good point
22:35 poikilotherm http://lists.jboss.org/pipermail/wildfly-dev/2017-December/006250.html
22:36 poikilotherm And Red Hat offers JBoss as does Payara with its Glassfish fork
22:41 poikilotherm Alright pameyer_ and pdurbin, let's call it a day. 23:40 over here, need to get some sleep. I will be offline tomorrow, learning some <sarcasm>totally interesting</sarcasm> CMS stuff.
22:41 poikilotherm Have a good evening :-)
22:42 pameyer_ have a good night @poikilotherm - good look w\ the CMS ;)
22:45 pameyer_ s/look/luck/g
22:46 pdurbin pameyer_: I'm with you on the centos or ubuntu analogy. I pick these distributions because I get free security patches.
22:49 pameyer_ yeah - patches are nice
