Time
S
Nick
Message
01:38
jri joined #dataverse
04:38
jri joined #dataverse
10:51
pdurbin joined #dataverse
13:42
pameyer joined #dataverse
13:52
donsizemore joined #dataverse
13:59
donsizemore
@pdurbin knock knock
13:59
pdurbin
Who's there?
14:00
pdurbin
icarito[m] knows I was telling jokes in #publiclab on oftc over the weekend :)
14:01
* donsizemore
runs to go look up knock-knock jokes
14:03
donsizemore
so, if i wanted to get hold of someone with access to your production web node filesystem, am i allowed to do that?
14:03
pameyer
now I'm wondering about port-knocking for remote logins, and procedural generation of knock-knock jokes
14:04
donsizemore
(is that Kevin now that you're on AWS?)
14:04
pdurbin
donsizemore: this is a good question for support@dataverse.harvard.edu
14:04
pdurbin
maybe pameyer will give you access to his server ;)
14:05
donsizemore
oh, i have a ticket open
14:05
pdurbin
oh! what's the number please?
14:05
pameyer
I don't have anything running on AWS at the moment
14:06
donsizemore
but we've opened, and are holding open, ~1600 export_* file descriptors over the weekend. i'd like to not have to restart dataverse every so often because systemd doesn't seem to want to set our max ulimit above 16384
14:06
donsizemore
@pdurbin #261403
14:07
donsizemore
@pameyer how many export_ files do you see in /proc/<gfish pid>/fd ?
14:07
pdurbin
donsizemore: thanks, this ticket came in over the weekend. I can mention it at standup in an hour.
14:10
pameyer
donsizemore: I'm seeing ~600-700; but that may be traffic dependent
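The count donsizemore asks about can also be taken programmatically. What follows is a minimal sketch, assuming a Linux /proc filesystem and a user allowed to read the Glassfish process's fd directory. The class name ExportFdCount and the method countOpen are hypothetical and are not part of Dataverse. It counts the descriptors whose link target has a file name starting with a given prefix such as export_.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ExportFdCount {
    /**
     * Counts entries in fdDir (for example /proc/<pid>/fd) that are symbolic
     * links to a target whose file name starts with prefix.
     */
    static long countOpen(Path fdDir, String prefix) throws IOException {
        long count = 0;
        // The directory stream is itself a descriptor, so close it when done.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(fdDir)) {
            for (Path entry : entries) {
                if (!Files.isSymbolicLink(entry)) {
                    continue;
                }
                Path target;
                try {
                    target = Files.readSymbolicLink(entry);
                } catch (IOException e) {
                    // The descriptor may have been closed after it was listed.
                    continue;
                }
                Path name = target.getFileName();
                if (name != null && name.toString().startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Pass the Glassfish pid as the first argument; "self" inspects this JVM.
        String pid = args.length > 0 ? args[0] : "self";
        System.out.println(countOpen(Paths.get("/proc", pid, "fd"), "export_"));
    }
}
```

Running it periodically and logging the result would show whether the number keeps growing between restarts, which is what a leak would look like, or rises and falls with traffic.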
14:10
donsizemore
@pdurbin the files appear to be well-formed, as best i can tell. akio is comparing changes to the harvesting and ingest code between 4.7.1 (april 17th) and 4.8.6 (april 27th)
14:10
donsizemore
@pameyer i'm not above a cronjob to restart the service every so often, i ain't too proud to beg
14:11
donsizemore
@pameyer 600-700 total? what version of dataverse are you running?
14:11
pameyer
I'm suspicious about resource leaks - do these look like OAI exports
14:11
pdurbin
A cron job to restart Glassfish? We have a script that bounces Glassfish as necessary.
14:12
pameyer
donsizemore: our currently deployed fork is based off 4.8.1, but it's got a mix of 4.8.3 and fork stuff in it too
14:12
pameyer
if I'm remembering right, you're on 4.8.6 - so what I'm seeing might not be too informative
14:13
donsizemore
they're export_oai_schema and export_oai_dc mostly. we've got some held open from Friday when I bounced Glassfish
14:14
donsizemore
@pameyer we open around 600 FDs just launching Glassfish, so you sound normal
14:19
donsizemore
@pdurbin note that our last harvesting client run (against Harvard) still says Last Run: Sun Apr 29 03:00:00 EDT 2018 INPROGRESS so that may be related. but harvesting always got stuck before without us falling over
14:19
pameyer
donsizemore: yeah, both of those sound like harvesting related
14:23
donsizemore
@pameyer however, on our production-test server (same version, production DB dump) the harvesting settings are the same and there are no export files lying around in /proc
14:25
pameyer
@donsizemore: clone of the production VM (OS version, packages, etc), right?
14:26
donsizemore
@pameyer eh, parallel stand-up
14:27
pameyer
np
14:27
donsizemore
@pameyer but not a VM clone, which would be a great test
14:29
pameyer
@donsizemore: I'm wondering about possible minor version differences - but that might be because I'm scratching my head, and don't have any great ideas
14:30
donsizemore
@pameyer they're both RHEL7, PG93, they did recently pick up the 1.8.0_171 JDK tho
14:32
pameyer
anything interesting in the export_* or oaiSetsUpdate_* logs from glassfish?
14:35
donsizemore
i'm just correlating timestamps. a bunch of them popped up at 0942 today
14:39
donsizemore
@pdurbin akio found a suspicious bit of code not limited to harvesting. any time a dataset is updated AND the storage is local rather than remote, there's an open without a close
14:43
pameyer
@donsizemore I think there may be a bunch of those
14:43
pameyer
:(
14:44
pameyer
ExportService?
14:44
donsizemore
https://github.com/IQSS/dataverse/blob/6dd1fcb5a5e3755a40eedd3d6755e9ce2917f57d/src/main/java/edu/harvard/iq/dataverse/export/ExportService.java
14:46
pdurbin
donsizemore: so some sort of leak? leaking resources? file descriptors?
14:51
donsizemore
@pdurbin if i followed him correctly, it tries to open one outputstream for swift, then if (!tempFileRequired) it opens a different outputstream and closes that one, but then doesn't hit the else { stanza to close the first one
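The shape of the leak described here can be illustrated in isolation. This is a minimal sketch, not the actual ExportService code: the class ExportCloseSketch, the TrackedStream helper, and the methods leaky and fixed are all hypothetical, and passing both streams in up front is a simplification. It contrasts a branch that closes only one of two streams with a try-with-resources version that closes both on every path.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class ExportCloseSketch {
    /** Hypothetical stand-in for a storage stream; records whether close() ran. */
    static class TrackedStream extends ByteArrayOutputStream {
        boolean closed = false;

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    private static final byte[] DATA = "<record/>".getBytes(StandardCharsets.UTF_8);

    /** Leaky shape: the primary stream is closed only in the else branch. */
    static void leaky(TrackedStream primary, TrackedStream fallback,
                      boolean tempFileRequired) throws IOException {
        if (!tempFileRequired) {
            fallback.write(DATA);
            fallback.close();
            // primary is never closed on this path
        } else {
            primary.write(DATA);
            primary.close();
            fallback.close();
        }
    }

    /** Fixed shape: try-with-resources closes both streams on every path. */
    static void fixed(TrackedStream primary, TrackedStream fallback,
                      boolean tempFileRequired) throws IOException {
        try (OutputStream first = primary; OutputStream second = fallback) {
            OutputStream out = tempFileRequired ? first : second;
            out.write(DATA);
        }
    }

    public static void main(String[] args) throws IOException {
        TrackedStream p1 = new TrackedStream();
        TrackedStream f1 = new TrackedStream();
        leaky(p1, f1, false);
        System.out.println("leaky: primary closed = " + p1.closed);

        TrackedStream p2 = new TrackedStream();
        TrackedStream f2 = new TrackedStream();
        fixed(p2, f2, false);
        System.out.println("fixed: primary closed = " + p2.closed);
    }
}
```

With real file or network streams, each unclosed stream on the leaky path would keep a descriptor open until the object is garbage collected or the process restarts, which is consistent with the export_* entries accumulating in /proc.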
14:51
donsizemore
@pdurbin smarter e-mail from akio forthcoming
14:57
pdurbin
donsizemore: why did he send it to my personal email address?
14:59
donsizemore
jon asked him to include IQSS in the loop on his troubleshooting. i'll append it to my ticket?
15:00
pdurbin
Yes, please put the information in the ticket. Thanks!
15:59
andrewSC joined #dataverse
16:01
andrewSC joined #dataverse
16:35
icarito[m]
I completely misunderstood what dataverse was
16:35
icarito[m]
I had my mind on datproject (the p2p protocol)
16:37
icarito[m]
it's about archiving data sets
16:37
pameyer
content-addressable storage, right?
16:39
pdurbin
icarito[m]: yep, Dataverse is about archiving datasets. And making them citable (with a DOI), explorable, etc.
16:39
icarito[m]
I guess it's about open data and long term archival and reference
16:39
pdurbin
yep, but the data can be restricted if necessary. but yeah CC0 by default
16:40
icarito[m]
love to find stuff like "Centro de Investigacion de la Salud Indigena Dataverse"
16:41
pdurbin
ah, you must mean https://dataverse.harvard.edu/dataverse/CISI
16:42
icarito[m]
yeah! love this they do surveys and workshops in native language
16:43
icarito[m]
I was part of the team that helped put Quechua, Aymara and Awajun into glibc
16:43
icarito[m]
and localize Sugar
16:43
icarito[m]
so much knowledge in those dying tongues
16:45
pdurbin
nice
17:43
pdurbin
pameyer: you're still interested in the JMeter stuff, right? I'm watching a video from the students that discusses it.
17:45
pameyer
pdurbin: still interested, but not in a great spot for video at the moment
17:46
pdurbin
I'm taking screenshots.
17:52
pdurbin
pameyer: I added them to https://github.com/IQSS/dataverse/issues/4201
17:59
pameyer
pdurbin: thanks, those look interesting
18:00
pameyer
looks ballpark in the range of what I was seeing with ab (extrapolated to 2 servers)
18:00
pameyer
^ throughput, I mean
18:10
pdurbin
Yeah? The only tests I've done were with Locust: http://guides.dataverse.org/en/4.8.6/developers/testing.html#locust . And have no idea what the throughput was.
18:14
pameyer
I wasn't being too systematic about it :(
18:27
pdurbin
pameyer: there's stuff about file descriptors in the init script at http://guides.dataverse.org/en/4.8.6/installation/prerequisites.html#launching-glassfish-on-system-boot by the way. I was just looking at the ticket Don mentioned earlier.
21:14
dataverse-user joined #dataverse
21:14
dataverse-user
I have unauthorised charges to my bank account, how can we clear this up?
21:42
pameyer left #dataverse