06:15
nightowl313 joined #dataverse
08:52
jri joined #dataverse
09:07
Benjamin_Peuch joined #dataverse
09:29
Benjamin_Peuch joined #dataverse
09:31
Benjamin_Peuch67 joined #dataverse
09:31
Benjamin_Peuch joined #dataverse
09:36
Youssef_Ouahalou joined #dataverse
10:04
Youssef_Ouahalou joined #dataverse
10:14
jri_ joined #dataverse
10:35
Benjamin_Peuch joined #dataverse
13:50
nightowl313 joined #dataverse
13:52
nightowl313
hi all ... we are finally beginning our dataverse pilot; i recently moved the location of files to s3; I physically moved the files to the s3 bucket, and I changed the related jvm options as indicated in the instructions, but do I need to also make a change to the file location in the database?
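(For context: the JVM options referred to here are the S3 storage settings. A hedged illustration of the single-store style follows; option names vary by Dataverse version, and the bucket name is a placeholder, so check the Installation Guide for your release.)

```shell
# Illustrative only -- S3 JVM option names differ between Dataverse versions
./asadmin create-jvm-options "-Ddataverse.files.storage-driver-id=s3"
./asadmin create-jvm-options "-Ddataverse.files.s3-bucket-name=your-bucket-name"
```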
13:52
nightowl313
I'm getting 500 errors when trying to use any of the "Cite Data File" options
13:55
nightowl313
the other file functions work fine, I can view and explore the file, but wondering if it's still seeing the local file (did not remove it)
13:56
nightowl313
or, is there a way to remove/reinstall the sample data? (that's all that we have in there so far)
14:22
donsizemore joined #dataverse
14:47
donsizemore
@nightowl313 I confirmed with Leonid, you'll want to update the storageidentifiers: 'if the current storageidentifier is "xyz" (or "file://xyz"), it should become "s3://theirbucketname:xyz"'
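(A read-only preview along these lines can show what the rewritten identifiers would look like before any UPDATE is run; this is an untested sketch, and 'your-bucket-name' is a placeholder.)

```sql
-- Preview the proposed rewrite without changing anything
SELECT id, storageidentifier,
       's3://your-bucket-name:' || storageidentifier AS proposed
FROM dvobject
WHERE dtype = 'DataFile'
  AND storageidentifier NOT LIKE 's3://%';
```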
15:14
nightowl313 joined #dataverse
15:18
donsizemore
@nightowl313 I checked our test DB with this query: select storageidentifier from dvobject where dtype='Dataset';
15:19
donsizemore
@nightowl313 as best I can tell the dvobject storageidentifier for dtype DataFile is just the filename on disk
15:19
nightowl313
@donsizemore do I update those identifiers directly in the database (ie: run a query)?
15:20
nightowl313
sorry posted at the same time as you
15:20
donsizemore
Jim mentioned a query to do this, but I didn't find one on guides.dataverse
15:20
nightowl313
not sure the best way to update
15:20
donsizemore
to see what you've got, do
15:22
donsizemore
select storageidentifier from dvobject where dtype='Dataset';
15:22
donsizemore
or better yet select distinct storageidentifier from dvobject where dtype='Dataset';
15:23
nightowl31395 joined #dataverse
15:23
donsizemore
though storageidentifier includes the dataset id... lemme find that SQL query
15:25
nightowl313 joined #dataverse
15:27
nightowl313
trying to connect to the db ... moved it to rds ... argh
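(For reference: connecting to a PostgreSQL database on RDS is typically a matter of pointing psql at the instance endpoint. Every value below is a placeholder, and the RDS security group must allow inbound connections on port 5432 from the client host.)

```shell
# Placeholders throughout -- substitute your RDS endpoint, user, and database name
psql -h your-instance.xxxxxxxx.us-east-1.rds.amazonaws.com -p 5432 -U dvnapp -d dvndb
```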
15:30
pkiraly joined #dataverse
15:30
nightowl313
it shows 0 rows
15:41
donsizemore
@nightowl313 Leonid is sending me the queries he used to migrate Harvard from local storage to S3
15:42
donsizemore
@nightowl313 though I might like to break for lunch? I can e-mail you in a bit?
15:43
nightowl313
sure no problem... thanks!
15:43
donsizemore
I hope to move Odum some day so I want these queries as well =)
15:44
nightowl313
i appreciate the help!
15:46
donsizemore
@nightowl313 Leonid asks if you preserved the pseudo-folder structure when you copied the files into S3
15:48
nightowl313
yes i just copied the entire local folder under /usr/local/dvn/data to the bucket
15:49
nightowl313
drag/drop and it created the "structure"
15:49
nightowl313
i think
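(A drag-and-drop copy that mirrors the directory tree is effectively what `aws s3 sync` does: local paths become S3 key prefixes, which is what the pseudo-folder structure amounts to. The bucket name below is a placeholder.)

```shell
# Review what would be copied first, then mirror the local files directory
# into the bucket; local subdirectories become S3 key prefixes ("pseudo-folders")
aws s3 sync /usr/local/dvn/data s3://your-bucket-name/ --dryrun
aws s3 sync /usr/local/dvn/data s3://your-bucket-name/
```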
16:18
nightowl313 joined #dataverse
16:53
pkiraly joined #dataverse
17:20
donsizemore joined #dataverse
17:46
yoh joined #dataverse
17:50
donsizemore
@nightowl313 from Leonid: UPDATE dvobject SET storageidentifier='s3://your-bucket-name:' || storageidentifier WHERE id in (SELECT o.id FROM dvobject o, dataset s WHERE o.dtype = 'DataFile' AND s.id = o.owner_id AND s.harvestingclient_id IS null AND o.storageidentifier NOT LIKE 's3://%')
17:52
donsizemore
@nightowl313 he adds "it would be prudent to add AND o.storageidentifier NOT LIKE 'file://%'"
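(Combining Leonid's UPDATE with the suggested extra guard gives the following; the bucket name remains a placeholder, and running the read-only subquery alone first is a reasonable sanity check.)

```sql
UPDATE dvobject
SET storageidentifier = 's3://your-bucket-name:' || storageidentifier
WHERE id IN (
  SELECT o.id
  FROM dvobject o, dataset s
  WHERE o.dtype = 'DataFile'
    AND s.id = o.owner_id
    AND s.harvestingclient_id IS NULL
    AND o.storageidentifier NOT LIKE 's3://%'
    AND o.storageidentifier NOT LIKE 'file://%'  -- the prudent extra condition
);
```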
17:54
thibbo joined #dataverse
18:16
pkiraly joined #dataverse
19:32
donsizemore83 joined #dataverse