
IRC log for #dataverse, 2021-05-20

Connect via chat.dataverse.org to discuss Dataverse (dataverse.org, an open source web application for sharing, citing, analyzing, and preserving research data) with users and developers.


All times shown according to UTC.

Time S Nick Message
06:08 Virgile joined #dataverse
07:17 Virgile joined #dataverse
08:08 nightowl313 joined #dataverse
08:09 nightowl313 if anyone is around .. i am going through the upgrade of dataverse and am stuck on the update of the solr schema ... it will not update because solr is on a different server from the dataverse web application server (i assume)
08:10 nightowl313 just getting errors that the connection is refused ... tried changing the dataverse url to the url of the webserver
08:13 dataverse-user joined #dataverse
08:14 nightowl313 upgrade to 5.4 that is
08:15 nightowl313 the final step to upgrade the schema .. since solr is on a separate server the script is looking locally for solr
08:15 nightowl313 or well for dataverse that is
08:40 nightowl313 joined #dataverse
11:31 donsizemore joined #dataverse
13:43 dataverse-user joined #dataverse
13:45 dataverse-user i have a question. I have made a customized metadata and loaded it in Dataverse. Now I would like to delete them. Suggestions are welcome
14:03 pdurbin joined #dataverse
14:04 VJ joined #dataverse
14:08 VJ joined #dataverse
14:10 dataverse-user or alternatively Can I remove some schemas from http://localhost:8080/api/admin/index/solr/schema
14:14 pdurbin dataverse-user: it's complicated. Please see https://groups.google.com/g/dataverse-community/c/1M4bULWlKDk/m/ThqCgir_AAAJ
14:44 dataverse-user its better not to mess with datafields, i guess
14:45 dataverse-user as I am more into c++ and a beginner to dataverse, is there any possibility of using c++ with dataverse
15:11 pdurbin dataverse-user: hmm. Do you find yourself calling into REST APIs from C++? If so, you could build a C++ library to call into Dataverse APIs. :)
17:40 dataverse-user thanks Pdurbin..
17:41 pdurbin sure :)
17:41 pdurbin Did you have something else in mind? :)
17:41 dataverse-user can you also please help me in solr
17:41 pdurbin I can try.
17:42 dataverse-user the instanceDir and DataDir for solr(default)
17:42 dataverse-user path for dataverse
17:43 dataverse-user I messed up my core  at localhost:8983 so i am trying to add them ( in my dev env)
17:45 pdurbin Oh. If I ever mess up Solr in dev I just move it aside and reinstall it. That's what I'd suggest.
17:46 dataverse-user I can see the dataverse and datasets but not together :(
17:51 dataverse-user yeah dont it
17:51 dataverse-user *i did it
17:51 pdurbin You reinstalled Solr?
17:52 dataverse-user some path correction... and boom it worked as before
17:52 dataverse-user AND this page https://guides.dataverse.org/en/4.20/admin/solr-search-index.html
18:05 pdurbin Sure. You had to reindex.
18:05 pdurbin Glad it's working now.
18:06 pdurbin What's the dev environment for anyway? You're hacking on something? :)
18:06 nightowl313 joined #dataverse
18:40 dataverse-user for some research purpose :)
18:48 pdurbin a mysterious research purpose :)
19:41 bjonnh pdurbin: https://github.com/bjonnh/molinfo
19:41 bjonnh I made that
19:41 bjonnh if you ever need to display molecules
19:41 bjonnh there is a public instance at https://mol.nprod.net/molecule/etc
19:42 bjonnh I have no idea how it is going to scale, but for now it is pretty fast
19:53 pdurbin wow, Kotlin
19:54 bjonnh yes I'm doing everything in Kotlin these days, but I'm also lurking at Scala and bare java
19:55 pdurbin Did I tell you my dad worked at CAS for 40 years?
19:56 bjonnh oh so you know everything about the mess chemistry is in :D
19:56 bjonnh it is great CAS finally accepted for Wikidata to have CAS numbers
19:57 pdurbin That's good. I hadn't heard of InChIKey.
20:00 pdurbin I'm getting a 404 from the public instance but I'm wondering if you can make a Dataverse external tool out of this. I'm not sure what the MIME type would be though. Which type of files would give this SMILES output?
20:03 bjonnh oh yeah you need to do a real link
20:04 bjonnh https://mol.nprod.net/molecule/smiles/CCCOCCC.svg
20:04 bjonnh like that
20:04 bjonnh for now CORS allows to use it from anywhere
20:04 bjonnh but I'm probably going to limit at some point if I see too much use
20:04 bjonnh I can put dataverse instances in there
20:04 bjonnh so the service does smiles 2 svg
20:04 bjonnh smiles 2 inchikey
20:05 pdurbin Or THIS: https://mol.nprod.net/molecule/smiles/CCC(C)N1C(=O)N(C=N1)C2=CC=C(C=C2)N3CCN(CC3)C4=CC=C(C=C4)OCC5COC(O5)(CN6C=NC=N6)C7=C(C=C(C=C7)Cl)Cl.svg
20:05 bjonnh curl https://mol.nprod.net/molecule/smiles/CCCOCCC/inchikey
20:05 pdurbin looks very nice
20:05 bjonnh the server is in Europe so it is a bit slowish
20:05 bjonnh I will host an instance here in Chicago
20:06 bjonnh I need to see if I want to connect the Redis together
20:06 bjonnh maybe not, that sounds like trouble
20:06 pdurbin Do people deposit .smi files? Is that a thing?
20:08 bjonnh that could be a thing
20:08 bjonnh or in metadata
20:08 bjonnh and you could use the service to display the structure from the string
20:08 pdurbin Well, Dataverse external tools operate on files.
20:09 bjonnh yeah but here you just have to <img src="https://mol.nprod.net/molecule/smiles/{smilesstring}.svg">
20:09 pdurbin Yeah, that's nice.
20:10 bjonnh we are also working on an external service for NMR visualization
20:10 bjonnh that could be integrated with dataverse as well. I'll bother you once we'll start thinking about that part
20:10 pdurbin Man, where's pameyer when we need him.
20:12 pdurbin I pinged him in Slack. He's our resident scientist.
20:12 pdurbin (We have plenty of social scientists.)
20:14 pdurbin Please take your time with your integrations but we do have our annual community meeting coming up in about a month if you have anything you want to show.
20:15 bjonnh ok I'll think about something
20:16 pdurbin Great. 7th annual if you can believe it.
20:17 bjonnh oh I thought it was much older than that
20:23 pameyer joined #dataverse
20:24 pameyer pdurbin: you rang? :)
20:24 pameyer bjonnh: I think dataverse is older than the community meetings
20:24 pdurbin pameyer: did I ever tell you bjonnh is a chemist? (I think.) And that he's building neat tools?
20:25 pameyer I had a vague idea that bjonnh was into nmr/chemistry type things
20:26 pameyer still getting caught up on the logs; but https://github.com/bjonnh/molinfo looks interesting
20:27 pameyer may have some overlap with pubchem APIs
20:29 pdurbin pameyer: we should make your molecule viewer into an external tool. You'd need to deposit an extra file, though.
20:29 pameyer one thing that might be a factor for integrations/external tools is urlencoding for SMI strings
20:30 pdurbin Hmm, but could package files have auxiliary files? I think the plan is for external tools to support aux files some day.
20:30 bjonnh o/
20:30 bjonnh yes I'm a natural product chemist
20:31 bjonnh pdurbin: pameyer: last big work: https://lotus.nprod.net/
20:31 bjonnh we posted millions of facts on wikidata
20:32 pameyer nice!
20:32 bjonnh pameyer: yes it has to be urlencoded for my service to work properly
20:32 bjonnh especially since many smiles have slashes in them
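Note on the URL encoding discussed above: a minimal sketch, in Python, of building an image URL for a SMILES string so that slashes and other reserved characters are percent-encoded. The base URL is the public instance mentioned in this log; the function name smiles_svg_url is hypothetical, and how the service treats each encoded character is an assumption based only on bjonnh's remark that the string has to be urlencoded.

```python
from urllib.parse import quote

# Public instance mentioned in the log; it may be moved or rate-limited.
BASE_URL = "https://mol.nprod.net/molecule/smiles"


def smiles_svg_url(smiles: str, base_url: str = BASE_URL) -> str:
    """Return the SVG URL for a SMILES string (hypothetical helper).

    safe="" makes quote() percent-encode "/" as well, which matters
    because many SMILES strings use slashes for double-bond geometry.
    """
    return f"{base_url}/{quote(smiles, safe='')}.svg"


if __name__ == "__main__":
    # A SMILES string with slashes (trans-2-butene).
    print(smiles_svg_url("C/C=C/C"))
```

The resulting string can then be used as the src of an img tag or inside a spreadsheet IMAGE() formula.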
20:32 bjonnh I also integrated it in google sheets
20:32 bjonnh so now we can display structures in there as well
20:32 pdurbin bjonnh: typo: "ressource" but yes, this is nice.
20:33 bjonnh that's the french way to write it…
20:34 pameyer if an external tool is pulling SMI from a file, I don't think there would be a problem.  but I don't _think_ external tools do url encoding - although I'm out of date enough on improvements that I could very well be wrong
20:34 bjonnh I could make it take URLs from dataverse yes
20:34 bjonnh I just need to make sure it only hits dataverse and only the right pages
20:35 bjonnh so it is not used to DDOS you
20:35 bjonnh pdurbin: I corrected the typo, the site is rebuilding. Thanks
20:35 pdurbin sure
20:36 pdurbin pameyer: yeah, it would have to come from a file
20:37 pameyer any idea if you can specify a single line from a file?
20:37 bjonnh oh you mean the file would contain many smiles and you say "take the one on line X"
20:37 pameyer ... and I didn't even notice the typo (maybe because of how many I make)
20:37 pameyer that's what I was thinking
20:38 pameyer I was assuming a SMI file inside dataverse, linking to an external tool
20:38 bjonnh ressource/resource litterature/literature are always an issue with me
20:38 pameyer SMI files I've seen usually have one compound per line
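Note on the "take the one on line X" idea above: a minimal sketch, in Python, of picking one compound out of the text of a SMI file. It assumes the layout pameyer describes (one compound per line) and, as a further assumption, that the SMILES string is the first whitespace-separated token with an optional name after it. The function name smiles_at_line is hypothetical and is not part of Dataverse or of bjonnh's service.

```python
def smiles_at_line(smi_text: str, line_number: int) -> str:
    """Return the SMILES string on the given 1-based line (hypothetical helper).

    Blank lines are skipped, so line_number counts non-blank lines only.
    """
    lines = [ln for ln in smi_text.splitlines() if ln.strip()]
    if not 1 <= line_number <= len(lines):
        raise IndexError(f"line {line_number} is out of range (1..{len(lines)})")
    # Assumption: the first token is the SMILES, anything after it is a name.
    return lines[line_number - 1].split()[0]


if __name__ == "__main__":
    sample = "CCO ethanol\nCCCOCCC dipropyl-ether\n"
    print(smiles_at_line(sample, 2))
```

A viewer could combine this with URL encoding of the selected string before requesting an image.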
20:38 bjonnh because in french we add letters…
20:38 bjonnh I can make it display anything, not just smiles
20:39 bjonnh that could be MOL, SDF…
20:39 bjonnh the library I use to handle the files is pretty versatile
20:39 bjonnh the good thing with the smiles is that I can cache it easily
20:39 pameyer more likely I missed some context while skimming the logs and misunderstood what you're thinking
20:41 bjonnh I was thinking about displaying structures in dataverse
20:41 bjonnh my initial idea was "when we have metadata SMILES, urlencode and put an <img src…"
20:41 bjonnh but I have no idea how doable that is
20:42 bjonnh (it could be a displayed_smiles or anything more specific as well)
20:42 pameyer think I've got it now
20:43 pameyer I don't know if the external tools stuff on the dataverse side is setup for doing things by metadata values
20:43 pdurbin Nope. Files only.
20:44 pameyer not even a dataset?
20:44 pameyer still do-able, but uglier
20:44 bjonnh can you direct me to an example for how these external tools are implemented?
20:44 pameyer external tool (datafile) -> native API (datafile -> dataset) -> native API (dataset -> metadata value)
20:45 pdurbin Sure, here's my tabular data that has a preview from an external tool: https://dataverse.harvard.edu/file.xhtml?persistentId=doi:10.7910/DVN/TJCLKP/3VSTKY&version=3.0
20:46 pdurbin There are actually two external tools for this file type. If you click "Access File" you'll see two tools: Data Explorer and View Data.
20:47 bjonnh oh so that's an iframe to an external page
20:47 pdurbin External tools need to know what MIME type to operate on. These tools work with tabular data.
20:47 bjonnh yeah in my case mime type is svg something like image/svg+xml
20:47 bjonnh oh you mean in the rep
20:48 bjonnh not sure smiles files have a mime type…
20:48 bjonnh oh they do: chemical/x-daylight-smiles
20:48 pdurbin Yeah. If you click "Explore on View Data" it'll open in a new window and you'll see the tool is actually hosted on GitHub Pages. It's just some HTML and Javascript.
20:48 bjonnh neat
20:48 bjonnh that's exactly what we will need for the NMR stuff
20:48 pameyer a surprising number of scientific data files have mime types - at least the old ones
20:49 bjonnh is there an official process for registering mimetypes?
20:49 pdurbin yep
20:49 pdurbin through Internet Assigned Numbers Authority (IANA)
20:53 bjonnh wow ok that's a complicated process
20:53 bjonnh I created an issue on our nmr viewer side
20:53 bjonnh see if we want to go that route
20:53 bjonnh we => they
20:54 pdurbin probably takes a while
20:55 pameyer if both sites cooperate, you could probably use an unregistered one
20:55 pdurbin I think my name still might be on an OID for SNMP for a previous employer. That's also through IANA.
20:56 bjonnh ohh SNMP I remember dealing with that when setting up my switch supervision
20:56 bjonnh (with prometheus and grafana)
20:58 pdurbin Yep, still there. 16789 at http://www.iana.org/assignments/enterprise-numbers/enterprise-numbers
20:59 pdurbin Any hoo. I gotta run. A pleasure chatting, as always. See you tomorrow.
20:59 pdurbin left #dataverse
21:00 pameyer enjoy the good weather pdurbin
21:08 pameyer bjonnh - it might also be worth double-checking that external tools stuff is happy with svg.  I'd imagine it should be, but I remember that trying to use svg for a dataverse logo didn't work.  that's a slightly different case (since those images get resized server-side)
21:08 pameyer there have been a few times I've tripped over my assumption that things on the web handling images will actually take svg
21:08 bjonnh I could add arguments to constrain size if needed
21:09 bjonnh but that's another caching issue
21:09 bjonnh yeah svg can be weird depending on how you constrain size
21:13 pameyer yeah - and svg viewports can be odd too
21:14 pameyer but there are also cases that don't accept svg as a valid image format (google docs is the one that sticks in my head most)
21:15 pameyer there's always svg -> png; but then you lose the vector graphics advantages because of tool limitations
21:22 bjonnh google sheets does
21:23 bjonnh I've integrated my service in it
21:23 bjonnh never tried in docs
21:23 bjonnh oh yeah docs doesn't like it
21:23 bjonnh interesting
21:24 bjonnh they suppors emf and not svg…
21:24 bjonnh support
21:24 bjonnh yes I have everything ready to add the png support, no need for conversion
21:24 pameyer huh - I didn't know google sheets took it
21:25 pameyer but I have the bad habit of trying to use svg in presentations
21:25 pameyer probably worried about javascript or external resources in svg
21:25 bjonnh yeah try =IMAGE("https://mol.nprod.net/molecule/smiles/CCCOCCC.svg")
21:25 bjonnh I was just wondering about turning the PNG output on, because that's bigger output, more work on the server…
21:26 bjonnh and I would like people to use vector graphics more
21:26 pameyer me too - hopefully I'm being over-cautious, and it'll just work
21:26 bjonnh in any case I can activate png mode in a couple of lines
21:27 pameyer =IMAGE("https://mol.nprod.net/molecule/smiles/CCCOCCC.svg") is very cool :)
21:29 bjonnh yes and it scales
21:29 bjonnh I have spreadsheets connected to Wikidata now
21:29 bjonnh so I can display the structures right there
21:30 bjonnh I had 1400 svgs on one sheet once, and it worked pretty well, despite the servers being thousands of km away
21:30 bjonnh curiously, it is not the webbrowser that gets the svg, it gets downloaded by google servers
21:30 bjonnh I wouldn't have expected that and it adds one more layer of caching
21:32 pameyer google's cache may be a good approximation of a backup copy of the internet
21:33 bjonnh I wish there was a working alternative to google sheets
21:34 bjonnh also I heard they are switching to canvas based rendering
21:34 bjonnh so it is going to be much snappier
21:34 bjonnh making it even harder to compete with
21:36 pameyer I still get cranky about web pages requiring javascript... but that argument was lost long ago

