🤔 Problem

We have a grant from the ETH ORD committee to integrate Renku with data repositories, specifically EnviDAT (WSL) and SciCAT (PSI). We want users to be able to enter the DOI of a dataset from one of these repositories into Renku and connect to its data.

🍴 Appetite

6 weeks

🎯 Solution

UX

The UX should work exactly like the Zenodo integration: the user enters a DOI and gets a Global Data Connector.

Technical Solution

rclone has to work with the responses from the EnviDAT API. The EnviDAT API has a resolver that resolves a DOI to JSON-LD, but our changes to rclone (upstream) expect an endpoint that resolves a DOI to a plain JSON response. So we have 2 options:

  1. [preferred option] Make the DOI option in rclone work with JSON-LD (in addition to JSON)

    1. Pros: More consistent; EnviDAT is usable via the rclone CLI
    2. Cons: An upstream contribution to rclone might be slow to land, so we would have to run off a fork for a while again

    NOTE: Add this to our fork of rclone first

  2. [backup option] Parse the JSON-LD response from EnviDAT on the Renku side and then launch a session as usual. In this case csi-rclone will use an S3 source, not a DOI source, i.e. the rclone CLI will not work with EnviDAT data sources.

    1. Pros: Faster to deliver
    2. Cons: Uglier, and possibly harder to maintain
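
To make the backup option more concrete, here is a minimal Python sketch of parsing a JSON-LD record on the Renku side and turning it into S3 locations. Everything specific in it is an assumption, not the actual EnviDAT response format: the record is assumed to be schema.org-like with files under `distribution` / `contentUrl`, the URLs are assumed to be path-style (`https://<endpoint>/<bucket>/<prefix>`), and `S3Source`, `s3_sources_from_jsonld` and `SAMPLE_RECORD` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class S3Source:
    """Hypothetical container for the pieces an S3 data source needs."""

    endpoint: str
    bucket: str
    prefix: str


def s3_sources_from_jsonld(jsonld_text: str) -> list[S3Source]:
    """Extract S3-style locations from a schema.org-like JSON-LD record."""
    record = json.loads(jsonld_text)
    # Assumption: files are listed under "distribution" with a "contentUrl".
    distributions = record.get("distribution", [])
    if isinstance(distributions, dict):  # a single object instead of a list
        distributions = [distributions]

    sources = []
    for dist in distributions:
        url = dist.get("contentUrl") if isinstance(dist, dict) else None
        if not url:
            continue
        parsed = urlparse(url)
        # Assumption: path-style URL, i.e. https://<endpoint>/<bucket>/<prefix>
        bucket, _, prefix = parsed.path.lstrip("/").partition("/")
        if not bucket:
            continue
        sources.append(
            S3Source(f"{parsed.scheme}://{parsed.netloc}", bucket, prefix)
        )
    return sources


# Made-up record for illustration only; not a real EnviDAT response.
SAMPLE_RECORD = """
{
  "@context": "http://schema.org",
  "@type": "Dataset",
  "name": "Example dataset",
  "distribution": [
    {"@type": "DataDownload",
     "contentUrl": "https://s3.example.org/example-bucket/some/prefix/"}
  ]
}
"""

if __name__ == "__main__":
    for source in s3_sources_from_jsonld(SAMPLE_RECORD):
        print(source)
```

The resulting endpoint, bucket and prefix are what would be handed to the S3 source used by csi-rclone; how the real response maps onto those fields would need to be checked against the EnviDAT API.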

🐰 Rabbit Holes

We expect SciCAT’s implementation to follow. We’d like to set ourselves up so that adding SciCAT after this takes minimal effort.
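
One possible way to keep a later SciCAT addition small is to isolate the provider-specific parsing behind a small registry, so a new repository only needs one new parser function. The sketch below is an illustration under assumptions, not an existing Renku design: `register_provider`, `file_urls` and the parser are hypothetical names, and the `distribution` / `contentUrl` layout is an assumed response shape.

```python
import json
from typing import Callable, Dict, List

# Hypothetical: a parser turns a provider's metadata response into file URLs.
MetadataParser = Callable[[str], List[str]]

_PARSERS: Dict[str, MetadataParser] = {}


def register_provider(name: str) -> Callable[[MetadataParser], MetadataParser]:
    """Decorator that registers a parser under a provider name."""

    def decorator(parser: MetadataParser) -> MetadataParser:
        _PARSERS[name] = parser
        return parser

    return decorator


@register_provider("envidat")
def parse_envidat(response_text: str) -> List[str]:
    # Assumption: schema.org-like record with "distribution" / "contentUrl".
    record = json.loads(response_text)
    return [
        dist["contentUrl"]
        for dist in record.get("distribution", [])
        if "contentUrl" in dist
    ]


def file_urls(provider: str, response_text: str) -> List[str]:
    """Dispatch to the parser registered for the given provider."""
    parser = _PARSERS.get(provider)
    if parser is None:
        raise ValueError(f"no parser registered for provider {provider!r}")
    return parser(response_text)
```

With this shape, supporting SciCAT would mean adding one more function decorated with `@register_provider("scicat")`, leaving the calling code unchanged.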

🔐 Security Implications

🙅‍♀️ No-gos