The paper "In defense of decentralized research data management" (dRDM) reviews how the requirements of different scientific stakeholders can be addressed by a scalable and uniform dRDM solution such as DataLad, which is used as an infrastructure component or utility in a growing number of services and software packages.

DataLad and its datasets are compatible with a number of existing portals and storage solutions:

  • DataLad datasets can be hosted on GIN, and any data hosted on GIN can be accessed via DataLad. Moreover, the GIN service is also available for local deployment, offering a convenient, in-house storage back-end and web UI for DataLad datasets.
  • With the datalad-osf extension package, DataLad datasets, with all data file content and version history, can be hosted on the Open Science Framework (example study).
  • The datalad-ukbiobank extension package represents UK Biobank data as extensible DataLad datasets that can monitor future updates.
  • DataLad datasets can be hosted on any Git hosting portal, such as GitHub or GitLab, although without the file content managed by git-annex.
  • DataLad datasets can be "linked" to a wide range of data hosting services (e.g., AWS S3 and Glacier) to offload data to online storage.
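
As an illustration of the GIN item above, the following minimal sketch plans the two DataLad commands typically used to obtain a hosted dataset: `datalad clone` to fetch the dataset itself, then `datalad get` to retrieve file content. The repository name `example-lab/example-study`, the helper names `plan_gin_retrieval` and `run_plan`, and the `https://gin.g-node.org/<owner>/<repository>` URL layout are assumptions made for this example, not details taken from the text above. The sketch defaults to a dry run, so it only prints the commands unless asked to execute them.

```python
import subprocess
from typing import List, Tuple

# One step = (command line, directory to run it in).
Step = Tuple[List[str], str]


def plan_gin_retrieval(repo: str, target: str) -> List[Step]:
    """Plan the commands for fetching a dataset from GIN.

    `repo` is a hypothetical "<owner>/<repository>" name; the URL layout
    below is an assumption for this sketch.
    """
    url = f"https://gin.g-node.org/{repo}"
    return [
        # Step 1: clone the dataset (file content is not downloaded yet).
        (["datalad", "clone", url, target], "."),
        # Step 2: inside the clone, request the content of all files.
        (["datalad", "get", "."], target),
    ]


def run_plan(plan: List[Step], dry_run: bool = True) -> None:
    """Print each planned command, or execute it when dry_run is False."""
    for argv, cwd in plan:
        if dry_run:
            print(f"(in {cwd}) {' '.join(argv)}")
        else:
            # Requires the datalad command-line tool to be installed.
            subprocess.run(argv, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical repository name, used only for illustration.
    run_plan(plan_gin_retrieval("example-lab/example-study", "example-study"))
```

Keeping the plan separate from its execution makes it possible to inspect the commands before anything is downloaded. Calling `run_plan(..., dry_run=False)` would run them and requires a working DataLad installation.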