@ekansa @paregorios this all sounds like what I would do myself: 1) move away from GitHub (a single point of failure, and not very good for "big data"); 2) leverage the Zenodo API with versioning; 3) one dataset = one archive file (with a descriptor? e.g. a datapackage.json or similar metadata).
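
To make point 3 concrete, here is a minimal sketch of "one dataset = one archive file with a descriptor", using only the Python standard library. The function name `build_dataset_archive`, the dataset name, the version string and the file layout are all hypothetical, and the descriptor only fills in a few Data Package style fields (`name`, `version`, `resources` with `path`, `bytes`, `hash`); treat it as an illustration, not a complete or validated datapackage.json.

```python
import hashlib
import json
import zipfile
from pathlib import Path


def build_dataset_archive(name, version, files, out_path):
    """Write one zip archive holding the data files plus a datapackage.json.

    `files` maps a relative path inside the archive to its raw bytes.
    All names and fields here are illustrative assumptions.
    """
    resources = []
    for rel_path, payload in sorted(files.items()):
        resources.append({
            "name": Path(rel_path).stem.lower(),
            "path": rel_path,
            "bytes": len(payload),
            # Checksum lets a consumer verify each file after download.
            "hash": "sha256:" + hashlib.sha256(payload).hexdigest(),
        })
    descriptor = {"name": name, "version": version, "resources": resources}

    with zipfile.ZipFile(out_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # The descriptor travels inside the archive, next to the data.
        zf.writestr("datapackage.json", json.dumps(descriptor, indent=2))
        for rel_path, payload in files.items():
            zf.writestr(rel_path, payload)
    return descriptor
```

Something like `build_dataset_archive("places", "1.0.0", {"data/places.csv": csv_bytes}, "places-1.0.0.zip")` would then give a single self-describing file per dataset, which could be the one file deposited for each new version on Zenodo.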