Gigantum

Datasets Overview

A quick summary of Gigantum Datasets

What is a Dataset?

A Gigantum Dataset is a repository, similar to a Project, for the sole purpose of managing data. While Datasets share many features and UI elements with Projects, they are fundamentally different: only metadata is embedded in the Dataset, rather than the full file contents, as in a Project. This provides many benefits, including fast sync times, partial downloads, and the ability to de-duplicate files across versions and Projects.
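
To illustrate why tracking metadata alone makes de-duplication possible, here is a minimal sketch of a content-addressed manifest. It is not Gigantum's actual manifest format or API: the function names `build_manifest` and `objects_to_upload`, and the choice of SHA-256 as the content hash, are assumptions made purely for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, Set


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents.

    Hypothetical helper: only this small mapping would need to be versioned,
    while the file contents live in a separate storage backend.
    """
    manifest: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def objects_to_upload(manifest: Dict[str, str], already_stored: Set[str]) -> Set[str]:
    """Return the content hashes that the backend does not hold yet.

    Files with identical contents share one hash, so they are stored once
    no matter how many paths or versions reference them.
    """
    return set(manifest.values()) - already_stored
```

In this sketch, two files with identical contents produce the same hash, so only one object would be transferred, and a file that is unchanged between versions would not be transferred again.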

With Datasets, the actual management and storage of data during syncing is delegated to the Dataset's storage backend, which can easily be modified to integrate with various existing services. This design allows Datasets to be flexible and to act as an integrator for the many ways and places in which data is stored, all through a single user interface that continues to "level the playing field" when it comes to the skills required to perform data analysis.

An example Dataset

Currently, the only Dataset backend is provided by Gigantum Cloud, but additional Dataset types will be available in the future. The Gigantum Cloud Dataset type supports individual files up to 15 GB in size and can handle many files in a single Dataset.

How Do You Use a Dataset?

Datasets enable the independent management of data, which may be useful for publishing data alone, but at the end of the day you will want to work with a Dataset inside a Project. To do this, you must "link" the Dataset to the Project. When you link a Dataset, you create a reference in the Project to a specific version of the Dataset. Currently, you can only link to the latest version of the Dataset.

Once a Dataset is linked, any files in that Dataset that exist locally will be mounted into the Project container at runtime. The files will appear in the input directory as a folder with the same name as the Dataset. Also, note that files will be read-only, as they can be mounted into any Project to which the Dataset is currently linked.
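
As a rough sketch of how code running in a Project might find the files of a linked Dataset, the helper below lists everything under the Dataset's folder in the input directory. The function name `list_dataset_files` is hypothetical, and the location of the input directory is passed in as a parameter because its exact path inside the container is not stated here and is treated as an assumption.

```python
from pathlib import Path
from typing import List


def list_dataset_files(input_dir: Path, dataset_name: str) -> List[str]:
    """List files available under <input_dir>/<dataset_name>.

    Hypothetical helper. Only files that exist locally are mounted, so the
    result may be a subset of the Dataset. Returns an empty list if the
    Dataset folder is not present (for example, if it is not linked).
    """
    dataset_dir = input_dir / dataset_name
    if not dataset_dir.is_dir():
        return []
    return sorted(
        p.relative_to(dataset_dir).as_posix()
        for p in dataset_dir.rglob("*")
        if p.is_file()
    )
```

Because the mounted files are read-only, any results derived from them should be written somewhere else in the Project rather than back into the Dataset folder.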

To update files in a Dataset, you must update the Dataset itself and then update the reference to the Dataset in the desired Project.

You can learn more about working with Datasets here.
