Gigantum is a new approach to data science that lets users work wherever they want while still having a managed experience. It automates user-level tasks for tools like Git and Docker, making skilled data scientists faster and new data scientists more skilled.
The only dependency is Docker, which means Gigantum can be used on just about any compute resource, from laptops and workstations to cloud instances and on-premise servers.
In addition to a managed work experience, Gigantum provides a user-friendly, decentralized system for sharing work across machines and people with a single click.
Gigantum is broken into two major components in a “client” + “server” model. This is analogous to Git + GitHub, but with a much higher degree of local automation and different concerns.
The core is the open source Gigantum Client, which is a web application that can be run anywhere to manage the development, execution, versioning and containerization of work primarily in Python and R.
The Client's UI integrates with browser-based data science environments like Jupyter (Classic and Lab) and RStudio.
The Client makes sure that everything done in Gigantum is reproducible, transparent and portable from the moment you start working and can be shared with the click of a button.
You can install it locally from our download page. Docker is required, but the installer will walk you through this process if needed.
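If you would like to check for Docker yourself before installing, the following is a minimal sketch rather than part of Gigantum. It assumes the Docker CLI is named `docker` and that `docker info` exits with a non-zero status when the daemon cannot be reached. The helper names `find_docker` and `docker_daemon_running` are hypothetical.

```python
import shutil
import subprocess


def find_docker(which=shutil.which):
    """Return the path to the docker executable, or None if it is not on PATH."""
    return which("docker")


def docker_daemon_running(docker_path, run=subprocess.run):
    """Return True if `docker info` exits successfully.

    Assumption: a zero exit status means the Docker daemon is reachable.
    """
    try:
        result = run(
            [docker_path, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        # The executable could not be launched or did not answer in time.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    path = find_docker()
    if path is None:
        print("Docker was not found on PATH; the installer can walk you through setup.")
    elif docker_daemon_running(path):
        print(f"Docker found at {path} and the daemon is responding.")
    else:
        print(f"Docker found at {path}, but the daemon does not appear to be running.")
```

Both helpers accept an injectable callable (`which`, `run`) so the check can be exercised on a machine without Docker installed.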
Gigantum Hub is a service to back up and share your work between different machines and people. It provides storage, sharing, and collaboration features, as well as access to a Gigantum Client running ephemerally in the cloud for computation from anywhere.
Gigantum Hub is available as a SaaS offering for storage and limited compute for convenience. Free accounts come with 5GB of storage and 5 hours of compute per month. Standard and Pro tier accounts are available if you need more storage or compute in Gigantum Hub (remember, it's always free to run the Client and compute wherever you'd like). More information is available on our pricing page.
Gigantum organizes work into Projects, which are comprehensive repositories for code, data, environment configuration and history. Users don't need to know Git or Docker to work with them because Projects are managed for you by the Client.
Gigantum also has a separate repository type, called a Dataset, for managing data independently from a Project. Datasets are useful for organizing and versioning files when the number or size of files is large enough to slow Git down significantly, or when you want to reuse data across Projects. Datasets are also managed by the Client, and you can read more about them in the Datasets section.