Here we show you how to get started and how a few simple features work.
Gigantum starts on the Projects listing page.
- To create a new Project, click on the
- Enter a title made up of lowercase alphanumeric characters and hyphens. No spaces or capitals!
- Put in a description. This is a short record of why you made the Project.
Create your first Project by choosing a title and description.
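The title rule above can be checked before you type a title into the dialog. The following is a minimal sketch, assuming the rule is exactly "lowercase letters, digits, and hyphens". The helper name `is_valid_title` is hypothetical and not part of Gigantum, and the client may enforce additional constraints (such as length) that are not modeled here.

```python
import re

# Assumed pattern: one or more lowercase letters, digits, or hyphens.
_TITLE_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_title(title: str) -> bool:
    """Return True if the title uses only lowercase alphanumerics and hyphens."""
    return _TITLE_PATTERN.fullmatch(title) is not None


print(is_valid_title("my-first-project"))  # True
print(is_valid_title("My First Project"))  # False
```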
- Next pick a base for your Project.
A base is a pre-built Docker image that includes the operating system, development tools, and optionally a preconfigured set of packages. It is the starting point for your work environment.
- For this simple walk-through, click the python3 tab and choose the Select button on the Python3 Minimal card. The Python3 Minimal base has no pre-installed packages so it should download faster.
- Click Create Project to create the new Project.
Another available base is Python3 Data Science Quick-Start, which has a pre-installed set of data science packages like numpy, scipy, scikit-learn, and matplotlib. This will take longer to download.
Once creation of the Project is complete, the app takes you to the Project Overview. The Overview is a quick look at the Project and includes things like recent Activity, a summary of the Environment configuration, and Favorite files.
At the top right of the Project Overview is the container status widget that also lets you run or stop the Project.
- Start JupyterLab.
- In the upper right corner is a status widget for the Project state. It is both a state indicator (e.g. stopped, building, running) and a button to change states.
- For a new Project, the initial status is
- Mouse over the button and click on Run to start the Project. The status will transition to Starting, and then to
- Click on the Launch: jupyterlab button to open JupyterLab in a new browser tab. Just make sure the popup doesn't get blocked!
You now have two open browser tabs, one with Gigantum and one with JupyterLab.
Use the Project Run Widget to control the Project state.
JupyterLab is now running! Let's start with the basics.
When JupyterLab is first opened you will be dropped into the code directory. This is where your Jupyter notebooks should be stored. If you click on the home button, you will see the code directory as well as two other directories, input and output. The input directory is where all of your input data should be placed. The output directory is where you should put your program's output.
If you don't like working with relative paths, the absolute path to these directories is always available when working in a Project from the environment variables
Use these directories to help you, collaborators, and the world know where to look for your stuff when opening your Project.
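From a notebook in the code directory, the sibling directories can be reached with relative paths. The sketch below assumes the layout described above (code, input, and output side by side) and uses relative paths only. The helper name `project_dirs` is hypothetical and not part of Gigantum.

```python
from pathlib import Path


def project_dirs(code_dir="."):
    """Return the input and output directories that sit next to the code directory."""
    # Assumption: code, input, and output share the same parent directory.
    root = Path(code_dir).resolve().parent
    return root / "input", root / "output"


input_dir, output_dir = project_dirs()
print(input_dir.name, output_dir.name)  # input output
```

In a notebook you could then read from `input_dir / "some-file.csv"` and write results under `output_dir`, where the file name is only a placeholder.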
- Create a notebook by clicking
Notebook section of the
print("hello, world!") in the first cell and run it.
- Switch back to your browser tab that has Gigantum running
- Go to the Activity Feed
Save your work!
Gigantum tracks everything you do and keeps a record of every execution, but you still have to save your work to write to disk. Make sure to use the JupyterLab interface to save changes to your notebooks as you normally would.
Congrats! You just created your first activity entry in Gigantum.
We know that printing "hello, world" is pretty exciting, but you will typically need additional packages installed to do your work. Luckily, in Gigantum you can change the environment with package managers like pip, conda, and apt.
Let's add the latest versions of numpy and matplotlib as a demonstration:
- Go to the Project in your browser.
- Use the status widget to Stop the Project.
- Navigate to the Project "Environment" section.
- Click on the pip tab and then expand the "Add Packages" section.
- Enter the packages to install (e.g. numpy, matplotlib).
- Click Install Selected Packages.
- Wait for the Project state to go from "Building" to "Stopped".
- Start the Project with the status widget and open up JupyterLab again.
Use the "Add Packages" section to add numpy and matplotlib
There are a few things to keep in mind when adding packages with Gigantum.
- You must Stop the Project before changing packages!
- You can specify a version of the package or leave it blank. If the version is omitted, the latest version of the package will be installed.
- Gigantum always "pins" versions when possible to help provide reproducibility, but apt managed packages currently are not pinned.
- conda can take a long time to return from queries and install commands, so be patient when using it.
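After the rebuild, you can confirm from a notebook cell which versions were actually installed. This is a minimal sketch using Python's standard `importlib.metadata` module (available from Python 3.8). The helper name `installed_versions` is hypothetical, and the package names are just the ones used in this walk-through.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


for name, ver in installed_versions(["numpy", "matplotlib"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

If a package shows as not installed, check that the Project finished building and was restarted before JupyterLab was reopened.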
Finally, you can also use a "Custom Docker snippet" to install things that aren't available in package managers. Simply enter valid Dockerfile instructions and Gigantum will include them in the build process. Currently only ENV and comments are supported.
Containers (for the curious)
A Project is a repository of code, data, and environment configuration, all enhanced with a searchable activity feed. The development tools are launched in a Docker container built from a stored environment configuration, and Gigantum makes Project managed data available to the container.
You can't change the environment configuration of a Running Project. To reconfigure the environment or available tools, you must first Stop the Project. This is because the Docker container is rebuilt when packages are added or removed, and this can't be done while the container is running.
All other Gigantum features work at all times. For example, you can add and remove data from
When you Run a Project and work in JupyterLab, Gigantum tracks your activity and builds a rich, searchable history based on the Jupyter notebooks you use. Everything is in the Activity Feed, which you can find by clicking on the Activity tab in the Project Overview.
Let's add some code to a Jupyter notebook to see the Activity Feed in action. We can also verify that the environment changes we just made worked.
Run your Project and open Jupyter as before. The snippet below is a simplified version of a matplotlib demo.
```python
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm
from matplotlib.ticker import MaxNLocator
import numpy as np

# generate data
dx, dy = 0.05, 0.05
y, x = np.mgrid[slice(1, 5 + dy, dy), slice(1, 5 + dx, dx)]
z = np.sin(x)**10 + np.cos(10 + y*x) * np.cos(x)
z = z[:-1, :-1]
levels = MaxNLocator(nbins=15).tick_values(z.min(), z.max())

# set colormap and normalize
cmap = plt.get_cmap('PiYG')
norm = BoundaryNorm(levels, ncolors=cmap.N, clip=True)
fig, (ax0) = plt.subplots(nrows=1)

# configure contours and output
cf = ax0.contourf(x[:-1, :-1] + dx/2., y[:-1, :-1] + dy/2., z, levels=levels, cmap=cmap)
fig.colorbar(cf, ax=ax0)
ax0.set_title('contourf with levels')
fig.tight_layout()
plt.savefig("../output/pcolormesh")
```
Paste it into a cell in your Jupyter notebook, click save, and execute the cell.
The executed example code in a Jupyter notebook
Once the cell has successfully executed, Gigantum will automatically extract useful information (e.g. the file that has changed, code executed, and figure created) and create a new version of your Project.
If you navigate back to the browser tab running Gigantum and select the Activity tab, you'll see a new Activity Record summarizing what was done. Here, Gigantum has captured the output, linked with the code and environment that generated it.
Example Activity Record
Each record in the activity feed is a version that can be reproduced, modified, or shared. A user can go back to any point in time to recreate their previous work and use previous results as a starting point for new data explorations!
When you are done working on your Project, click "Stop" to shut off the container and clean up resources.
Safest to Stop!
You only need to stop the Gigantum Client if you are shutting your computer down or no longer want to use the client. Putting your computer to sleep won't interfere with Gigantum, although there have been some issues reported with Docker for Windows when entering sleep mode!
Sometimes you don't want to keep data in a Project because it is sensitive, too big, etc. If that is the case, then you just need to create an "untracked" Project. If you click the "Input/Output Version Tracking Enabled" switch when first creating a Project, the input and output sections of the Project will not be tracked or synced with Gigantum Cloud. That way the data won't ship when you share the Project.