Creating Your First Gigantum Project

If this is your first time working in Gigantum, you should read this whole page.

Create a Project

Gigantum starts in the Projects listing page.

  1. To create a new Project, click on the Add Project card
  • Enter a title made up of lowercase letters, numbers, and hyphens. No spaces or capitals!
  • Put in a description. This is a short record of why you made the Project.
Create your first Project by choosing a title and description.
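
If you want to check a title before typing it in, the naming rule above can be sketched in Python. This is only an illustration: the regular expression encodes the rule as stated on this page (lowercase letters, digits, and hyphens), and is_valid_title and suggest_title are hypothetical helpers, not part of Gigantum, which may apply further restrictions of its own.

```python
import re

# Assumption: the rule is exactly "lowercase letters, digits, and hyphens".
TITLE_PATTERN = re.compile(r"[a-z0-9-]+")


def is_valid_title(title):
    """Return True if the title uses only lowercase letters, digits, and hyphens."""
    return TITLE_PATTERN.fullmatch(title) is not None


def suggest_title(text):
    """Turn free text into a candidate title (hypothetical convenience helper)."""
    # Lowercase, replace every run of other characters with one hyphen,
    # then trim hyphens left over at either end.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


print(is_valid_title("my-first-project"))
print(is_valid_title("My First Project"))
print(suggest_title("My First Project"))
```

The suggestion step simply replaces anything that is not allowed with a hyphen, so you may still want to tidy the result by hand.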

  2. Next, pick a base for your Project.

A base is a pre-built Docker image that includes the operating system, development tools (e.g. Jupyter or RStudio), and optionally a preconfigured set of packages. It is the starting point for your work environment.

  • For this simple walk-through, click the python3 tab and choose the Select button on the Python3 Minimal card. The Python3 Minimal base has no pre-installed packages so it should download faster.
  • Click Create Project to create the new Project.

Another available base is Python3 Data Science Quick-Start, which has a pre-installed set of data science packages like numpy, scipy, scikit-learn, and matplotlib. This base will take longer to download.

Once creation of the Project is complete, the app takes you to the Project Overview. The Overview is a quick look at the Project and includes things like recent Activity, a summary of the Environment configuration, and Favorite files.

At the top right of the Project Overview is the container status widget that also lets you run or stop the Project.

  3. Start JupyterLab.
  • In the upper right corner is a status widget for the Project state. It is both a state indicator (e.g. stopped, building, running) and a button to change states.
  • For a new Project, the initial status is Stopped.
  • Mouse over the button and click on Run to start the Project. The status will transition to Starting, and then to Running.
  • Click on the Launch: jupyterlab button to open JupyterLab in a new browser tab. Just make sure the popup doesn't get blocked!

You now have two open browser tabs, one with Gigantum and one with JupyterLab.

Use the Project Run Widget to control the Project state.

Print Hello, World!

JupyterLab is now running! Let's start with the basics.

When JupyterLab first opens, you are dropped into the code directory. This is where your Jupyter notebooks should be stored. If you click on the home button, you will see the code directory as well as two other directories, input and output. Place all of your input data in the input directory, and write your program's output to the output directory.

If you don't like working with relative paths, the absolute path to these directories is always available when working in a Project from the environment variables LB_CODE, LB_INPUT, and LB_OUTPUT.
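
As a minimal sketch, here is one way to read those variables from a notebook cell. The variable names LB_CODE, LB_INPUT, and LB_OUTPUT come from this page; the project_dir helper, the fallback directory names, and the results.csv filename are assumptions made only so the example also runs outside a Gigantum Project.

```python
import os
from pathlib import Path


def project_dir(var_name, fallback):
    """Return the directory named by an environment variable.

    The fallback is a placeholder used when the variable is not set,
    for example when running this cell outside Gigantum.
    """
    return Path(os.environ.get(var_name, fallback))


code_dir = project_dir("LB_CODE", "code")
input_dir = project_dir("LB_INPUT", "input")
output_dir = project_dir("LB_OUTPUT", "output")

# Build a path in the output directory (hypothetical filename).
results_file = output_dir / "results.csv"
print(code_dir, input_dir, results_file)
```

Building paths this way keeps a notebook working no matter which directory it is run from.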

Use these directories to help you, collaborators, and the world know where to look for your stuff when opening your Project.

  1. Create a notebook by clicking Python3 in the Notebook section of the Launcher pane.
  2. Enter print("hello world!") in the first cell and run it.
  3. Switch back to your browser tab that has Gigantum running.
  4. Go to the Activity Feed.


Save your work!

Gigantum tracks everything you do and keeps a record of every execution, but you still have to save your work to write to disk. Make sure to use the JupyterLab interface to save changes to your notebooks as you normally would.

Congrats! You just created your first activity entry in Gigantum.

Environment Changes

We know that printing "hello world!" is pretty exciting, but you will typically need additional packages installed to do your work. Luckily, in Gigantum you can change the environment with package managers like pip, conda, and apt.

Let's add the latest versions of numpy and matplotlib as a demonstration:

  1. Go to the Project in your browser.
  2. Use the status widget to Stop your Project.
  3. Navigate to the Project "Environment" section.
  4. Click on the pip tab and then expand the "Add Packages" Section.
  5. Enter the packages to install (e.g. numpy, matplotlib).
  6. Click Install Selected Packages.
  7. Wait for the Project state to go from "Building" to "Stopped".
  8. Start the Project with the status widget and open up JupyterLab again.
Use "Add Packages" section to add numpy and matplotlib

There are a few things to keep in mind when adding packages with Gigantum.

  • You must Stop the Project before changing packages!
  • You can request a specific version of a package or leave the version blank. If the version is omitted, the latest version of the package will be installed.
  • Gigantum always "pins" versions when possible to help provide reproducibility, but note that apt managed packages currently are not pinned.
  • conda can take a long time to return from queries and install commands, so be patient when using it.
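
To see what a pinned version looks like in practice, you can ask Python which versions ended up installed once the Project is running again. This is a sketch, not a description of how Gigantum stores its pins: pinned_spec and installed_pins are hypothetical helpers, and the name==version form is just the usual pip-style way to write a pin. It relies only on importlib.metadata from the standard library (Python 3.8 or later).

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_spec(name, installed_version):
    """Format a pip-style pin such as 'numpy==<version>'."""
    return f"{name}=={installed_version}"


def installed_pins(package_names):
    """Map each package name to its pin, or to None if it is not installed."""
    pins = {}
    for name in package_names:
        try:
            pins[name] = pinned_spec(name, version(name))
        except PackageNotFoundError:
            pins[name] = None
    return pins


for name, spec in installed_pins(["numpy", "matplotlib"]).items():
    print(spec or f"{name} is not installed")
```

If a package is reported as not installed, check that the Project finished building before you started it.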

Finally, for things that aren't available in package managers, you can install them via a "Custom Docker snippet". Simply enter valid Dockerfile instructions and Gigantum will include them in the build process. Currently only RUN and ENV instructions and comments are supported.
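
If you are unsure whether a snippet sticks to the supported instructions, a rough check can be sketched in Python. This is an assumption-laden illustration and not Gigantum's own validation: unsupported_instructions is a hypothetical helper, the example snippet is made up, and the handling of backslash line continuations is deliberately simple.

```python
# Instructions this page lists as supported (comments are handled separately).
ALLOWED = ("RUN", "ENV")


def unsupported_instructions(snippet):
    """Return the instruction keywords in a snippet that are not RUN or ENV."""
    found = []
    continuing = False  # True while inside a backslash-continued instruction
    for line in snippet.splitlines():
        stripped = line.strip()
        if continuing:
            # Still part of the previous instruction; see if it goes on.
            continuing = stripped.endswith("\\")
            continue
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and comments are fine
        keyword = stripped.split(None, 1)[0].upper()
        if keyword not in ALLOWED:
            found.append(keyword)
        continuing = stripped.endswith("\\")
    return found


example = "\n".join([
    "# install a command line tool (made-up example)",
    "RUN apt-get update && \\",
    "    apt-get install -y curl",
    "ENV MY_SETTING=1",
    "COPY . /app",
])
print(unsupported_instructions(example))
```

Here the COPY line is the only one flagged, since it is neither RUN, ENV, nor a comment.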


Containers (for the curious)

A Project is a repository of code, data, and environment configuration, all enhanced with a searchable activity feed. The development tools are launched in a Docker container built from a stored environment configuration, and Gigantum makes Project-managed data available to the container.

You can't change the environment configuration of a Running Project. To reconfigure the environment or available tools, you must first Stop the Project. This is because the Docker container is rebuilt upon adding or removing packages, and this can't be done when the container is running.

All other Gigantum features work at all times. For example, you can add and remove data from Running or Stopped Projects.

The Activity Feed

When you Run a Project and work in JupyterLab, Gigantum tracks your activity and builds a rich, searchable history based on the Jupyter notebooks you use. Everything is in the Activity Feed, which you can find by clicking on the Activity tab in the Project Overview.

Let's add some code to a Jupyter notebook to see the Activity Feed in action. We can also verify that the environment changes we just made worked.

Run your Project and open JupyterLab as before. The snippet below is a simplified version of a matplotlib demo.

import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm
from matplotlib.ticker import MaxNLocator
import numpy as np

# generate data
dx, dy = 0.05, 0.05
y, x = np.mgrid[slice(1, 5 + dy, dy), slice(1, 5 + dx, dx)]
z = np.sin(x)**10 + np.cos(10 + y*x) * np.cos(x)
z = z[:-1, :-1]
levels = MaxNLocator(nbins=15).tick_values(z.min(), z.max())

# set colormap and normalize
cmap = plt.get_cmap('PiYG')
norm = BoundaryNorm(levels, ncolors=cmap.N, clip=True)
fig, (ax0) = plt.subplots(nrows=1)

# configure contours and output
cf = ax0.contourf(x[:-1, :-1] + dx/2.,
                  y[:-1, :-1] + dy/2., z, levels=levels,
                  cmap=cmap)
fig.colorbar(cf, ax=ax0)
ax0.set_title('contourf with levels')

Paste it into a cell in your Jupyter notebook, click save, and execute the cell.

The executed example code in a Jupyter notebook

Once the cell has successfully executed, Gigantum will automatically extract useful information (e.g. the file that has changed, code executed, and figure created) and create a new version of your Project.

If you navigate back to the browser tab running Gigantum and select the Activity tab, you'll see a new Activity Record summarizing what was done. Here, Gigantum has captured the output, linked with the code and environment that generated it.

Example Activity Record

Each record in the activity feed is a version that can be reproduced, modified, or shared. A user can go back to any point in time to recreate their previous work and use previous results as a starting point for new data explorations!

When you are done working on your Project, click "Stop" to shut off the container and clean up resources.


Safest to Stop!

You only need to stop the Gigantum Client if you are shutting your computer down or no longer want to use the client. Putting your computer to sleep won't interfere with Gigantum, although there have been some issues reported with Docker for Windows when entering sleep mode!