In each of the project's main directories, such as input and output, is a directory called untracked. As of client v1.3.0, these directories are created automatically if they don't already exist. Anything written to one of them, whether from inside a running project or from your host OS, is ignored when versioning, detecting changes, and syncing.
Several use cases for untracked directories are outlined below, and there are many more we haven't listed. We'd love to hear about any interesting uses you find, and you can send us any questions via our Spectrum Chat forum.
You can use untracked folders to store files that exceed the upload size restrictions, which exist to guarantee a project doesn't get too slow to use. Note that uploads through the browser are currently limited, but you can always copy data in on your host.
Sometimes you want to write out intermediate data that you don't necessarily want to version and keep around. This is a great reason to write to untracked folders in input or output.
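As a minimal sketch of this pattern, the snippet below writes a scratch file under output/untracked. It assumes the current directory is the project directory; the subdirectory name scratch and the file name partial-results.csv are hypothetical, chosen only for illustration.

```shell
# Sketch: keep intermediate files out of versioning and sync.
# Assumes the current directory is the project directory.
# "scratch" and "partial-results.csv" are hypothetical names.
scratch_dir="output/untracked/scratch"
mkdir -p "$scratch_dir"

# Write some intermediate data into the untracked folder.
printf 'id,value\n1,0.5\n' > "$scratch_dir/partial-results.csv"
```

Because the file lives under an untracked directory, you can delete or regenerate it at any time without creating new versions of the project.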
Finally, you might have sensitive data that you do not want synced and shared. If you place your data in the untracked folders, it will not be versioned or shared. Remember, your collaborators will need to obtain the files some other way and place them in the same location for your code to work! You may also want to read the Including Sensitive Information section, which outlines a few more ways to manage sensitive data.
In some cases you may have your data on a network share or a large external drive. While there are multiple ways to get access to such data inside your project, a general approach that works on macOS or Linux is to create a mount on the host inside an untracked folder. For example, you could run one of the following from your project directory (projects are located in ~/gigantum/your-username/project-owner/labbooks/project-name). Note that bind mounts are unfortunately only readily available on Linux:
cd ~/gigantum/your-username/project-owner/labbooks/project-name
mkdir input/untracked/my-huge-dataset
mount -o bind /mnt/my-huge-external-dataset input/untracked/my-huge-dataset

cd ~/gigantum/your-username/project-owner/labbooks/project-name
mkdir input/untracked/my-huge-dataset
mount -t nfs localhost:/my-huge-dataset input/untracked/my-huge-dataset
You could likewise create a mount in output/untracked if you were going to generate a large amount of output data. Once you've created such a mount, you'll be able to access those files the next time you launch your project. If you need to do something like this on Windows or want to discuss other alternatives, please drop us a line on Spectrum and we'll help you out! There are lots of different ways this could work that are not documented here.
For example, another option is to mount an entire volume into your untracked folder. This may be something you need to do on macOS, again depending on your setup. In this example, a USB drive is mounted into a subdirectory of the output/untracked folder of a project.
sudo mount -t msdos /dev/disk2s1 output/untracked/some-dir
In any case, we recommend documenting what you did inside your README. Note that you will need to manage synchronizing data between machines yourself. The mounts must also be created manually on each machine where you use the project, and again after every reboot (or you can use /etc/fstab - but that's beyond the scope of this article).
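Since mounts have to be recreated by hand, it can help to fail early when one is missing. The following is a minimal sketch, not part of the Gigantum client: dataset_available is a hypothetical helper, and the path simply reuses the mount point from the examples above. It only checks that the directory exists and is not empty, which is a rough stand-in for "the mount is in place".

```shell
# Sketch: fail early when an expected dataset is missing.
# dataset_available is a hypothetical helper, not a Gigantum command.
# Returns 0 if the directory exists and contains at least one entry.
dataset_available() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Path reused from the mount examples; adjust it to your own setup.
if dataset_available "input/untracked/my-huge-dataset"; then
    echo "dataset found"
else
    echo "dataset missing: create the mount on the host first" >&2
fi
```

A check like this could sit at the top of a script or notebook so collaborators see a clear message instead of a confusing file-not-found error.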
Another use case for untracked folders is checking out code from a Git repository to use in your Project. Our automated versioning doesn't currently play nicely with embedded Git repositories if you aren't careful. Additional Git repositories must be placed in untracked directories or manually added to the Project's
You can easily add a notebook inside your repository (e.g. 00-setup.Rmd) that includes a command like git clone https://github.com/me/my-repo untracked/my-repo. You can explain in a comment that this only needs to be done once. Depending on your audience, you might also include content in other notebooks that checks for the existence of that directory, additional comments, etc.
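The existence check mentioned above could look something like the sketch below. clone_once is a hypothetical helper name; it treats a destination that already contains a .git directory as "already cloned" and otherwise runs a plain git clone.

```shell
# Sketch: clone a repository into an untracked directory only once.
# clone_once is a hypothetical helper, not part of Gigantum or Git.
# Usage: clone_once <repo-url> <destination>
clone_once() {
    if [ -d "$2/.git" ]; then
        # A checkout is already present, so there is nothing to do.
        echo "$2 already exists; skipping clone"
    else
        git clone "$1" "$2"
    fi
}
```

With the placeholder repository from the text, the call would be clone_once https://github.com/me/my-repo untracked/my-repo, and running it a second time simply reports that the directory already exists.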