DAMONA¶
- Python version:
Python 3.8, 3.9, 3.10
- Source:
- Issues:
Please file a report on GitHub
- Platform:
This is currently only available for Linux distributions with zsh/fish/bash shells (contributions are welcome to port the tool to other platforms/shells)
Quick Start¶
Assuming Singularity (Apptainer) is installed on your system, install Damona using pip for Python:
pip install damona
You need to configure Damona before using it. In a bash shell, type:
damona
Add these lines in your .bashrc:
if [ -f "$HOME/.config/damona/damona.sh" ] ; then
source "$HOME/.config/damona/damona.sh"
fi
Open a new shell and then use damona:
damona create TEST
damona activate TEST
damona install fastqc
fastqc
This should install fastqc in the newly created Damona environment (TEST). Type:
fastqc
It should open a window with the fastqc interface.
Overview¶
Damona is a singularity environment manager.
Damona started as a small collection of singularity recipes to help install third-party tools for Sequana NGS pipelines.
Damona is now used in production to create reproducible environments where singularity images and their associated binaries are installed altogether.
In a nutshell, Damona combines the logic of Conda environments with the reproducibility of singularity containers. We believe that it could be useful for other projects and therefore decided to release it as an independent tool.
As of 11 January 2024, Damona contains 82 software packages, 128 releases and 456 binaries.
Installation¶
If you are in a hurry, just type:
pip install damona --upgrade
You must install Singularity to make use of Damona.
If you are familiar with conda, I believe you can do:
conda install singularity
Type damona in a shell. This initializes the tool by creating a config file and the shell scripts (for bash, fish and zsh users) in your $HOME/.config/damona directory.
Bash users should add this code in their ~/.bashrc file:
source ~/.config/damona/damona.sh
Fish shell users should add the following code in their ~/.config/fish/config.fish file:
source ~/.config/damona/damona.fish
Zsh users should add the following code in their ~/.zshrc file:
source ~/.config/damona/damona.zsh
Open a new shell and you are ready to go. Please see the Installation in details section for more information.
Quick Start¶
Damona needs environments to work with. First, let us create one, which is called TEST:
damona create TEST
Second, we need to activate it. Subsequent installations will happen in this environment:
damona activate TEST
From there, we can install some binaries/images:
damona install fastqc:0.11.9
That's it. Time to test. Type fastqc.
To rename this TEST example:
damona rename TEST --new-name prod
or delete it:
damona delete prod
See more examples hereafter or in the user guide on https://damona.readthedocs.io
Motivation¶
As stated on their website, Conda is an open source package management system and environment management system. Conda provides pre-compiled releases of software; they can be installed in different local environments that do not interfere with your system. This has great advantages for developers. For example, you can install a pre-compiled library in a minute instead of trying to compile it yourself, including all its dependencies. Different communities have emerged using this framework. One of them is Bioconda, which is dedicated to bioinformatics.
Another great tool that has emerged in recent years is Singularity. Singularity containers can be used to package entire scientific workflows, software and libraries, and even data. A container is a single file that can be shared between environments and guarantees execution and reproducibility.
Originally, Conda provided pre-compiled versions of software. Nowadays, it also provides a docker and a singularity image of the tool. On the other side, Singularity can include an entire conda environment. As you can see, everything is there to build reproducible tools and environments.
Now, what about software in development that depends on third-party packages? You would create a conda environment and start installing the required packages. Sooner or later, you will install another package that breaks your environment due to unresolved conflicts; this is not common but it happens. In the worst-case scenario, the environment is broken. In facilities where users depend on you, it can be quite stressful and time-consuming to maintain several such environments. This is why we have moved little by little to a very light conda environment where known-to-cause-problem packages have been shipped into singularity containers. This means we have to create aliases to those containers. The containers can be simple executable containers or full environment containers with many executables inside. In both cases, one needs to manage those containers for different users, pipelines, versions, etc. It became cumbersome to keep containers in different places and to update the scripts that generate the aliases to those executables.
That's where damona started: we wanted to combine the conda-like environment framework to manage our singularity containers more easily.
Although Damona was started with the Sequana project, Damona may be useful for other developers who wish to have a quick and easy solution for their users when they need to install third-party libraries.
Before showing real-case examples, let us install the software itself and understand the details.
Installation in details¶
This is the chicken-and-egg paradox. To get reproducible containers with singularity, at some point you need to install singularity itself. That is the first of the two pieces of software that you will need to install. Instructions are on the singularity web site. This is not obvious, to be honest. You need the Go language to be installed as well. I personally installed it from source and it worked like a charm.
Second, you need Damona. This is pure Python software with only a few dependencies. Install it with the pip software provided with your Python installation (Python 3.X):
pip install damona --upgrade
Type damona to create the Damona tree structure. Images and binaries will be saved in your home directory within the ~/.config/damona directory. There, special files should be available: damona.sh, damona.fish and damona.cfg. Check that those files are present.
Finally, you need to tell your system where to find damona. For bash users, please add this line to your ~/.bashrc file:
source ~/.config/damona/damona.sh
Open a new shell and type damona; you should be ready to go.
For fish shell users, please add this line in ~/.config/fish/config.fish:
source ~/.config/damona/damona.fish
Tutorial¶
The Damona standalone is called damona. It has built-in documentation that should suffice for most users.
The main documentation is obtained using:
damona --help
where you will see the list of Damona commands (it may change over time):
activate Activate a damona environment.
clean Remove orphan images and binaries from all environments.
create Create a new environment
deactivate Deactivate the current Damona environment.
delete Remove an environment
env List all environments with some stats.
export Create a bundle of a given environment.
info Print information about a given environment.
install Download and install an image and its binaries.
list List all packages that can be installed
remove Remove binaries or image from an environment.
rename Rename an existing environment
search Search for a container or binary.
stats Get information about Damona images and binaries
To get help for the install command, type:
damona install --help
1. list available environments¶
By default you have an environment called base. Unlike the base environment found in conda, it is not essential and may be altered. However, it cannot be removed or created. You can check the list of environments using:
damona env
2. create environments¶
All environments are stored in ~/.config/damona/envs/. You can create a new one as follows:
damona create TEST
There, you have a bin directory where binaries are going to be installed.
You can check that it has been created:
damona env
Note the last line telling you that:
Your current env is 'TEST'.
3. activate and deactivate environments¶
In order to install new binaries or software packages, you must activate an environment. You may activate several but the last one is the active one. Let us activate the TEST environment:
damona activate TEST
Check that it is active using:
damona env
and look at the last line. It should look like:
Your current env is 'TEST'.
What happens when you activate an environment called TEST? Simple: we append the directory ~/.config/damona/envs/TEST/bin to your PATH, where binaries are searched for. This directory is removed from your PATH when you use the deactivate command:
damona deactivate TEST
damona env
This should remove the TEST environment from your PATH. You may activate several environments and deactivate them. In that case, the environments follow a Last In, First Out principle:
damona activate base
damona activate TEST
damona deactivate
This removes the last activated environment. The following set of commands is more specific:
damona activate base
damona activate TEST
damona deactivate base
It keeps only the TEST environment in your PATH.
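The activate/deactivate behaviour described above can be modelled in a few lines of Python. This is only a sketch of the documented semantics (append on activate, Last In, First Out on deactivate), not Damona's actual shell code; the function names and the list representation of PATH are assumptions made for illustration.

```python
def env_bin(damona_path, name):
    """Directory that holds the binaries of one environment."""
    return f"{damona_path}/envs/{name}/bin"


def activate(path, damona_path, name):
    """Return a new PATH list with the environment's bin directory appended."""
    return path + [env_bin(damona_path, name)]


def deactivate(path, damona_path, name=None):
    """Remove one Damona entry from the PATH list.

    Without a name, the most recently activated environment is removed
    (Last In, First Out). With a name, only that environment is removed.
    """
    prefix = f"{damona_path}/envs/"
    candidates = [i for i, entry in enumerate(path) if entry.startswith(prefix)]
    if name is not None:
        target = env_bin(damona_path, name)
        candidates = [i for i in candidates if path[i] == target]
    if not candidates:
        return list(path)
    drop = candidates[-1]
    return path[:drop] + path[drop + 1:]
```

With base and then TEST activated, deactivate(path, root) drops TEST, whereas deactivate(path, root, "base") drops base and leaves TEST in place.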
4. install a software¶
Let us now consider that the TEST environment is active.
Damona provides software that may have several releases. Each software/release comes with binaries that will be installed together with the underlying singularity image:
damona install fastqc:0.11.9
Here, the singularity image corresponding to the release 0.11.9 of the fastqc software is downloaded. Then, binaries registered in this release are installed (here the fastqc binary only).
All images are stored in ~/.config/damona/images and are shared between environments.
5. Get info about installed images and binaries¶
You can get the binaries installed in an environment (and the images used by them) using the info command:
damona info TEST
6. Search the registry¶
You can search for a binary using:
damona search PATTERN
External registries can be set up. For instance, a damona registry is accessible as follows (for demonstration):
damona search fastqc --url damona
Here, damona is an alias defined in the ~/.config/damona/damona.cfg file, which is set to https://biomics.pasteur.fr/drylab/damona/registry.txt
You may retrieve images from a website where a registry exists (see the developer guide to create a registry yourself).
7. combine two different environments¶
In damona, you can have several environments in parallel and later activate the one you wish to use. Let us create a new one:
damona create test1
and check that you now have one more environment:
damona env
We want to create an alias to the previously downloaded image of the fastqc tool, but in the test1 environment. First we activate the newly created environment:
damona activate test1
then, we install the container:
damona install fastqc:0.11.9
This will not download the image again. It will just create a binary in the ~/.config/damona/envs/test1/bin directory.
You can combine this new environment with the base one:
damona activate base
If you are interested to know more, please see the User Guide and Developer guide here below.
Changelog¶
From version 0.10 onwards, we will not mention the new software and their versions but only changes made to the code itself.
Version | Description |
---|---|
0.12.0 | |
0.11.1 | |
0.11.0 | |
0.10.1 | |
0.10.0 | |
0.9.1 | |
0.9.0 | |
0.8.4 | |
0.8.3 | |
0.8.2 | |
0.8.1 | |
0.8.0 | |
0.7.1 | |
0.7.0 | |
0.6.0 | |
0.5.3 | |
0.5.2 | |
0.5.1 | |
0.5.0 | |
0.4.3 | |
0.4.2 | |
0.4.1 | |
0.4.0 | |
0.3.X | |
0.3.0 | |
0.2.3 | |
0.2.2 | |
0.2.1 | fixed manifest |
0.2.0 | first working version of damona to pull image locally with binaries |
0.1.1 | small update to fix RTD, travis, coveralls |
0.1 | first release to test feasibility of the project |
User guide and reference¶
User Guide¶
Getting help¶
The Damona standalone is called damona. It has built-in documentation that should suffice for most users.
The main documentation is obtained using:
damona --help
where you will see the list of Damona commands (it may change over time):
activate
clean
deactivate
env
export
info
install
list
remove
search
stats
To get help for the install command, type:
damona install --help
Environments¶
Damona provides a way to manage environments where Singularity images and binaries are installed. Environments are independent from each other. We decided to go for a very simple design where an environment is nothing else than a physical directory with a subdirectory called bin/ to store the binaries. All images are shared between environments to decrease the storage needs.
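Because an environment is just a directory with a bin/ sub-directory, listing environments amounts to scanning the envs directory. The sketch below illustrates that design with the standard library only; list_environments and demo are hypothetical helpers, not part of the Damona API, and the layout (envs/NAME/bin next to images/) is taken from this guide.

```python
import tempfile
from pathlib import Path


def list_environments(damona_path):
    """Return the sorted names of directories under envs/ that contain bin/."""
    envs_dir = Path(damona_path).expanduser() / "envs"
    if not envs_dir.is_dir():
        return []
    return sorted(p.name for p in envs_dir.iterdir() if (p / "bin").is_dir())


def demo():
    """Build a throw-away tree that mimics ~/.config/damona and list it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "images").mkdir()  # images are shared by all environments
        for name in ("base", "TEST"):
            (root / "envs" / name / "bin").mkdir(parents=True)
        return list_environments(root)
```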
list environments¶
If you type:
damona env
You will get the list of environments available on your system. In theory, if you start from scratch there is only one called base that cannot be deleted or created. You can use it as a sandbox though where software can be installed or removed.
Create environments¶
All environments are stored in ~/.config/damona/envs/. You can create a new one as follows:
damona create TEST
There, you have a bin directory where binaries are going to be installed.
You can check that it has been created:
damona env
Note the last line telling you that:
Your current env is 'TEST'.
Activate/Deactivate environments¶
In order to install new binaries or software packages, you must activate an environment. You may activate several but the last one is the active one. Let us activate the TEST environment:
damona activate TEST
Check that it is active using:
damona env
and look at the last line. It should look like:
Your current env is 'TEST'.
What happens when you activate an environment called TEST? Simple: we append the directory ~/.config/damona/envs/TEST/bin to your PATH, where binaries are searched for. This directory is removed from your PATH when you use the deactivate command:
damona deactivate TEST
damona env
This should remove the TEST environment from your PATH. You may activate several environments and deactivate them. In that case, the environments follow a Last In, First Out principle:
damona activate base
damona activate TEST
damona deactivate
This removes the last activated environment. The following set of commands is more specific:
damona activate base
damona activate TEST
damona deactivate base
It keeps only the TEST environment in your PATH.
Software and releases¶
Search for existing software¶
Damona itself contains the metadata needed to download containers and install software. As explained in the motivation, other projects provide thousands of containers, but here we provide containers for testing and proof of concept.
By default, Damona uses recipes, which can be found in the https://github.com/damona/damona/recipes directory. In the registry files (see later for details), we define the URL where images can be downloaded. Some are in the https://cloud.sylabs.io/library/cokelaer collection, which is limited to 10 GB and therefore cannot host many containers. Others are on external registries, and you can define your own registry for your projects.
To get a list of the available containers in Damona, type:
damona search "*" --images-only
You should see the container names and their version. You should also see where the file is going to be downloaded from.
You can search for a specific pattern using:
damona search fastqc
This is not a lot indeed. So, we provide a system where you can look for containers elsewhere on the internet. For now, there is only one registry available, at https://biomics.pasteur.fr/salsa/damona (again for demonstration). There, we posted some containers and a registry.txt file; if you type:
damona search "*" --url https://biomics.pasteur.fr/salsa/damona/registry.txt
you will get a list of the images that are available. Anybody can provide a container on any website with a registry.txt and you will be able to access the images.
The latter command can be simplified into:
damona search "*" --url damona
This is possible by defining an alias in the configuration file (in ~/.config/damona/damona.cfg, as explained in the developer guide).
Download and install an image¶
The first thing to do before installing software is to activate the environment where you wish to install it:
damona env
tells you which is currently active. Otherwise activate one:
damona activate TEST
See above for more details.
Given the container name and version, you can now download a container image as follows:
damona install fastqc:0.11.9
If there are several versions and you just want the latest, remove the tag:
damona install fastqc
That's it, you should get the image in the ~/.config/damona/images directory. In addition, a binary alias is created in the bin directory of the active environment (e.g., ~/.config/damona/envs/TEST/bin).
And the fastqc command should be available:
fastqc
Note
Using the activate command above, your PATH has been changed in your current shell. If you open a new shell, you will need to activate the environment again.
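When the tag is omitted, the latest release is installed. How Damona orders releases internally is not described here; the sketch below only shows why a numeric comparison is needed, assuming versions follow the x.y.z pattern used in this guide. latest_version is a hypothetical helper.

```python
def version_key(version):
    """Turn '0.11.9' into (0, 11, 9) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def latest_version(versions):
    """Return the highest version of a list of x.y.z strings."""
    return max(versions, key=version_key)
```

A plain string comparison would rank 0.9.10 above 0.11.9, which is why each component is converted to an integer first.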
To install an image/binary, you can also use an external registry (see developer guide to define your own registry):
damona install fastqc:0.11.9 --url https://biomics.pasteur.fr/drylab/damona/registry.txt
For this particular website, we have an alias:
damona install fastqc:0.11.9 --url damona
You can add aliases in ~/.config/damona/damona.cfg file.
Application: set several Environments¶
In damona, environments are stored in ~/.config/damona. There, you have two sub-directories:
envs
images
In the images directory, we store the singularity containers. In the envs directory, we store the environments. There, a sub-directory bin/ can be found. That is where we create the aliases that expose the container executables.
Now what about having different environments? It would be nice to handle several pipelines in their own environments.
We could quickly test two different versions of a tool and test their impact on an analysis:
damona create test1
damona create test2
Now, you need to activate the first one:
damona activate test1
and install a tool with a given version in this environment:
damona install fastqc:0.11.9
And to install it in the test2 environment:
damona deactivate
damona activate test2
damona install fastqc:0.11.8 --url damona
You can activate as many environments as you wish. Calling deactivate will only deactivate the last activated environment. It works as a Last In, First Out mechanism.
Environmental variables¶
DAMONA_SINGULARITY_OPTIONS¶
All binaries created with Damona use this syntax:
singularity -s exec ${DAMONA_SINGULARITY_OPTIONS} ${DAMONA_PATH}/images/<IMAGE> <EXE> ${1+"$@"}
where EXE is the name of the executable binary, IMAGE the name of the container. Then, you can see two environmental variables.
The DAMONA_SINGULARITY_OPTIONS variable can be used to provide any required options to singularity. If undefined, it is set to an empty string. Otherwise, you can define it as follows:
export DAMONA_SINGULARITY_OPTIONS="whatever_you_need"
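To make the template above concrete, here is a small Python sketch that expands it into the argument list a wrapper would end up running. It is an illustration of the documented syntax only: wrapper_command is a hypothetical helper, and shlex.split is used as an approximation of how the shell splits the unquoted options variable.

```python
import shlex


def wrapper_command(image, exe, args, damona_path, singularity_options=""):
    """Expand the documented wrapper template into an argv list.

    singularity -s exec ${DAMONA_SINGULARITY_OPTIONS} ${DAMONA_PATH}/images/<IMAGE> <EXE> "$@"
    """
    return (
        ["singularity", "-s", "exec"]
        + shlex.split(singularity_options)  # empty string gives no extra options
        + [f"{damona_path}/images/{image}", exe]
        + list(args)  # arguments typed by the user after the binary name
    )
```

For example, with the options set to " -e --env DISPLAY=:1", the three tokens -e, --env and DISPLAY=:1 are placed between exec and the image path.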
Note about display and the -e option.
It is usually good practice to set the -e option so that the container does not inherit the environment from which it is started. However, you may have issues with the X11 display. Indeed, -e means that no environment variable is passed to the container, so DISPLAY is unset. In such a case, you can use:
export DAMONA_SINGULARITY_OPTIONS=" -e --env DISPLAY=:1"
Example: Binding directories¶
This variable is especially useful should you need to bind a path that is not present in the standard configuration. For example, on a cluster where your system administrator set up a local scratch in /local/scratch, you can tell singularity to look there by binding this path into your container:
export DAMONA_SINGULARITY_OPTIONS="-B /local/scratch:/local/scratch"
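If you launch Damona binaries from a Python script, the same variable can be built programmatically. The following sketch assembles one -B flag per directory and prepares an environment for a child process; bind_options and child_environment are hypothetical helpers, and only the variable name and the -B src:dst syntax come from this guide.

```python
import os


def bind_options(paths, existing=""):
    """Append one '-B path:path' flag per directory to existing options."""
    flags = [f"-B {p}:{p}" for p in paths]
    return " ".join([existing.strip()] + flags).strip()


def child_environment(paths):
    """Copy the current environment and extend DAMONA_SINGULARITY_OPTIONS."""
    env = dict(os.environ)
    current = env.get("DAMONA_SINGULARITY_OPTIONS", "")
    env["DAMONA_SINGULARITY_OPTIONS"] = bind_options(paths, current)
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run when calling a binary installed by Damona.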
Developer guide¶
Introduction¶
Developers are lucky: they can do more than users. If you type:
damona --help
you will see the users' commands. However, there are more commands available. They are not shown because they are intended for developers only.
The first useful command for developers is the build command:
damona build --help
The second is the zenodo-upload command:
damona zenodo-upload --help
All images will be posted on Zenodo if Singularity recipe is in Damona¶
The goal is to have a unique and official DOI for each tool.
git clone git@github.com:your_fork/damona
cd damona
Let us consider an example called SOFTWARE. You must be in the directory of the SOFTWARE package:
cd recipes/SOFTWARE
Warning
The following requires a registered token on Zenodo and will upload images to Zenodo as well! Consider removing the --mode zenodo option to try the sandbox version.
Case 1: the tool does not exist.¶
Create a new Singularity image. Time to upload the resulting (functional !) image:
damona zenodo-upload SOFTWARE_1.0.0.img --mode zenodo
This command uploads the image to Zenodo with all the correct metadata pre-filled for you. It also creates a registry.yaml file with the metadata, ready to commit and push. Edit the registry file to add a binaries section if needed.
Case 2: the recipe exists already¶
Create a new Singularity image. Time to upload the resulting (functional !) image:
damona zenodo-upload SOFTWARE_2.0.0.img --mode zenodo
It updates the existing registry.yaml ready to commit and push
tree structure¶
Recipes are in the ./recipes directory with one sub-directory per tool or environment. Inside a sub-directory (e.g., R, conda) you may have several recipes for different versions.
For example, for Damona there is a directory called Damona. Inside that directory, if there is only one recipe, name it:
Singularity.damona
If you wish to have several recipes for different versions, name them:
Singularity.damona_x.y.z
Naming convention¶
A valid Singularity recipe must be named following one of these patterns:
Singularity.NAME_x.y.z
Singularity.NAME_SUFFIX_x.y.z
Underscore can be part of the name.
Image names for users will appear as:
NAME:x.y.z
NAME_SUFFIX:x.y.z
Note that NAME can be in lower or upper case, but the final image name will be all lower case (singularity-hub feature). Consequently, when downloading an image, it should be named pkgname:x.y.z
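The mapping from a recipe file name to the user-facing image name can be sketched with a regular expression. This is an illustration of the convention stated above, not the parser used by Damona; recipe_to_image_name is a hypothetical helper and it assumes a strict x.y.z version.

```python
import re

# Singularity.NAME_x.y.z or Singularity.NAME_SUFFIX_x.y.z
# (underscores may be part of the name, the version comes last).
_RECIPE = re.compile(r"^Singularity\.(?P<name>.+)_(?P<version>\d+\.\d+\.\d+)$")


def recipe_to_image_name(filename):
    """Convert 'Singularity.NAME_x.y.z' into 'name:x.y.z' (lower case)."""
    match = _RECIPE.match(filename)
    if match is None:
        raise ValueError(f"not a versioned recipe name: {filename!r}")
    return f"{match.group('name').lower()}:{match.group('version')}"
```

An unversioned recipe such as Singularity.damona does not match the pattern and raises ValueError in this sketch.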
building¶
To test the recipe, type:
damona build pkgname:x.y.z
This is just an alias to singularity build command:
sudo singularity build pkgname.img Singularity.pkgname_x.y.z
Singularity recipes¶
Here are some instructions to help with writing recipes.
Try to set version instead of latest:
BootStrap: docker
From: mambaorg/micromamba:1.4.4
is better than
BootStrap: docker
From: mambaorg/micromamba:latest
From experience, here are some conventions that could be useful. These commands help avoid warnings when running the container:
%environment
LANG=C.UTF-8
LC_ALL=C.UTF-8
export LANG LC_ALL
No need for labels but if you want, you may add a labels section:
%labels
whatever
No need for help section.
It is also useful to add a test section within the container; note that it is only run when building the recipe:
%test
command --help
registry¶
For each Singularity recipe, a registry is required. It is a YAML file that looks like:
fastqc:
    0.11.9:
        download: URL1
        md5sum:
        binaries: fastqc
    0.11.8:
        download: URL
        md5sum:
        binaries: fastqc
or, with the binaries declared once for all releases:
fastqc:
    binaries: fastqc
    0.11.9:
        download: URL1
        md5sum:
    0.11.8:
        download: URL
        md5sum:
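The two registry layouts shown above carry the same information. The sketch below reads either form once it has been loaded into a Python dictionary (for instance with a YAML parser) and falls back to the top-level binaries entry when a release does not define its own. release_info and the two sample dictionaries are assumptions for illustration, not Damona's registry reader.

```python
def release_info(registry, name, version):
    """Return download link, md5sum and binaries for one release."""
    entry = registry[name]
    release = entry[version]
    # Per-release binaries win; otherwise use the shared top-level entry.
    binaries = release.get("binaries") or entry.get("binaries") or ""
    return {
        "download": release["download"],
        "md5sum": release.get("md5sum"),
        "binaries": binaries.split(),
    }


# First layout: binaries repeated in every release.
PER_RELEASE = {
    "fastqc": {
        "0.11.9": {"download": "URL1", "md5sum": None, "binaries": "fastqc"},
        "0.11.8": {"download": "URL", "md5sum": None, "binaries": "fastqc"},
    }
}

# Second layout: binaries declared once for all releases.
SHARED = {
    "fastqc": {
        "binaries": "fastqc",
        "0.11.9": {"download": "URL1", "md5sum": None},
        "0.11.8": {"download": "URL", "md5sum": None},
    }
}
```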
The download link can be of three types:
- a valid URL
- an image on the damona website. For instance, for the ucsc recipe, we stored the image at the damona URL, which is written:
download: damona::ucsc_0.1.0.img
Damona will look for it at the damona URL. This is an alias to https://biomics.pasteur.fr/salsa/damona/ucsc_0.1.0.img
- an image stored on sylabs.io:
library://cokelaer/damona/conda:4.7.12
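The three kinds of download links can be told apart by their prefix. The following sketch classifies a link and expands the damona:: alias using the base URL quoted above; resolve_download is a hypothetical helper, and the docker:// case is included only because such a link appears in the hisat2 registry of this guide.

```python
# Base URL behind the damona:: alias, as stated in this guide.
DAMONA_BASE_URL = "https://biomics.pasteur.fr/salsa/damona/"


def resolve_download(link):
    """Return a (kind, target) pair for a registry download link."""
    if link.startswith("damona::"):
        return ("url", DAMONA_BASE_URL + link[len("damona::"):])
    if link.startswith("library://"):
        return ("library", link)
    if link.startswith("docker://"):
        return ("docker", link)
    if link.startswith(("http://", "https://")):
        return ("url", link)
    raise ValueError(f"unsupported download link: {link!r}")
```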
Where are the containers stored?¶
Since Dec 2021, we store containers with a DOI on the Zenodo website. Originally, we stored some containers here: https://cloud.sylabs.io/library/cokelaer/damona but we extended Damona so that it can fetch containers from other places. If you have your own containers, it is quite simple to create a registry, place it anywhere on the web and tell damona that you want to use that registry.
We have an example on https://biomics.pasteur.fr/salsa/damona
Build an image locally¶
Sometimes, the version you are looking for is not available. It is quite easy to rebuild the recipe yourself and store the image locally:
damona build Singularity.recipes
Again, this is just a wrapper around the singularity build command. The advantage here is that we can use this command to build a damona recipe:
damona build fastqc:0.11.9
You can then save the image elsewhere if you want:
damona build fastqc:0.11.9 --output-name ~/temp.img
This is nothing more than an alias to singularity itself:
singularity build recipes Singularity.recipes
More interesting is the ability to build a local version of a recipe found in damona:
damona build salmon:1.3.0
This will find the recipe automatically and save the final container in salmon_1.3.0.img.
Upload image on sylabs (DEPRECATED)¶
singularity build salmon.img Singularity.salmon_1.3.0
singularity sign salmon.img
singularity push salmon.img library://cokelaer/damona/salmon:1.3.0
What about reusing a docker image?¶
You can. See for example the hisat2 image here: https://github.com/cokelaer/damona/tree/master/damona/recipes/hisat2
It looks like:
hisat2:
    releases:
        2.1.0:
            download: docker://biocontainers/hisat2:v2.1.0-2-deb_cv1
            binaries: hisat2 hisat2-build
            md5sum: e680e5ab181e73a8b367693a7bd71098
Here, there is no zenodo link though because it is already on docker.
References¶
admin module¶
builders module¶
common module¶
Image and Binary handlers. Provide also a Damona manager
- class BinaryReader(filename)¶
Manage a single binary
>>> from damona.common import BinaryReader
>>> br = BinaryReader("~/.config/damona/envs/base/bin/fastqc")
>>> br.get_image()
'fastqc:0.11.9'
>>> br.is_image_available()
True
constructor
- Parameters:
filename (str) -- the input name of the binary file
Can be used to check that the binary is not an orphan and that its image is still available.
- class Damona¶
Global manager to get information about environments, binaries, images.
- property config_path¶
Get the Damona config file location
- damona_path¶
This attribute stores the path where images and environments are stored
- property environments_path¶
Get the Damona environments directory location
- find_orphan_binaries()¶
Find binaries in all environments that are orphans
By orphans, we mean that their image is not present anymore for some reason (e.g., users deleted it manually).
- property images_directory¶
Get the Damona images directory location
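The notion of an orphan binary can be illustrated without Damona itself. The sketch below assumes that each wrapper in envs/NAME/bin is a small text script that references its image as .../images/IMAGE, as in the wrapper template given in the user guide; find_orphans and demo are hypothetical helpers and do not reproduce the find_orphan_binaries implementation.

```python
import re
import tempfile
from pathlib import Path

# Assumption: the wrapper text contains ".../images/<IMAGE>".
_IMAGE_REF = re.compile(r"/images/(\S+)")


def find_orphans(damona_path):
    """List wrappers whose image file is missing from images/."""
    root = Path(damona_path)
    orphans = []
    for wrapper in sorted(root.glob("envs/*/bin/*")):
        match = _IMAGE_REF.search(wrapper.read_text())
        if match and not (root / "images" / match.group(1)).is_file():
            orphans.append(wrapper.relative_to(root).as_posix())
    return orphans


def demo():
    """Create a fake tree, check before and after the image exists."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "images").mkdir()
        bindir = root / "envs" / "base" / "bin"
        bindir.mkdir(parents=True)
        (bindir / "fastqc").write_text(
            "singularity -s exec ${DAMONA_PATH}/images/fastqc_0.11.9.img fastqc\n"
        )
        before = find_orphans(root)
        (root / "images" / "fastqc_0.11.9.img").touch()
        after = find_orphans(root)
        return before, after
```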
- class DamonaInit¶
Class to create images/bin directory for DAMONA
This is called each time damona is started to make sure the required config files are present.
This class simply creates the ~/.config/damona/envs and images directories. It also checks whether DAMONA_PATH and DAMONA_SINGULARITY_OPTIONS variables are defined in the environment.
- class ImageReader(name)¶
Manage a single Singularity image
Constructor
- Parameters:
name -- the input name of the image (fullpath)
>>> from damona.common import ImageReader
>>> ir = ImageReader("~/.config/damona/images/fastqc_0.11.9.img")
>>> ir.md5
>>> ir.is_orphan()
>>> ir.name
>>> print(ir.shortname)
'fastqc_0.11.9.img'
>>> print(ir.version)
'0.11.9'
- property guessed_executable¶
Guess the executable from the filename
- property md5¶
compute and return the md5 of the file
- property shortname¶
Get the filename (NAME_X.Y.Z.img)
- property version¶
Get the version
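The md5 value is what allows an image to be compared with the md5sum field of a registry. A minimal sketch of such a check with the standard library is given below; file_md5, matches_registry and demo are hypothetical helpers and are not the ImageReader implementation.

```python
import hashlib
import tempfile


def file_md5(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_registry(path, expected_md5):
    """Compare a local file with the md5sum recorded in a registry."""
    return file_md5(path) == expected_md5.strip().lower()


def demo():
    """Check a temporary file against the digest of its known content."""
    content = b"not a real image"
    with tempfile.NamedTemporaryFile(suffix=".img") as handle:
        handle.write(content)
        handle.flush()
        return matches_registry(handle.name, hashlib.md5(content).hexdigest())
```

Reading by chunks keeps memory usage low, which matters because container images can be several gigabytes.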
config module¶
The Damona configuration
- class Config(name='damona')¶
A place holder to store our configuration file and shell scripts
This class is called each time damona is started. The config file, if not present, is created; otherwise nothing happens. The same applies to the bash and fish shell configuration files.
The damona configuration file looks like:
[general]
quiet=False
[urls]
damona=https://..../registry.txt
url1=https://..../registry.txt
[zenodo]
token=APmm6p....
orcid=0000-0001
name='Cokelaer, Thomas'
affiliation='Institut Pasteur'
[sandbox.zenodo]
token=FFmbAEhQbb...
orcid=0000-0001
name='Cokelaer, Thomas'
affiliation='Institut Pasteur'
The urls section can be used to store aliases to external registries. When installing software using:
damona install example --url damona
if the alias damona is in the [urls] section, it is replaced by its real value (https://...). The URL must end with the expected registry name, registry.txt.
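The alias lookup described above can be sketched with the standard configparser module. This is an illustration only, not the Config class: resolve_registry is a hypothetical helper that takes the configuration as text, and the section and option names come from the example file above.

```python
import configparser


def resolve_registry(url_or_alias, config_text):
    """Replace an alias from the [urls] section by its real URL."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    if parser.has_section("urls") and parser.has_option("urls", url_or_alias):
        url_or_alias = parser.get("urls", url_or_alias)
    # The registry URL is expected to end with the registry file name.
    if not url_or_alias.endswith("registry.txt"):
        raise ValueError(f"not a registry URL: {url_or_alias!r}")
    return url_or_alias
```

Interpolation is disabled so that a percent sign in a URL is read literally.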
The zenodo section is not saved by default since it is for developers only.
environ module¶
install module¶
registry module¶
zenodo module¶
DAMONA standalone (script module)¶
FAQs¶
A Fatal error: cannot open file occurred but the file is visible¶
It could be that your file is in an NFS directory, which is not visible in the container.
You can fix this by setting the DAMONA_SINGULARITY_OPTIONS variable. This variable can be set to pass any singularity options to all binaries installed by Damona.
For instance, if you have an NFS-mounted directory in /mnt/my_space, you can bind it in singularity, and therefore in Damona, using (in a bash shell):
export DAMONA_SINGULARITY_OPTIONS=" -B /mnt/my_space:/mnt/my_space"
or within fishshell:
set -x DAMONA_SINGULARITY_OPTIONS "-B /mnt/my_space:/mnt/my_space"
Why Damona and not conda/mamba ?¶
Damona is not meant to replace conda/bioconda, which have a great community and a large number of packages available. It is a complementary tool that is meant to be super-easy and to provide reproducible environments as a set of singularity images.
Conda is great but in practice we faced some difficulties. First, as a developer using tens or hundreds of binaries for real-life applications, it was not uncommon to break conda environments when installing new software. Although you can now come back to previous versions, doing so was time-consuming for the team. The second common problem was sharing identical environments, with identical software, between the different developers to ensure reproducible environments and analyses for our customers/users.
That is where Damona started. We could share images where entire environments with a set of binaries would be available to all in the exact same way.
As a developer you can then use conda to try a new or complementary software while keeping your core software identical between environments.