Installation & Execution
Melange is designed as a Snakemake workflow that allows all steps to be executed in parallel on a cluster.
Step 0: Melange dependencies
To run Melange, you need conda (or its minimal distribution, Miniconda), Snakemake, and Git installed.
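Once the tools are installed, a quick check along the following lines can confirm that they are reachable from your shell. This is a minimal sketch: `check_tools` is a hypothetical helper name, not part of Melange, and it only tests whether each command is on your PATH. Note that snakemake is found only while the conda environment that contains it is active.

```shell
# Hypothetical helper: report which of the given commands are on PATH.
# Returns 0 if all are present, 1 if at least one is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# The three prerequisites named in this step.
check_tools conda snakemake git || echo "Install the missing tools before continuing."
```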
Install conda
To install conda, follow the instructions in the conda documentation. Most users will probably want to install Miniconda.
If you have not already done so, you will need to configure conda with the bioconda and conda-forge channels:
conda config --add channels defaults
conda config --add channels bioconda
conda config --add channels conda-forge
Install mamba (optional)
Conda can be slow at resolving dependencies because the channels contain so many packages. A good way around this is to use Mamba (another snake), a faster drop-in replacement for conda:
conda install mamba
From now on you can replace conda install with mamba install (check how much faster this snake is!).
Install snakemake
After installing conda (and optionally mamba), install Snakemake:
mamba create -c conda-forge -c bioconda -n snakemake snakemake
conda activate snakemake
Install git
To run Melange, you need to have git installed to clone the Melange repository.
Instructions for installing git can be found at: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git
Step 1: Clone Melange workflow
To use Melange, you need a local copy of the Melange workflow repository. Start by creating a clone of the repository:
git clone https://github.com/sandragodinhosilva/melange.git
You should now have a folder called melange that contains everything you need to run the workflow. Change into it:
cd melange
Optional: Test the correct installation with sample data
To check that Melange is installed correctly, you can run it on the example data, which is downloaded automatically when you clone the Melange repository. Make sure the config.yaml file contains the following setting:
# --- Input
inputdir: "example_data"
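Before the dry-run, you may want to confirm which input directory is actually configured. The sketch below is one way to do that. `get_inputdir` is a hypothetical helper, not something Melange ships, and it assumes the simple one-line `inputdir: "..."` form shown above, with no trailing comment on that line.

```shell
# Hypothetical helper: print the inputdir value from a YAML config file.
get_inputdir() {
    # Take the first line starting with "inputdir:", drop the key,
    # then strip double quotes and trailing whitespace.
    sed -n 's/^inputdir:[[:space:]]*//p' "$1" | head -n 1 | tr -d '"' | sed 's/[[:space:]]*$//'
}

if [ -f config.yaml ]; then
    echo "inputdir is set to: $(get_inputdir config.yaml)"
else
    echo "config.yaml not found; run this from inside the melange folder"
fi
```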
Test your configuration by doing a dry-run via:
snakemake --use-conda -n
Step 2: Configure workflow
Configure the workflow according to your needs by editing the file config.yaml.
You can edit config.yaml with a text editor of your choice, for example nano:
nano config.yaml
## Useful commands:
Ctrl+O Offer to write file ("Save as")
Ctrl+X Close buffer, exit from nano
For more information on customising this configuration file, see the section Melange Configuration.
Step 3: Execute workflow
Execute the workflow locally via:
snakemake --use-conda --cores N
Replace N with the number of CPU cores that Snakemake may use on your machine.
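If you are unsure what value to use for N, the sketch below reads the number of online cores from the system and prints the resulting command for review instead of running it. The variable name `CORES` is only illustrative, and `getconf _NPROCESSORS_ONLN` is assumed to be available, as it is on typical Linux and macOS systems.

```shell
# Number of online CPU cores; fall back to 1 if getconf cannot report it.
CORES="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Print the command instead of running it, so it can be reviewed first.
echo "snakemake --use-conda --cores ${CORES}"
```

On a shared machine you may prefer a smaller value, so that other users keep some capacity.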
Optional steps
Examine workflow:
Snakemake offers several features that are available when running Melange. One of them is the ability to automatically create a directed acyclic graph (DAG) of jobs, which visualises the entire workflow.
By executing a single command:
snakemake --dag | dot -Tsvg > dag.svg
A DAG (saved as an .svg image) is created. It contains a node for each job, with the edges connecting them representing the dependencies. The frames of jobs that do not need to be executed (because their output is up to date) are dashed.
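The dot command used above is part of Graphviz, so the pipeline fails if Graphviz is not installed. The following sketch wraps the same command in a small guard that checks for both programs first. `render_dag` is a hypothetical helper name, and the default output file name is taken from the command above.

```shell
# Hypothetical helper: write the workflow DAG to the given SVG file,
# but only if both snakemake and Graphviz's dot are on PATH.
render_dag() {
    out="${1:-dag.svg}"
    for tool in snakemake dot; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "cannot render DAG: $tool not found" >&2
            return 1
        fi
    done
    snakemake --dag | dot -Tsvg > "$out"
}

render_dag dag.svg || echo "DAG was not rendered"
```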
Investigate results:
After successful execution, you can create a self-contained interactive HTML report with all results via:
snakemake --report report.html
Extra: Run Melange on a high performance cluster
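Snakemake can submit jobs to a cluster scheduler. To do so, it needs a submit command that accepts a shell script as its first positional argument. The example below uses qsub and lets Snakemake keep at most 4 jobs submitted at a time. Note that the --cluster option belongs to Snakemake versions before 8; later releases use executor plugins instead, so check snakemake --help for your version.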
snakemake --cluster qsub --use-conda --jobs 4