📦 Installation & Update
Ashleys-QC integration
From version 2.4.0 onwards, ashleys-qc preprocessing is directly integrated into MosaiCatcher. You only need to clone mosaicatcher-pipeline. The separate ashleys-qc-pipeline repository is now legacy.
System requirements
This workflow is meant to be run on a Unix-based operating system when using the local execution profiles (tested on Ubuntu 18.04 and CentOS 7).
Minimum system requirements vary with the use case. We highly recommend running it in a server environment with at least 32 GB of RAM and 12 cores.
Aside from local execution, the pipeline can be run on an HPC cluster. It has been tested on SLURM-based clusters.
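As a quick pre-flight check, the sketch below compares the machine's core count and memory with the recommendation above. It assumes a Linux host that provides `nproc` and `/proc/meminfo`; the function name `check_resources` and its default thresholds are illustrative and not part of the pipeline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compare available cores and RAM with given minimums.
# Defaults follow the recommendation above (12 cores, 32 GB of RAM).
check_resources() {
    local min_cores="${1:-12}" min_mem_gb="${2:-32}"
    local cores mem_gb
    cores=$(nproc)
    # MemTotal is reported in kB; convert to whole GiB, rounded to nearest.
    # The kernel reserves some memory, so a 32 GB machine may report 31.
    mem_gb=$(awk '/^MemTotal:/ {printf "%d", $2 / 1048576 + 0.5}' /proc/meminfo)
    echo "cores=${cores} mem_gb=${mem_gb}"
    if [ "$cores" -lt "$min_cores" ] || [ "$mem_gb" -lt "$min_mem_gb" ]; then
        echo "Below the recommended ${min_cores} cores / ${min_mem_gb} GB RAM" >&2
        return 1
    fi
}

# Advisory only: report the result without aborting the shell session.
check_resources || true
```

The result is advisory: smaller machines may still cope with small datasets, since the minimum requirements depend on the use case.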
Installation
0. [Optional] Install Apptainer
You can optionally run the pipeline with the Apptainer containerization tool, which provides a controlled environment with all dependencies installed and ready to use.
MosaiCatcher provides pre-built container images on GitHub Container Registry (GHCR). Each container is assembly-specific because it embeds the corresponding BSgenome R package required for haplotype analysis.
Available container tags:
| Reference Genome | Container URI |
|---|---|
| hg38 (GRCh38) | ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:hg38-v2.5.0 |
| hg19 (GRCh37) | ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:hg19-v2.5.0 |
| T2T (CHM13) | ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:T2T-v2.5.0 |
| mm10 | ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:mm10-v2.5.0 |
| mm39 | ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:mm39-v2.5.0 |
Why assembly-specific containers?
Each container embeds the BSgenome R package for its reference genome. Splitting by assembly keeps image sizes manageable while ensuring StrandPhaseR has the correct genomic data available at runtime.
Use --sdm conda apptainer in your Snakemake command to activate container execution. See Usage for examples.
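To avoid typos when picking an image, the mapping in the table above can be wrapped in a small shell function. This is only a sketch: `container_uri` is a hypothetical helper, and it hard-codes the v2.5.0 tags listed above rather than querying the registry.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a reference assembly to its container URI.
# The tags are copied from the table above (v2.5.0 images only).
container_uri() {
    local assembly="$1"
    local repo="ghcr.io/friendsofstrandseq/mosaicatcher-pipeline"
    case "$assembly" in
        hg38|hg19|T2T|mm10|mm39)
            echo "${repo}:${assembly}-v2.5.0" ;;
        *)
            echo "Unknown assembly: ${assembly}" >&2
            return 1 ;;
    esac
}

container_uri hg38
# prints ghcr.io/friendsofstrandseq/mosaicatcher-pipeline:hg38-v2.5.0
```

Rejecting unknown assemblies up front makes a mismatch between the reference genome and the embedded BSgenome package visible before a run starts.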
1. Install Snakemake through Conda or Pixi
Snakemake version compatibility
- mosaicatcher-pipeline up to v2.3.5: compatible with Snakemake v7
- mosaicatcher-pipeline v2.4.0 and later: compatible with Snakemake v9+
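The compatibility rule above can be expressed as a small version check. The sketch below assumes GNU `sort -V` is available and treats every release older than v2.4.0 as a Snakemake v7 release; `required_snakemake_major` is a hypothetical name, not a pipeline command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the Snakemake major version expected by a
# given mosaicatcher-pipeline release (7 before v2.4.0, 9 from v2.4.0 on).
required_snakemake_major() {
    local version="${1#v}"   # accept "v2.4.0" or "2.4.0"
    local oldest
    # sort -V orders version strings; if 2.4.0 sorts first,
    # the given version is 2.4.0 or newer.
    oldest=$(printf '%s\n' "2.4.0" "$version" | sort -V | head -n 1)
    if [ "$oldest" = "2.4.0" ]; then
        echo 9
    else
        echo 7
    fi
}

required_snakemake_major v2.3.5   # prints 7
required_snakemake_major v2.5.0   # prints 9
```

The result can then be compared with the major version of the installed Snakemake, for example the first field of `snakemake --version`.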
Option A: Conda
conda create -n snakemake-env \
-c conda-forge -c bioconda snakemake==9
conda activate snakemake-env
Or with pip in a virtual environment:
python3 -m venv snakemake-env
source snakemake-env/bin/activate
pip install snakemake==9
Option B: Pixi (recommended)
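Pixi manages dependencies and environments automatically, which avoids version conflicts. Run the command from inside the cloned repository (see step 2), where Pixi can find the project's pixi.toml: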
pixi run snakemake --version
Snakemake v9 plugins
Snakemake v9 uses a plugin system for storage backends and cluster executors. Install the required and recommended plugins after installing Snakemake:
# Required — used to download reference files over HTTP
pip install snakemake-storage-plugin-http
# Recommended for HPC — SLURM executor
pip install snakemake-executor-plugin-slurm
With Pixi, these are managed automatically via pixi.toml.
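For a pip- or Conda-based install, a quick way to confirm that both plugins are visible in the active environment is to query pip for them. The helper name `plugin_installed` is hypothetical; the sketch assumes `python3` in the active environment is the interpreter that Snakemake uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a Python package is visible to pip
# in the currently active environment.
plugin_installed() {
    python3 -m pip show "$1" >/dev/null 2>&1
}

# Report the status of the two plugins named above.
for plugin in snakemake-storage-plugin-http snakemake-executor-plugin-slurm; do
    if plugin_installed "$plugin"; then
        echo "${plugin}: installed"
    else
        echo "${plugin}: missing"
    fi
done
```

A "missing" line for the HTTP storage plugin is worth fixing before the first run, since it is required to download reference files.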
EMBL users
See the EMBL HPC Guide for the recommended Snakemake version and SLURM+Apptainer setup.
2. Clone the repository and its submodules
# Clone the repository and its submodules
git clone --recurse-submodules https://github.com/friendsofstrandseq/mosaicatcher-pipeline.git && cd mosaicatcher-pipeline
# In each submodule, initialize and pull
git submodule update --init --remote --force --recursive
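After cloning, it can be useful to confirm that no submodule was left uninitialized. The sketch below relies on `git submodule status`, which prefixes uninitialized entries with `-`; the function name `uninitialized_submodules` is illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `git submodule status` output on stdin and
# print the paths of submodules that are not initialized.
# git marks such entries with a leading "-" before the commit hash.
uninitialized_submodules() {
    awk '/^-/ {print $2}'
}

# Only query git when run inside a working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    missing=$(git submodule status --recursive | uninitialized_submodules)
    if [ -n "$missing" ]; then
        echo "Uninitialized submodules:"
        echo "$missing"
    else
        echo "All submodules are initialized."
    fi
fi
```

If any paths are listed, rerunning the `git submodule update` command above should initialize them.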
Pipeline update procedure
If you are already using an earlier version of mosaicatcher-pipeline, follow these steps to update:
- Fetch all tags and remote refs:
git fetch --all --tags
- Check out the target version tag (this puts you in a detached HEAD state, which is expected when checking out a release tag):
git checkout v<VERSION>
- Update git submodules:
git submodule update --init --remote --force --recursive
- Check Snakemake compatibility for the version you are updating to (see "Snakemake version compatibility" above).
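The git steps above can be chained in a small wrapper so that a failure in one step stops the rest. This is a sketch under assumptions: `run` and `update_pipeline` are hypothetical names, `DRY_RUN` is an illustrative variable, and `v2.5.0` is only an example tag to be replaced with the release you are targeting.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the update steps listed above.
# Set DRY_RUN=1 to print the git commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

update_pipeline() {
    local tag="$1"
    # Expect a release tag of the form v<VERSION>.
    case "$tag" in
        v[0-9]*) ;;
        *) echo "Usage: update_pipeline v<VERSION>" >&2; return 1 ;;
    esac
    # Each step runs only if the previous one succeeded.
    run git fetch --all --tags &&
    run git checkout "$tag" &&
    run git submodule update --init --remote --force --recursive
}

# Preview the commands for an example tag without touching the repository.
DRY_RUN=1 update_pipeline v2.5.0
```

Running the preview first shows exactly which commands would be executed; dropping `DRY_RUN=1` runs them from inside the repository.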