Development Roadmap

  • Zenodo automatic download of external files + indexes (1.2.1)
  • Multiple samples in the parent folder (1.2.2)
  • Automatic testing of BAM SM tag compared to sample folder name (1.2.3)
  • On-error/success e-mail (1.3)
  • HPC execution (slurm profile for the moment) (1.3)
  • Full singularity image with preinstalled conda envs (1.5.1)
  • Single BAM folder with side config file (1.6.1)
  • (EMBL) GeneCore mode of execution: allow selection and execution directly by specifying the GeneCore run folder (for instance 2022-11-02-H372MAFX5) (1.8.2)
  • Version synchronisation between ashleys-qc-pipeline and mosaicatcher-pipeline (1.8.3)
  • Report captions update (1.8.5)
  • Clustering plot (heatmap) & SV calls plot update (1.8.6)
  • ashleys_pipeline_only parameter: when running mosaicatcher-pipeline, trigger ashleys-qc-pipeline only and stop after generating the counts, ashleys predictions & plots, so that the user can manually review and select the cells to be processed (2.2.0)
  • Alternative execution endpoints: breakpointr_only parameter to stop the execution after breakpointR; whatshap_only parameter to stop the execution after whatshap (2.3.3)
  • Snakemake v9 + Pixi migration: unified package management and reproducible environments (2.4.0)
  • Assembly-specific containers on GHCR: one image per reference genome (hg38, hg19, T2T, mm10, mm39) embedding the matching BSgenome R package (2.4.0)
  • Centralized version management with automated bumping (pixi run bump-patch|bump-minor|bump-major|bump-beta) and changelog generation (2.4.0)
  • HPC storage optimization: configurable reference_base_dir for multi-user reference genome sharing (2.4.0)
  • New EMBL HPC Apptainer profile: workflow/snakemake_profiles/mosaicatcher-pipeline/v9/HPC/slurm_EMBL_apptainer/ (2.4.0)
  • Plotting options (enable/disable segmentation back colors)
  • Self-handling of low-coverage cells (1.6.1)
  • Upstream ashleys-qc-pipeline and FASTQ handle (1.6.1)
  • Change of reference genome (currently only GRCh38) (1.7.0)
  • Ploidy detection at the segment and the chromosome level: used to bypass StrandPhaseR if more than half of a chromosome is haploid (1.7.0)
  • input_bam_legacy mode (bam/selected folders) (1.8.4)
  • Blacklist regions files for T2T & hg19 (1.8.5)
  • ArbiGent integration: Strand-Seq based genotyper to study SVs containing at least 500 bp of uniquely mappable sequence (1.9.0)
  • scNOVA integration: Strand-Seq Single-Cell Nucleosome Occupancy and genetic Variation Analysis (1.9.2)
  • multistep_normalisation and multistep_normalisation_for_SV_calling parameters to replace GC analysis module (library size normalisation, GC correction, Variance Stabilising Transformation) (2.1.1)
  • Strand-Seq processing based on mm10 assembly (2.1.2)
  • UCSC ready to use file generation including counts & SV calls (2.1.2)
  • blacklist_regions parameter (2.2.0)
  • IGV ready to use XML session generation (2.2.2)
  • BreakpointR integration through breakpointr parameter (2.3.3)
  • ashleys-qc-pipeline fully integrated (no longer a git submodule): preprocessing lives directly in mosaicatcher-pipeline (2.4.0)
  • mm39 full support: normalization files, blacklist regions, and BSgenome package for mouse GRCm39 assembly (2.4.0)
  • CanFam (canfam3/canfam4) framework-ready: reference infrastructure in place, normalization files pending (2.4.0)
  • Pre-built iGenomes index download (download_prebuilt_indexes parameter): skip local BWA index building by downloading from AWS iGenomes (2.4.0)
  • Ploidy estimation module (ploidy parameter): optional detection of haploid chromosomes/segments to guide StrandPhaseR (2.4.0)
  • keep_ashleys_predictions parameter: control retention of ashleys ML prediction files (2.4.0)
  • Pooled samples
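
The ploidy item above states a simple rule: StrandPhaseR is bypassed when more than half of a chromosome is haploid. A minimal sketch of that decision is given below. The `Segment` type and the function names are hypothetical, chosen only for illustration; the roadmap does not describe the pipeline's actual data structures or how ploidy is estimated per segment.

```python
from typing import NamedTuple, Sequence


class Segment(NamedTuple):
    """Hypothetical representation of one ploidy-estimated segment."""
    start: int   # 0-based start coordinate
    end: int     # exclusive end coordinate
    ploidy: int  # estimated copy number for this segment


def haploid_fraction(segments: Sequence[Segment]) -> float:
    """Fraction of the covered bases that lie in segments with ploidy 1."""
    total = sum(s.end - s.start for s in segments)
    if total == 0:
        return 0.0
    haploid = sum(s.end - s.start for s in segments if s.ploidy == 1)
    return haploid / total


def bypass_strandphaser(segments: Sequence[Segment]) -> bool:
    """Apply the roadmap rule: bypass when MORE than half is haploid."""
    return haploid_fraction(segments) > 0.5


# Illustrative values only: 60 Mb haploid followed by 40 Mb diploid.
mostly_haploid = [Segment(0, 60_000_000, 1), Segment(60_000_000, 100_000_000, 2)]
# Exactly half haploid: the rule says "more than half", so no bypass.
half_haploid = [Segment(0, 50_000_000, 1), Segment(50_000_000, 100_000_000, 2)]

print(bypass_strandphaser(mostly_haploid))
print(bypass_strandphaser(half_haploid))
```

The comparison is strict so that a chromosome that is exactly half haploid is still phased, which is one reading of "more than half".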

Small issues to fix

  • Replace input_bam_location with data_location (harmonization with ashleys-qc-pipeline)
  • List of commands available through list_commands parameter (1.8.6)
  • Move pysam / SM tag comparison script to snakemake rule (2.2.0)
  • Properly reference the reference genome in IGV session script generation (2.3.5)
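
The SM tag check mentioned in both lists compares the sample name recorded in a BAM header with the name of the sample folder. The sketch below illustrates the idea on plain SAM header text, using only the standard library. The pipeline itself is described as using pysam, so this is an assumed simplification and not the actual rule; the function names and the example header are hypothetical.

```python
def read_group_samples(sam_header: str) -> set[str]:
    """Collect every SM value found on the @RG lines of a SAM header."""
    samples = set()
    for line in sam_header.splitlines():
        if not line.startswith("@RG"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs.
        for field in line.split("\t")[1:]:
            if field.startswith("SM:"):
                samples.add(field[3:])
    return samples


def sm_tag_matches_folder(sam_header: str, sample_folder: str) -> bool:
    """True only if the header carries exactly one SM value equal to the folder name."""
    return read_group_samples(sam_header) == {sample_folder}


# Hypothetical header for one cell of a sample stored in a folder named "RPE1-WT".
header = "@HD\tVN:1.6\tSO:coordinate\n@RG\tID:cell01\tSM:RPE1-WT\tPL:ILLUMINA\n"

print(sm_tag_matches_folder(header, "RPE1-WT"))
print(sm_tag_matches_folder(header, "RPE1_WT"))
```

Requiring exactly one SM value also flags headers with no read group or with several conflicting sample names, which a plain equality test on the first @RG line would miss.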