Installation

AmpSeeker itself does not require installation. However, you must have Snakemake and conda installed in order to run the workflow.

Step-by-step Installation

1. Install Miniforge

Miniforge is a minimal installer for conda with preinstalled conda-forge packages. It’s recommended over Miniconda or Anaconda because it uses conda-forge as the default channel, which is important for bioinformatics packages.

Download and install Miniforge from the conda-forge/miniforge repository: https://github.com/conda-forge/miniforge

For Linux:

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh

For macOS (Intel):

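# macOS ships with curl rather than wget; -L follows the release redirect, -O keeps the file name
curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-x86_64.sh
bash Miniforge3-MacOSX-x86_64.sh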

For macOS (Apple Silicon/M1/M2):

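# macOS ships with curl rather than wget; -L follows the release redirect, -O keeps the file name
curl -L -O https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh
bash Miniforge3-MacOSX-arm64.sh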
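If you are unsure which installer applies to your machine, the choice can be derived from the output of `uname`. The following is a minimal sketch, not part of AmpSeeker: the function name `miniforge_installer` is hypothetical, and it only covers the three platforms listed above.

```shell
# Hypothetical helper: map `uname -s` / `uname -m` output to one of the
# installer file names listed above. Other platforms are rejected.
miniforge_installer() {
    os="${1:-$(uname -s)}"     # e.g. Linux or Darwin (macOS)
    arch="${2:-$(uname -m)}"   # e.g. x86_64 or arm64
    case "$os-$arch" in
        Linux-x86_64)  name="Miniforge3-Linux-x86_64.sh" ;;
        Darwin-x86_64) name="Miniforge3-MacOSX-x86_64.sh" ;;
        Darwin-arm64)  name="Miniforge3-MacOSX-arm64.sh" ;;
        *)
            echo "no installer listed here for $os $arch" >&2
            return 1
            ;;
    esac
    echo "$name"
}

# Usage (prints the installer name for the current machine):
#   miniforge_installer
# Usage (explicit platform):
#   miniforge_installer Darwin arm64
```

The printed name can then be appended to the `https://github.com/conda-forge/miniforge/releases/latest/download/` prefix used in the commands above.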

2. Create a Conda Environment for Snakemake

After installing Miniforge, set up a conda environment specifically for Snakemake and AmpSeeker:

conda create -n ampseeker -c conda-forge -c bioconda snakemake pandas
conda activate ampseeker

This creates a dedicated environment called “ampseeker” with Snakemake and pandas installed.
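Before moving on, it can help to confirm that the tools you need are on your `PATH` once the environment is activated. The sketch below uses a hypothetical helper, `require_tools`, which is not part of AmpSeeker; it only checks that each command can be found, not that its version is suitable.

```shell
# Hypothetical helper: report whether each named command is on PATH.
# Returns 0 if all are found, 1 if any is missing.
require_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Usage, after `conda activate ampseeker`:
#   require_tools conda snakemake git
```

If `snakemake` is reported as missing, the most likely cause is that the ampseeker environment has not been activated in the current shell.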

3. Clone the AmpSeeker Repository

Next, clone the AmpSeeker GitHub repository to your system. You can use the command below, replacing my_repo_name with a name of your choice. Giving each clone its own name is useful if you run AmpSeeker on multiple datasets.

git clone https://github.com/sanjaynagi/AmpSeeker.git my_repo_name
cd my_repo_name

Alternatively, if you think you may want to contribute to AmpSeeker, you may wish to fork the repository to your GitHub account before cloning. For more information, please see the docs on contributing.

4. Installing Dependencies

When running AmpSeeker, it is important to pass the --use-conda flag. With this flag, Snakemake automatically installs all necessary packages into isolated software environments, so you do not have to install any other dependencies manually.

Each Snakemake rule in AmpSeeker specifies its own conda environment, which is created the first time you run the workflow with the --use-conda flag.
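To see which environment files the rules refer to, you can scan the rule files for `conda:` directives. This is a rough sketch under stated assumptions: the helper name `list_rule_envs` is hypothetical, it assumes rule files end in `.smk`, and it assumes each environment path is written in double quotes on the same line as `conda:` or on the line that follows.

```shell
# Hypothetical helper: list the environment files named in `conda:`
# directives of all *.smk files under a directory.
list_rule_envs() {
    rules_dir="$1"
    find "$rules_dir" -name '*.smk' -exec awk '
        # Remember that a conda: directive has been seen.
        /^[ \t]*conda:/ { in_conda = 1 }
        # Print the first double-quoted string on or after that line.
        in_conda && match($0, /"[^"]+"/) {
            print substr($0, RSTART + 1, RLENGTH - 2)
            in_conda = 0
        }
    ' {} + | sort -u
}

# Usage (the directory is an assumption about the repository layout):
#   list_rule_envs workflow/rules
```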

5. Testing the Setup

To test whether AmpSeeker is configured correctly, perform a dry run on the test dataset by running the following commands from the root directory of the repository:

conda activate ampseeker  # Make sure the environment is activated
snakemake --cores 1 --directory .test/ --use-conda --configfile .test/config/config_fastqauto.yaml -n

The -n flag performs a dry run, showing which steps would be executed without actually running them.
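If the dry run succeeds, the same command without -n executes the steps on the test dataset. The sketch below chains the two so that the real run only starts when the dry run passes. The function name `run_ampseeker_test` and the `SNAKEMAKE` override variable are hypothetical conveniences, not part of AmpSeeker; the arguments are those of the command above.

```shell
# Hypothetical wrapper: dry run first, then the real run if it succeeds.
# Set SNAKEMAKE to use a different executable (defaults to `snakemake`).
run_ampseeker_test() {
    snakemake_bin="${SNAKEMAKE:-snakemake}"
    # Same arguments as the dry-run command above, minus -n.
    set -- --cores 1 --directory .test/ --use-conda \
        --configfile .test/config/config_fastqauto.yaml
    "$snakemake_bin" "$@" -n || {
        echo "dry run failed; not starting the real run" >&2
        return 1
    }
    "$snakemake_bin" "$@"
}

# Usage, from the repository root with the ampseeker environment active:
#   run_ampseeker_test
```

Expect the first real run to take noticeably longer than later ones, because the per-rule conda environments are created at that point.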

Additional Resources

Conda Documentation: more information about conda is available at https://conda.io/projects/conda/en/latest/user-guide/index.html

Snakemake Documentation: detailed documentation for Snakemake is available at https://snakemake.readthedocs.io/