Installation

RNA-Seq-Pop itself does not require installation; however, users must have Snakemake and conda installed in order to run the workflow.

Conda

Conda is an open-source package and environment manager that lets users install a variety of packages and programs while keeping their dependencies in separate software environments. In general, we recommend using the Miniconda installer.

https://conda.io/projects/conda/en/latest/user-guide/install/index.html

Snakemake

Snakemake is a workflow management system for creating reproducible and scalable computational analyses. RNA-Seq-Pop is written in Snakemake, which is itself Python-based. The Snakemake authors encourage installation through conda.

https://snakemake.readthedocs.io/en/stable/getting_started/installation.html

You may also need pandas, which you can install with pip install pandas.
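Before moving on, it can help to confirm that both prerequisites are visible on your PATH. The following is a minimal sketch and not part of RNA-Seq-Pop; the helper name have_tool is made up for illustration.

```shell
# Hypothetical helper (not part of RNA-Seq-Pop): succeeds if a command is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the status of each prerequisite described above.
for tool in conda snakemake; do
    if have_tool "$tool"; then
        echo "$tool: found"
    else
        echo "$tool: not found, see the installation links above"
    fi
done
```

If snakemake was installed into a dedicated conda environment, that environment needs to be activated first, otherwise the check will report it as missing.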


RNA-Seq-Pop

The next step is to clone the RNA-Seq-Pop GitHub repository to your system. You may use the command below, replacing my_repo_name with a directory name of your choice. A distinct name is useful if you ever run RNA-Seq-Pop on multiple datasets.

git clone --recursive https://github.com/sanjaynagi/rna-seq-pop.git my_repo_name

Alternatively, if you think you may want to contribute to RNA-Seq-Pop, you may wish to fork the repository to your GitHub account prior to cloning. For more information, please see the docs on contributing.

As the workflow contains a submodule (compkaryo), we should use git clone --recursive to clone the repository.
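If the repository was cloned without --recursive, the submodule directory will be present but empty. A standard git remedy is to initialise the submodules after the fact. The sketch below wraps this in a hypothetical helper, init_submodules, which takes the path to your clone (my_repo_name in the command above); the helper is an illustration and is not shipped with RNA-Seq-Pop.

```shell
# Hypothetical helper (not part of RNA-Seq-Pop).
# "$1" is the path to an existing clone of the repository.
init_submodules() {
    # Fetch and check out any submodules that were skipped at clone time,
    # then list them; a leading "-" in the listing marks an uninitialised submodule.
    git -C "$1" submodule update --init --recursive &&
        git -C "$1" submodule status
}

# Example call, assuming the clone lives in ./my_repo_name:
# init_submodules my_repo_name
```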

Installing dependencies

When we run RNA-Seq-Pop, it is important to pass the --use-conda flag to Snakemake. This automatically installs all necessary packages into isolated software environments, so we do not have to install any other dependencies ourselves.

Please also ensure that conda's channel priority is set to ‘flexible’; otherwise, conda may raise errors when installing packages. You can do this with the following command:

conda config --set channel_priority flexible
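To confirm that the setting took effect, you can ask conda to print the current value. This is a small sketch; the wrapper name show_channel_priority is made up for illustration, and the guard only exists so that a missing conda produces a hint rather than an error.

```shell
# Hypothetical helper: print conda's current channel priority,
# or a hint on stderr if conda is not available.
show_channel_priority() {
    if command -v conda >/dev/null 2>&1; then
        conda config --show channel_priority
    else
        echo "conda not found on PATH" >&2
        return 1
    fi
}

show_channel_priority || echo "Install conda first, then re-run this check."
```

After running the command above, the reported value should be flexible.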

Testing the setup

To test whether RNA-Seq-Pop is configured correctly, you can perform a dry-run on the test dataset. From the root directory of the repository, run the following command:

snakemake --cores 1 --directory .test/ --use-conda --configfile .test/config/config_paired_end.yaml -n
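The -n flag makes this a dry-run: Snakemake lists the jobs it would run without executing them. Once the dry-run completes without errors, the same command without -n should execute the workflow on the test dataset. The sketch below wraps the command in a hypothetical function, run_test_workflow, so that both variants share the same arguments; the wrapper is an illustration and is not part of RNA-Seq-Pop.

```shell
# Hypothetical wrapper around the test command above.
# Any extra arguments (for example -n) are passed through to snakemake.
run_test_workflow() {
    snakemake --cores 1 --directory .test/ --use-conda \
        --configfile .test/config/config_paired_end.yaml "$@"
}

# Call from the root directory of the repository:
# run_test_workflow -n    # dry-run: list the jobs without executing them
# run_test_workflow       # full run on the test dataset
```

Expect the first full run to take noticeably longer than the dry-run, since --use-conda has to create the software environments before any jobs start.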