SCRAM

SCRAM is a lightweight Python package for aligning small RNA reads to one or more reference sequences and producing publication-quality images.

Developed by Stephen Fletcher in the laboratory of Prof. Bernie Carroll, University of Queensland.

Installation

Scram is written in Python 3.5. Install scram and its dependencies with pip:

pip install scram

Or download and extract the tarball, then run:

python setup.py install

Input File Format

Reference File : DNA nucleotides only (AGCT) - FASTA format

Sequence File : Collapsed (unique) reads - DNA nucleotides only (AGCT) - FASTA format

Post-processing of FASTQ reads to collapsed FASTA format can be carried out using the FASTX-Toolkit from the Hannon Lab. Collapsed reads are unique, and contain the read count in the header.
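To illustrate what collapsing means, here is a minimal Python sketch. It is not the FASTX-Toolkit implementation: `collapse_reads` and `to_fasta` are hypothetical helpers, and the `rank-count` header layout is an assumption based on the example sequence file in this section.

```python
from collections import Counter


def collapse_reads(reads):
    """Collapse reads into (header, sequence) pairs, most abundant first.

    Headers use an assumed 'rank-count' layout, e.g. '1-607041'.
    """
    counts = Counter(reads)
    # Sort by descending count, then by sequence so that ties are deterministic
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        ("{0}-{1}".format(rank, count), seq)
        for rank, (seq, count) in enumerate(ordered, start=1)
    ]


def to_fasta(collapsed):
    """Format collapsed reads as FASTA text, one sequence line per record."""
    return "".join(">{0}\n{1}\n".format(header, seq) for header, seq in collapsed)


# Toy input: six reads, three of them unique
reads = ["TCGGACC", "TCGGACC", "AAATTT", "TCGGACC", "AAATTT", "GGGCCC"]
print(to_fasta(collapse_reads(reads)), end="")
```

Each unique sequence appears once in the output, with its number of occurrences carried in the header.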

An example of the required sequence file format:
head seq1.fa
>1-607041
TCGGACCAGGCATCATTCCCC
>2-202886
TCGGACCAGGCTTCATACCCC
>3-71446
TCCCAAATATAGACAAAGCA
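A collapsed file in this format can be checked before running an analysis. The sketch below is a hypothetical reader, not SCRAM's own parser. It assumes that the read count is the number after the last hyphen in each header and that every sequence occupies a single line.

```python
def read_collapsed_fasta(lines):
    """Yield (sequence, count) pairs from collapsed FASTA lines.

    Assumes '>id-count' headers and one sequence line per record.
    """
    count = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The read count is assumed to follow the last hyphen in the header
            count = int(line[1:].rsplit("-", 1)[1])
        else:
            if count is None:
                raise ValueError("sequence line without a preceding header")
            yield line.upper(), count
            count = None


# The three records from the example file
example = [
    ">1-607041", "TCGGACCAGGCATCATTCCCC",
    ">2-202886", "TCGGACCAGGCTTCATACCCC",
    ">3-71446", "TCCCAAATATAGACAAAGCA",
]
records = list(read_collapsed_fasta(example))
total = sum(count for _, count in records)
print(len(records), "unique reads,", total, "reads in total")
```

To read from disk, pass an open file handle instead of the list, for example `read_collapsed_fasta(open("seq1.fa"))`.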

Usage

scram analysis_type reference_file [-h]
      [-s1 [SEQ_FILE_1 [SEQ_FILE_1 ...]]]
      [-s2 [SEQ_FILE_2 [SEQ_FILE_2 ...]]]
      [-nt SRNA_LEN] [-f FILE_NAME] [-p PROCESSES]
      [-min_read MIN_READ_SIZE] [-max_read MAX_READ_SIZE]
      [-min_count MIN_READ_COUNT] [-win SMOOTH_WIN_SIZE] [-ylim YLIM]
      [-no_csv] [-no_display] [-split] [-pub] [-V]

Analysis types

den : align reads of a single sRNA class (e.g. 21 nt) from one or more replicate sequence files to a single reference sequence (-s1 and -nt required)
mnt3dm : align 21, 22 and 24 nt reads from one or more replicate sequence files to a single reference sequence (-s1 required)
CDP : count reads of a single sRNA class (e.g. 21 nt) aligned to multiple reference sequences. Counts for two sets of one or more replicate sequence files are plotted as (x,y) coordinates, one point per reference (-s1, -s2 and -nt required)
CDP_single : count reads of a single sRNA class (e.g. 21 nt) aligned to multiple reference sequences. Counts are written to a .csv file with one column per read file. No plot is produced (-s1 and -nt required)

Flags

-h : Help message
-s1 : Sequence file 1 (if more than 1 replicate sequence file, read counts are averaged)
-s2 : Sequence file 2 (if more than 1 replicate sequence file, read counts are averaged)
-nt : sRNA length to analyse
-p : number of processes (CPU cores) to use in CDP analyses (default=4)
-f : Figure output file name (if not auto-generated). Use 'auto' to auto-generate
-min_read : Minimum length of sRNA reads used for normalisation (default=18)
-max_read : Maximum length of sRNA reads used for normalisation (default=32)
-min_count : Minimum read count for an sRNA to be aligned and used for normalisation (default=1)
-win : Window size for smoothing of den plots (default=50, min=6)
-ylim : +/- y-axis limit on plots
-no_display : Do not display plot on screen
-no_csv : Do not generate the .csv alignment file
-split : Split read alignment counts based on no. of alignments (i.e. if a read aligned in 2 positions, the read count at each position is halved)
-pub : Remove all axis labels and legends to prepare figures for publication
-V : Show program's version number and exit
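The arithmetic behind -split and behind replicate averaging (-s1, -s2) can be sketched in a few lines of Python. This only illustrates the flag descriptions above. `split_count` and `mean_count` are hypothetical names, not SCRAM functions, and any normalisation SCRAM applies before these steps is not modelled.

```python
def split_count(read_count, n_alignments):
    """Share a read's count equally among all positions it aligned to."""
    if n_alignments < 1:
        raise ValueError("a read must align at least once")
    return read_count / n_alignments


def mean_count(replicate_counts):
    """Average the counts observed across replicate sequence files."""
    return sum(replicate_counts) / len(replicate_counts)


# A read with count 100 that aligned at 2 positions contributes 50 to each
print(split_count(100, 2))

# Counts of 40, 60 and 50 in three replicates average to 50
print(mean_count([40, 60, 50]))
```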

den Example

scram den ./ref.fa -s1 seq1.fa -nt 24 -win 30 -f fig1.pdf
[Figure: example 1]
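When the same den analysis is needed for several sRNA classes, the command line can be assembled from a script. The following Python sketch uses only the flags documented above. `build_den_command` is a hypothetical helper and the output file names are placeholders. It prints the commands rather than running them.

```python
import shlex


def build_den_command(reference, seq_files, nt, window=50, figure=None):
    """Return the argument list for one `scram den` run."""
    cmd = ["scram", "den", reference, "-s1"] + list(seq_files)
    # -no_display keeps plots from opening on screen during batch runs
    cmd += ["-nt", str(nt), "-win", str(window), "-no_display"]
    if figure is not None:
        cmd += ["-f", figure]
    return cmd


for nt in (21, 22, 24):
    cmd = build_den_command("./ref.fa", ["seq1.fa"], nt, window=30,
                            figure="den_{0}nt.pdf".format(nt))
    # Print the command; pass `cmd` to subprocess.run to execute it
    print(" ".join(shlex.quote(part) for part in cmd))
```

Building the command as a list, rather than a single string, avoids quoting problems if file names contain spaces.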

mnt3dm Example

scram mnt3dm ./ref.fa -s1 seq1.fa -win 20 -ylim 110 -f fig2.pdf

[Figure: example 2]

CDP Example

scram CDP ./cDNAs.fa -s1 seq1.fa -s2 seq2.fa -nt 21 -f fig3.pdf -split
[Figure: example 3]

SCRAM

Small Complementary RnA Mapper : a quick and simple small RNA read analysis package
