Counts the number of spliced reads (reads that map across a splice junction) in a given BAM file of RNA-Seq reads aligned to a genome.
USAGE: perl CountSplicedReads.pl input_bam remove_temporary_files
|input_bam||Input bam file|
|remove_temporary_files||T or F; the default value is ‘T’ (true)|
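As an illustration of what "spliced" means at the alignment level, the following minimal Python sketch counts alignment records whose CIGAR string contains an `N` (skipped region) operation. It is not the logic of CountSplicedReads.pl, which is not shown here. It assumes the BAM file has already been converted to SAM text (for example with `samtools view`), and the function names are hypothetical. Note that it counts alignment records, so a read with several reported alignments is counted more than once.

```python
import re

def is_spliced(cigar: str) -> bool:
    """Return True if the CIGAR string contains an N (skipped region) operation."""
    return re.search(r"\d+N", cigar) is not None

def count_spliced_reads(sam_lines):
    """Count mapped SAM records whose CIGAR indicates a splice junction.

    Assumption: sam_lines is an iterable of SAM text lines (headers allowed).
    """
    count = 0
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:               # not a complete alignment record
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # 0x4 = read is unmapped
            continue
        if is_spliced(fields[5]):         # column 6 holds the CIGAR string
            count += 1
    return count
```

A possible use would be to pipe the SAM text into the function, e.g. `count_spliced_reads(sys.stdin)` after `samtools view input.bam`.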
Determines quality value encoding format in a given fastq file.
USAGE: perl DetermineFastqQualityEncoding.pl fq
|fq||Input fastq file|
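A common heuristic for this task looks at the lowest ASCII code among the quality characters. The sketch below is only an illustration of that heuristic, not the script's actual implementation. It assumes four-line fastq records, and the function names and returned labels are hypothetical. The result is a guess: a file that contains only high-quality bases can be ambiguous.

```python
def quality_lines(fastq_lines):
    """Yield the quality line (4th line) of each 4-line fastq record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 3:
            yield line.rstrip("\n")

def guess_quality_encoding(qualities):
    """Guess the quality offset from the smallest character seen.

    Characters below ';' (ASCII 59) only occur with a +33 offset;
    ';' to '?' (59-63) only occur in Solexa scores (offset 64, minimum -5);
    otherwise a Phred scale with offset 64 is assumed.
    """
    codes = [ord(c) for line in qualities for c in line]
    if not codes:
        raise ValueError("no quality characters found")
    lowest = min(codes)
    if lowest < 59:
        return "Phred+33"
    if lowest < 64:
        return "Solexa+64"
    return "Phred+64"
```

With a file handle `fh` opened on the fastq file, the two helpers can be combined as `guess_quality_encoding(quality_lines(fh))`.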
Validates order of paired-end reads in given fastq files.
USAGE: perl FastqPairedEndValidator.pl fq1 fq2
|fq1||Fastq format file for the first (left) pair|
|fq2||Fastq format file for the second (right) pair|
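The idea behind such a check can be sketched in Python as follows: walk through both files in parallel and compare read names after removing any pair tag. This is a hedged illustration, not the validator's own code. It assumes four-line fastq records and pair tags of the form `\1`/`\2` or `/1`/`/2`; all function names are hypothetical.

```python
import re
from itertools import zip_longest

# Assumption: pair tags are a trailing "\1", "\2", "/1" or "/2".
_PAIR_TAG = re.compile(r"[/\\][12]$")

def read_names(fastq_lines):
    """Yield the read name (first token of the header) of each 4-line record."""
    non_blank = (line for line in fastq_lines if line.strip())
    for i, line in enumerate(non_blank):
        if i % 4 == 0:
            yield line.rstrip("\n")[1:].split()[0]  # drop leading '@'

def base_name(name):
    """Remove a trailing pair tag, if present."""
    return _PAIR_TAG.sub("", name)

def first_mismatch(fq1_lines, fq2_lines):
    """Return the 1-based index of the first out-of-order record, or None."""
    pairs = zip_longest(read_names(fq1_lines), read_names(fq2_lines))
    for idx, (left, right) in enumerate(pairs, start=1):
        if left is None or right is None:       # files differ in length
            return idx
        if base_name(left) != base_name(right):
            return idx
    return None
```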
Adds the \1 and \2 suffixes (tags) to the read names of the first (left) and second (right) mates, respectively, of paired-end reads in the given fastq files.
USAGE: perl AddPairedEndSuffix.pl fq_in fq_out pair_tag
|fq_in||Input fastq format file|
|fq_out||Output fastq format file|
|pair_tag||Paired-end tag which is to be added (1 for left-end pair and 2 for right-end pair)|
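The transformation itself is simple and could look like the Python sketch below. It is an illustration under assumptions, not the script's code: records are taken to span exactly four lines, the tag is appended to the first whitespace-delimited token of the header line, and the separator defaults to the backslash used in the description above (pass `sep="/"` for the `/1`, `/2` convention). The function name is hypothetical.

```python
def add_pair_suffix(fastq_lines, pair_tag, sep="\\"):
    """Yield fastq lines with sep + pair_tag appended to each read name.

    Assumptions: 4-line records; pair_tag is 1 (left) or 2 (right).
    Names that already end with the tag are left unchanged.
    """
    if pair_tag not in (1, 2):
        raise ValueError("pair_tag must be 1 or 2")
    suffix = f"{sep}{pair_tag}"
    for i, line in enumerate(fastq_lines):
        line = line.rstrip("\n")
        if i % 4 == 0:                              # header line of a record
            name, _, rest = line.partition(" ")
            if not name.endswith(suffix):
                name += suffix
            line = f"{name} {rest}" if rest else name
        yield line
```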
Converts paired-end fastq files to a merged and sorted (on read name) SAM file.
USAGE: perl FqToSamPicard.pl fq1 fq2 out_tag quality_format
|fq1||Fastq format file for the first pair|
|fq2||Fastq format file for the second pair|
|out_tag||A string used as the prefix of the output SAM file name. The resulting file will be out_tag.merged.sorted.sam|
|quality_format||Fastq quality scale (“Standard”, “Solexa”, “Illumina”)|
Note: Picard tools are required for this program. The path to the Picard directory can be set in the script (the default path is /usr/bin/).
Picard’s FastqToSam program automatically converts quality values to the “Standard” (Phred) scale.
Fixes the order of appearance of paired-end reads in fastq files using a merged SAM file, and also separates unpaired reads.
The merged SAM file is generated from the paired-end fastq files using Picard’s FastqToSam program followed by its MergeSamFiles program. See the instructions below.
USAGE: perl UnmappedSamToFastq.pl mergedFqSam out_tag
|mergedFqSam||Merged, read-name-sorted SAM file generated from the raw (unordered) fastq files.|
|out_tag||A string used as the prefix of the output fastq file names. The resulting files will be: out_tag_1.fastq (left pair), out_tag_2.fastq (right pair) and out_tag.fastq (unpaired reads).|
Fix the order of appearance of paired-end reads in fastq format files:
Let’s say fastq1 and fastq2 files contain left and right mates of the paired-end reads, respectively.
- Convert the fastq files to SAM files (individually) using Picard’s FastqToSam program.
- Merge the two SAM files into one using Picard’s MergeSamFiles program, with the sort order set to “queryname”.
- Supply the merged SAM file to UnmappedSamToFastq to obtain the paired-end fastq files and a third fastq file with unpaired reads (reads for which a mate was not found).
Steps 1-2 can be performed by FqToSamPicard (described above).
To check if the reads in a given pair of fastq files are in correct order, run FastqPairedEndValidator (described above).
Please note that the UnmappedSamToFastq program expects “\1” and “\2” tags in the read names to distinguish between left and right end reads, respectively. If your fastq files do not contain these tags, please run AddPairedEndSuffix (described above) to add them to the reads in your fastq files.
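To make the role of the name sorting and the pair tags concrete, here is a minimal Python sketch of the re-pairing idea: in a name-sorted list, the two mates of a pair are adjacent, so any record without an adjacent mate is unpaired. This is an assumption about the general approach, not the actual code of UnmappedSamToFastq. Records are represented as hypothetical `(name, sequence, quality)` tuples, and both `\` and `/` are accepted as tag separators.

```python
import re

# Assumption: read names end with "\1", "\2", "/1" or "/2".
_TAGGED = re.compile(r"^(.*)[/\\]([12])$")

def split_pairs(records):
    """Split name-sorted (name, seq, qual) records into left, right, unpaired."""
    left, right, unpaired = [], [], []
    pending = None                      # (base name, tag, record) awaiting a mate
    for rec in records:
        match = _TAGGED.match(rec[0])
        if match is None:
            raise ValueError(f"read name without pair tag: {rec[0]}")
        base, tag = match.group(1), match.group(2)
        if pending is not None and pending[0] == base and pending[1] != tag:
            # Adjacent record is the mate: emit the pair in left/right order.
            mate = pending[2]
            first, second = (mate, rec) if pending[1] == "1" else (rec, mate)
            left.append(first)
            right.append(second)
            pending = None
        else:
            if pending is not None:     # previous record never found its mate
                unpaired.append(pending[2])
            pending = (base, tag, rec)
    if pending is not None:
        unpaired.append(pending[2])
    return left, right, unpaired
```

Because the left and right lists are filled together, the i-th entry of one is always the mate of the i-th entry of the other, which is the property the re-ordered fastq files need.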
FqToSamPicard also needs to be told the quality encoding format of the fastq files in order to run Picard tools. This format can be determined using DetermineFastqQualityEncoding (described above).
(Sanger or Illumina 1.9+ => “Standard”, Illumina 1.5+ => “Illumina”, Illumina 1.3+ => “Illumina”, Solexa => “Solexa”)
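The mapping in the parentheses above can be written as a small lookup table. The Python sketch below simply restates that mapping; the dictionary keys are the labels used in this text, and the assumption that DetermineFastqQualityEncoding reports exactly these strings is unverified. The function name is hypothetical.

```python
# Encoding label (as written above) -> quality_format argument of FqToSamPicard.
QUALITY_FORMAT = {
    "Sanger": "Standard",
    "Illumina 1.9+": "Standard",
    "Illumina 1.5+": "Illumina",
    "Illumina 1.3+": "Illumina",
    "Solexa": "Solexa",
}

def picard_quality_format(encoding):
    """Return the quality_format value for a detected encoding label."""
    try:
        return QUALITY_FORMAT[encoding]
    except KeyError:
        raise ValueError(f"unknown quality encoding: {encoding}") from None
```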
The overall workflow is shown below: