STAR --waspOutputMode SAMtag vs WASP

This time, we will compare STAR --waspOutputMode SAMtag with WASP. Both perform allele-specific read mapping.

STAR

STAR is a well-known, free aligner that maps reads to a reference genome.

vA tag: variant allele

  • 1 means the variant base in the read is the reference allele
  • 2 means the variant base in the read is the alternative allele
  • 3 means the variant base in the read is neither the reference nor the alternative allele
  • 4 means the variant base in the read is N

Examples for reads overlapping 1, 2, 4 or 8 variants, respectively:

  • vA:B:c,3
  • vA:B:c,2,1
  • vA:B:c,1,2,1,4
  • vA:B:c,3,2,2,2,2,2,2,2

vG tag: 0-based genomic coordinate of the variant

Examples for reads overlapping 1, 2, 4 or 8 variants, respectively:

  • vG:B:i,965124
  • vG:B:i,965349,965349
  • vG:B:i,1013489,1013540,1014227,1014273
  • vG:B:i,41303372,41303440,41303441,41303473,41303485,41303499,41303661,41303664
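The two tags are parallel arrays: the n-th value of vG is the coordinate of the variant whose allele code is the n-th value of vA. As a rough illustration, the sketch below pairs them up for one SAM record using only the Python standard library. The function names `parse_b_array` and `variants_in_read` and the example record are made up for this post; they are not part of STAR.

```python
# Labels for the vA codes listed above.
VA_LABELS = {1: "reference", 2: "alternative", 3: "other base", 4: "N"}


def parse_b_array(field):
    """Turn a SAM B-array field such as 'vA:B:c,2,1' into a list of ints."""
    tag, sam_type, value = field.split(":", 2)
    if sam_type != "B":
        raise ValueError(f"{tag} is not a B-array field: {field}")
    # value looks like 'c,2,1': a subtype letter followed by the numbers.
    return [int(x) for x in value.split(",")[1:]]


def variants_in_read(sam_line):
    """Return [(coordinate, allele label), ...] for one SAM record.

    Returns an empty list when the record carries no vA/vG tags.
    """
    fields = sam_line.rstrip("\n").split("\t")
    # Optional fields start at column 12; index them by their two-letter tag.
    tags = {f[:2]: f for f in fields[11:]}
    if "vA" not in tags or "vG" not in tags:
        return []
    alleles = parse_b_array(tags["vA"])
    coords = parse_b_array(tags["vG"])
    return [(pos, VA_LABELS.get(code, "unknown")) for pos, code in zip(coords, alleles)]


# Hypothetical SAM record (sequence and qualities shortened for readability).
record = (
    "read1\t0\tchr1\t1013480\t255\t100M\t*\t0\t0\tACGT\tIIII\t"
    "vA:B:c,1,2\tvG:B:i,1013489,1013540\tvW:i:1"
)
print(variants_in_read(record))
```

For this made-up record, the read carries the reference allele at 1013489 and the alternative allele at 1013540.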

vW tag: result of WASP filtering

  • 1 means alignment passed (vW:i:1)
  • 2 means multi-mapping read (vW:i:2)
  • 3 means variant base in the read is N (vW:i:3)
  • 4 means remapped read did not map (vW:i:4)
  • 5 means remapped read multi-maps (vW:i:5)
  • 6 means remapped read maps to a different locus (vW:i:6)
  • 7 means read overlaps more than 10 variants (vW:i:7)
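In practice, only alignments with vW:i:1 are kept for allele-specific analysis. The sketch below shows one way to do that filtering on SAM text lines in plain Python. The function names and the `keep_untagged` switch are assumptions of this example, added in case some alignments carry no vW tag at all; they are not STAR options.

```python
def wasp_status(sam_line):
    """Return the vW value of a SAM record as an int, or None if the tag is absent."""
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith("vW:i:"):
            return int(field[5:])
    return None


def keep_wasp_passed(sam_lines, keep_untagged=False):
    """Yield header lines and the alignments whose vW tag equals 1.

    keep_untagged (hypothetical switch): also keep alignments without a vW tag.
    """
    for line in sam_lines:
        if line.startswith("@"):  # SAM header lines are passed through unchanged
            yield line
            continue
        status = wasp_status(line)
        if status == 1 or (status is None and keep_untagged):
            yield line


# Made-up records: one passes, one maps to a different locus, one has no vW tag.
sam = [
    "@HD\tVN:1.4",
    "r1\t0\tchr1\t100\t255\t4M\t*\t0\t0\tACGT\tIIII\tvW:i:1",
    "r2\t0\tchr1\t200\t255\t4M\t*\t0\t0\tACGT\tIIII\tvW:i:6",
    "r3\t0\tchr1\t300\t255\t4M\t*\t0\t0\tACGT\tIIII",
]
kept = list(keep_wasp_passed(sam))
```

Here `kept` contains the header and `r1` only; passing `keep_untagged=True` would also keep `r3`.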

WASP

WASP is a pipeline that corrects for allelic mapping biases, among other things.

STAR vs WASP

| | STAR (version 2.7.1a) | WASP (version 0.3.4) |
|---|---|---|
| removes reads that overlap indels | no | yes |
| is able to change the maximum number of SNPs that can overlap a read | no (default = 10) | yes (default = 6) |
| is able to take phase information into account | no | yes |
| algorithms used | STAR, grep, HTSeq | snp2h5 or extract_vcf_snps.sh, STAR, find_intersecting_snps.py, STAR, filter_remapped_reads.py, samtools merge, samtools sort, samtools index, HTSeq |

Neither tool can use reads that overlap insertions or deletions (indels): WASP removes those reads, while STAR ignores the indels. However, we can:

  • keep reads that overlap indels with WASP by removing indels from the VCF file
  • remove reads that overlap indels with STAR by comparing the read positions with the indel positions
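Both workarounds can be sketched in a few lines of Python. The code below is only an illustration under simple assumptions: VCF data lines are tab-separated with REF in column 4 and ALT in column 5, read positions are 1-based as in SAM, and the indel intervals are already available as (start, end) pairs on the read's chromosome. All function names are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def is_snp_record(vcf_line):
    """True if a VCF data line has a single-base REF and only single-base ALTs."""
    cols = vcf_line.rstrip("\n").split("\t")
    ref, alts = cols[3], cols[4].split(",")
    return len(ref) == 1 and all(len(a) == 1 and a.upper() in "ACGTN" for a in alts)


def aligned_blocks(pos, cigar):
    """Return the 1-based inclusive reference blocks covered by a read.

    Skipped regions (N, i.e. introns in spliced alignments) split the blocks.
    """
    blocks, start, cur = [], pos, pos
    for n, op in CIGAR_OP.findall(cigar):
        n = int(n)
        if op in "MD=X":  # operations that consume the reference within a block
            cur += n
        elif op == "N":  # close the current block and jump over the intron
            if cur > start:
                blocks.append((start, cur - 1))
            cur += n
            start = cur
    if cur > start:
        blocks.append((start, cur - 1))
    return blocks


def overlaps_indel(pos, cigar, indel_intervals):
    """True if any aligned block intersects any (start, end) indel interval."""
    return any(
        s <= b_end and e >= b_start
        for b_start, b_end in aligned_blocks(pos, cigar)
        for s, e in indel_intervals
    )
```

For the first workaround, keep only the VCF data lines for which `is_snp_record` returns True before running WASP. For the second, drop the STAR alignments for which `overlaps_indel` returns True; an indel that falls inside a spliced-out intron does not count as an overlap in this sketch.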

Conclusion

In conclusion, WASP (the original algorithm) is configurable, but it requires many tools and is very slow compared to STAR --waspOutputMode (the re-implementation).
