Salmon vs kallisto on RNA-seq benchmarking datasets

As part of a presentation on bias modeling for RNA-seq data, I put together a Shiny app for visually exploring the differences between Salmon and kallisto on all samples of two RNA-seq datasets: GEUVADIS, a large RNA-seq project with annotated batches and sequencing centers, and SEQC, a large RNA-seq benchmarking project. Running the two programs on these large, real datasets reveals many differences in transcript abundance quantification, in particular in the stability and reliability of abundance estimates across sequencing centers.

For both datasets, we see that Salmon gives more consistent estimates across the labs performing the experiment, as described in detail in the Salmon paper. You can click on the overview plots to explore the counts estimated by each method for individual genes. Often the inconsistencies are driven by mis-estimation of which isoform is expressed within a gene.
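To make "consistent across labs" concrete, here is a minimal sketch of one way such consistency could be scored. It is not the code behind the Shiny app or the Salmon paper: the function name `center_spread`, the pseudocount, and the toy numbers are all assumptions made for illustration, and the two toy tables are labeled `method_a` and `method_b` because they are invented values, not output from Salmon or kallisto.

```python
import math
from collections import defaultdict


def center_spread(estimates, centers, pseudocount=1.0):
    """Score between-center inconsistency for each transcript.

    estimates: dict mapping transcript ID -> list of estimated counts,
               one value per sample.
    centers:   list of sequencing-center labels, one per sample,
               in the same order as the counts.
    Returns a dict mapping transcript ID -> difference between the largest
    and smallest log2 mean count across centers (0 means identical means).
    """
    spread = {}
    for transcript, values in estimates.items():
        # Group this transcript's per-sample estimates by center.
        by_center = defaultdict(list)
        for value, center in zip(values, centers):
            by_center[center].append(value)
        # Mean per center, on a log2 scale; the pseudocount avoids log2(0).
        log_means = [
            math.log2(sum(v) / len(v) + pseudocount) for v in by_center.values()
        ]
        spread[transcript] = max(log_means) - min(log_means)
    return spread


# Hypothetical toy data: four samples, two from each of two centers.
centers = ["A", "A", "B", "B"]
method_a = {"tx1": [100, 110, 105, 95], "tx2": [50, 52, 49, 51]}
method_b = {"tx1": [100, 110, 30, 34], "tx2": [50, 52, 120, 118]}

spread_a = center_spread(method_a, centers)
spread_b = center_spread(method_b, centers)

for tx in sorted(method_a):
    print(f"{tx}: method_a spread = {spread_a[tx]:.2f}, "
          f"method_b spread = {spread_b[tx]:.2f}")
```

In the toy `method_b` table, the two transcripts shift in opposite directions between centers, which mimics the isoform mis-assignment pattern described above: the gene-level total stays roughly the same while the estimate moves from one isoform to the other.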

  • The Shiny app comparing Salmon and kallisto abundance estimates on these benchmarking datasets is available in the Salmon kallisto diffs GitHub repository.

Example images (link to presentation slides):
