compariSeq: rethinking sequence logos

by Sean McKenna, Philip S. Quinan, Alex Bigelow
[Figure: compariSeq sequence logos visualization]

One of the most common methods for displaying biological sequences is the sequence logo. At each position in a sequence, a logo encodes both the relative frequency of each amino acid and the information content (in bits). However, sequence logos have several major flaws that stem from their stacked bar chart layout and their color choices. We introduce a redesign of the traditional sequence logo: Comparing Sequence Charts (compariSeq), designed for the task of comparing multiple biological sequences. This task was motivated by informal interviews with three biologists of varying expertise. These interviews yielded several specific observations, which inspired the design of compariSeq. All relevant data encoded in traditional sequence logos is preserved. In addition, attention is directed to the most important data, colors are more perceptually accessible, and direct comparison at particular locations is supported.
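To make the traditional encoding concrete, here is a minimal Python sketch of how a standard sequence logo derives its stack heights. It is not compariSeq's implementation. The function name `logo_heights` and its parameters are hypothetical, the input is assumed to be gap-free aligned protein sequences of equal length, and the small-sample correction used by some logo tools is omitted. For each column, the information content is log2(20) minus the Shannon entropy of the observed residue frequencies, and each residue's height is its frequency times that information content.

```python
import math
from collections import Counter


def logo_heights(sequences, alphabet_size=20):
    """Return one dict per column mapping residue -> stack height in bits.

    Hypothetical helper: assumes aligned, gap-free sequences and applies
    no small-sample correction.
    """
    if not sequences:
        raise ValueError("at least one sequence is required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    columns = []
    for i in range(length):
        # Relative frequency of each residue observed in this column.
        counts = Counter(s[i] for s in sequences)
        total = sum(counts.values())
        freqs = {residue: n / total for residue, n in counts.items()}

        # Shannon entropy of the column, in bits.
        entropy = -sum(f * math.log2(f) for f in freqs.values())

        # Information content: maximum possible entropy minus observed entropy.
        info = math.log2(alphabet_size) - entropy

        # Each residue's letter height is its share of the column's information.
        columns.append({residue: f * info for residue, f in freqs.items()})
    return columns


heights = logo_heights(["AC", "AD", "AC", "AD"])
```

In this toy alignment the first column is fully conserved, so its single letter receives the full log2(20) bits, while the second column is split evenly between two residues and each letter receives half of the remaining log2(20) - 1 bits. Because the letters are stacked, comparing one residue across positions or across logos means comparing bar segments that do not share a baseline.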