Sourmash

Quickly searches, compares, and analyzes genomic and metagenomic data sets

https://github.com/sourmash-bio/sourmash

The module can summarise data from the following sourmash output files (descriptions from command line help output):

  • sourmash compare
    • create a similarity matrix comparing many samples.
  • sourmash gather
    • search a metagenome signature against databases.

Additional information on sourmash and its outputs is available on the sourmash documentation website.

The sourmash gather section is modelled after the Kraken module. It builds a bar graph showing coverage for the five genomes that are most covered across all samples. The number of top genomes can be customized in the config file:

sourmash:
  gather:
    top_n: 5
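
The effect of top_n can be illustrated with a minimal sketch. This is not the module's actual code: the function name `top_genomes`, the input layout (one dictionary of genome name to covered fraction per sample), and the choice to rank genomes by coverage summed over samples are all assumptions made for illustration.

```python
from collections import Counter


def top_genomes(samples, top_n=5):
    """Return the names of the top_n genomes ranked by total coverage.

    `samples` is a hypothetical structure:
    {sample_name: {genome_name: fraction_covered}}.
    """
    totals = Counter()
    for genomes in samples.values():
        for name, fraction in genomes.items():
            # Assumption: "covered most by all samples" means the
            # largest coverage summed over every sample.
            totals[name] += fraction
    # Sort by total (descending), then by name so ties are deterministic.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


if __name__ == "__main__":
    example = {
        "sample_1": {"genome_A": 0.5, "genome_B": 0.2},
        "sample_2": {"genome_A": 0.1, "genome_C": 0.4},
    }
    print(top_genomes(example, top_n=2))
```

Raising top_n in the config would, under this reading, simply widen the slice of ranked genomes that get their own bar in the plot.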

File search patterns

sourmash/compare:
  fn: "*.labels.txt"
sourmash/gather:
  contents: intersect_bp,f_orig_query,f_match,f_unique_to_query,f_unique_weighted,
  num_lines: 1
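
As a rough illustration of what these patterns express, the sketch below checks a filename against the `fn` glob and looks for the `contents` string within the first `num_lines` lines of a file. The helper names `matches_fn` and `matches_contents` are hypothetical, and the real file search may differ in detail; this only mirrors the two rules shown above.

```python
import fnmatch
from itertools import islice

# Values copied from the search patterns above.
COMPARE_FN = "*.labels.txt"
GATHER_CONTENTS = (
    "intersect_bp,f_orig_query,f_match,f_unique_to_query,f_unique_weighted,"
)


def matches_fn(filename, pattern=COMPARE_FN):
    """True if the bare filename matches the glob pattern (case-sensitive)."""
    return fnmatch.fnmatchcase(filename, pattern)


def matches_contents(lines, needle=GATHER_CONTENTS, num_lines=1):
    """True if `needle` occurs in any of the first `num_lines` lines."""
    return any(needle in line for line in islice(lines, num_lines))


if __name__ == "__main__":
    print(matches_fn("samples.labels.txt"))
    header = GATHER_CONTENTS + "average_abund\n"
    print(matches_contents([header, "1000,0.5,0.4,0.3,0.2,1.0\n"]))
```

With num_lines set to 1, only the CSV header row is inspected, so a gather file is recognised by its leading column names rather than by its filename.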