Bacterial Genomics - Summary Report - Mock Data
Overview New Samples
There are 5 new sequencing results in this BigBacter run.
The summary table below lists sample metadata and maps each sequencing ID to its corresponding WDRS CASE_ID, where available.
| WA ID | ALT ID | WDRS ID | Collection Date | Patient County | Submitter County | Submitter Facility |
|---|---|---|---|---|---|---|
| WA1000000 | 2024JQ-00001 | 100000000 | 2024-01-01 | Thurston | Spokane | A Hospital |
| WA1200000 | 2024JQ-00002 | 200000000 | 2024-01-01 | King | Pierce | A Laboratory |
| WA1300000 | 2024JQ-00003 | 300000000 | 2024-01-01 | Whatcom | Whatcom | Microbiology LHJ |
| WA1400000 | 2024JQ-00004 | 400000000 | 2024-01-01 | Snohomish | King | B Hospital |
| WA1500000 | 2024JQ-00005 | 500000000 | 2024-01-01 | Pierce | King | B Laboratory |
The new isolates were analyzed using Gubbins and classified by BigBacter as follows:
| ID | QUAL | TAXA | GENOMIC CLUSTER | CLUSTER (n=) | PARTITION (Gubbins) | PARTITION (n=) |
|---|---|---|---|---|---|---|
| WA1000000 | PASS | Klebsiella_pneumoniae | 4 | 22 | 5 | 1 |
| WA1200000 | PASS | Klebsiella_pneumoniae | 4 | 22 | 4 | 3 |
| WA1300000 | PASS | Klebsiella_pneumoniae | 208 | 1 | NA | NA |
| WA1400000 | PASS | Klebsiella_pneumoniae | 301 | 6 | 3 | 1 |
| WA1500000 | PASS | Klebsiella_pneumoniae | 318 | 1 | NA | NA |
Sequences that resulted in new genomic clusters are excluded from tree partitioning.
Of the new isolates, the following resulted in new genomic clusters:
| ID | QUAL | TAXA | GENOMIC CLUSTER | CLUSTER (n=) |
|---|---|---|---|---|
| WA1300000 | PASS | Klebsiella_pneumoniae | 208 | 1 |
| WA1500000 | PASS | Klebsiella_pneumoniae | 318 | 1 |
Failed Isolates
The following isolates failed quality control.
Recombination
Bacterial recombination is the process by which bacteria exchange genetic material with each other, introducing new DNA sequences into their genomes. It is important to account for recombination in genomic analyses because recombination events can be mistaken for mutation events, which distorts the metrics used to characterize relationships between sequences, such as single nucleotide polymorphism (SNP) distances. The bioinformatics pipelines developed at WA PHL use Gubbins, a method to detect and control for recombination. If recombination is detected, the affected sites are masked in the SNP distance calculations and in the phylogenetic trees.
We evaluate recombination in multiple ways. First, the number of sites where recombination was detected is divided by the total length of the core genome. If this proportion is more than 5% in a genomic cluster, the Gubbins outputs are used. If it is more than 1% but less than 5%, the Snippy and Gubbins outputs are reviewed jointly to see whether they yield different interpretations. If the interpretations differ, we will most likely use the Gubbins outputs for the genomic interpretation.
| TAXA | GENOMIC_CLUSTER | MAX_%Recomb_Detected |
|---|---|---|
| Klebsiella_pneumoniae | 4 | 7.038 |
| Klebsiella_pneumoniae | 301 | 1.803 |
Sequences that resulted in new genomic clusters are excluded from this calculation.
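The recombination check described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used to generate this report: the function names and the returned labels are hypothetical, and the report does not say what happens at exactly 5% or at 1% and below, so those cases are returned as "not_specified".

```python
def recombination_percent(recombinant_sites: int, core_genome_length: int) -> float:
    """Percent of core-genome sites where recombination was detected."""
    if core_genome_length <= 0:
        raise ValueError("core_genome_length must be positive")
    return 100.0 * recombinant_sites / core_genome_length


def choose_outputs(percent: float) -> str:
    """Map a recombination percentage to the review step described in the text.

    The labels are hypothetical names for this sketch.
    """
    if percent > 5.0:
        # More than 5%: the Gubbins outputs are used.
        return "use_gubbins"
    if 1.0 < percent < 5.0:
        # More than 1% but less than 5%: review Snippy and Gubbins jointly.
        return "review_snippy_and_gubbins"
    # Exactly 5%, or 1% and below: not covered by the report text.
    return "not_specified"
```

Applied to the table above, genomic cluster 4 (7.038%) falls in the first branch and genomic cluster 301 (1.803%) falls in the joint-review branch.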
SNP Min and Max Distances
The minimum and maximum SNP distances calculated using Gubbins are summarized below.
| Source | MAX | MIN |
|---|---|---|
| 1770000000-Klebsiella_pneumoniae-00004-core-snps_dist.gubbins-long | 315 | 1 |
| 1770000000-Klebsiella_pneumoniae-00301-core-snps_dist.gubbins-long | 25638 | 62 |
Sequences that resulted in new genomic clusters are excluded from this calculation.
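As a rough illustration of how a summary like the one above could be derived, the sketch below reads a long-format distance table and returns its minimum and maximum. The file layout is an assumption (tab-separated, three columns: first ID, second ID, SNP distance), as is the exclusion of self-comparisons; neither is confirmed by this report, and the function names are hypothetical.

```python
import csv
import io
from typing import Iterable, Iterator, TextIO, Tuple

Pair = Tuple[str, str, int]


def read_long_distances(handle: TextIO) -> Iterator[Pair]:
    """Yield (id_a, id_b, distance) from an assumed three-column TSV."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        id_a, id_b, raw = row
        try:
            yield id_a, id_b, int(raw)
        except ValueError:
            continue  # skip a header line or a non-integer value


def min_max_distance(pairs: Iterable[Pair]) -> Tuple[int, int]:
    """Return (min, max) SNP distance, ignoring self-comparisons (assumption)."""
    distances = [dist for id_a, id_b, dist in pairs if id_a != id_b]
    if not distances:
        raise ValueError("no pairwise distances found")
    return min(distances), max(distances)


# Usage with made-up sample names and distances:
example = io.StringIO("s1\ts1\t0\ns1\ts2\t4\ns2\ts3\t17\n")
example_min, example_max = min_max_distance(read_long_distances(example))
```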
Genomic Linkages
Based on the SNP distances calculated using Gubbins, the following very strong (0–5 SNPs), strong (6–10 SNPs), and intermediate (11–50 SNPs) genomic linkages were identified between the new isolate(s) and other sequences within the corresponding genomic clusters.
| ID | VeryStrongGenLinks (0-5 SNP) | StrongGenLinks (6-10 SNP) | InterGenLinks (11-50 SNP) |
|---|---|---|---|
| WA1200000 | WA0500000 | WA0800000, WA0900000 |
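The linkage categories above follow directly from the SNP thresholds, which can be sketched as below. This is a minimal illustration, not the report's implementation: the function names, the category labels, and the input shape (an iterable of ID pairs with a distance) are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Set, Tuple


def classify_linkage(snp_distance: int) -> Optional[str]:
    """Return the linkage category for a SNP distance, or None above 50 SNPs."""
    if snp_distance < 0:
        raise ValueError("SNP distance cannot be negative")
    if snp_distance <= 5:
        return "very_strong"   # 0-5 SNPs
    if snp_distance <= 10:
        return "strong"        # 6-10 SNPs
    if snp_distance <= 50:
        return "intermediate"  # 11-50 SNPs
    return None


def links_for_new_isolates(
    pairs: Iterable[Tuple[str, str, int]],
    new_ids: Set[str],
) -> Dict[str, Dict[str, List[str]]]:
    """Group linked sequence IDs by category for each new isolate."""
    links: Dict[str, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for id_a, id_b, distance in pairs:
        if id_a == id_b:
            continue  # ignore self-comparisons
        category = classify_linkage(distance)
        if category is None:
            continue
        # A pair may list the new isolate in either position.
        for new, other in ((id_a, id_b), (id_b, id_a)):
            if new in new_ids:
                links[new][category].append(other)
    return {new: dict(by_cat) for new, by_cat in links.items()}
```

For example, with made-up distances, `links_for_new_isolates([("new_1", "old_a", 3), ("old_b", "new_1", 20)], {"new_1"})` places `old_a` under `very_strong` and `old_b` under `intermediate` for `new_1`.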
Metadata
This is an overview of the metadata for each genomic cluster that contains new isolates. The facilities listed are the submitting facilities, and the counties listed are the counties of those submitting facilities.
| Taxa_GenomicCluster | Min_CollDate | Max_CollDate | All_Counties | New_Counties | All_Facilities | New_Facilities | All_IDs | New_IDs | Same_DOB_Isolates |
|---|---|---|---|---|---|---|---|---|---|
| Klebsiella_pneumoniae_208 | 01-30-2022 | 2024-01-01 | Skagit, Whatcom | Whatcom | A Hospital, X Laboratory | A Hospital | WA1300000 | WA1300000 | No isolates from the same case |
| Klebsiella_pneumoniae_301 | 02-30-2020 | 2024-01-01 | King | King | B Hospital | B Hospital | WA1400000 | WA1400000 | No isolates from the same case |
| Klebsiella_pneumoniae_318 | 03-30-2023 | 2024-01-01 | Pierce, Snohomish, King | King | C Hospital, General Hospital, A Laboratory | A Laboratory | WA1500000 | WA1500000 | No isolates from the same case |
| Klebsiella_pneumoniae_4 | 04-30-2021 | 2024-01-01 | Chelan, Grant, Spokane | Spokane | Medical Center, D Health, B Laboratory | B Laboratory | WA1200000, WA1000000 | WA1200000, WA1000000 | DOB: 1999-01-01 IDs: WA0800000, WA0900000 |
Isolates listed as having the same DOB may or may not be from the same case. Check against epi data to confirm that the listed isolates are indeed from the same case.
Resources
The code to generate this report is available here:
https://github.com/NW-PaGe/BacterialGenomicsSummaryOutput
The following bioinformatics methods were used by WA PHL to generate some of the data summarized in this report:
BigBacter bioinformatics pipeline https://github.com/doh-jdj0303/bigbacter-nf
Snippy https://github.com/tseemann/snippy
Gubbins https://github.com/nickjcroucher/gubbins