Computer-aided selection of SCCmec typing marker sets
Objectives
SCCmec is the defining feature of MRSA. It is a large mobile genetic element that exhibits complex diversity. SCCmec typing and subtyping are extensively used in combination with multilocus sequence typing to define MRSA clones. In general, the five SCCmec types are defined on the basis of the combination of ccr and mec classes, and the subtypes are defined by variation involving recombination and the activity of mobile genetic elements in the “junkyard” regions. However, as more SCCmec elements are analysed, more homoplastic variation is revealed, and it is increasingly difficult to define an internally consistent hierarchical classification system. The objective of this study was to take a deliberately naïve approach to the development of an SCCmec typing method, with the hypothesis being that this will be straightforward and that the resulting marker set will be highly discriminatory.
Methods
Every known SCCmec variant was regarded as an array of present or absent binary elements, and converted into a pseudo-DNA sequence. In this way the sum of knowledge regarding SCCmec diversity was converted into a pseudo-sequence alignment. This was analysed using the computer program Minimum SNPs so as to identify sets of binary markers with optimised Simpson’s Index of Diversity (D).
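The encoding and scoring steps described above can be illustrated with a minimal Python sketch. It is not the Minimum SNPs program: the element names, variant names, and presence/absence values are invented toy data, the choice of "A" for present and "T" for absent is an arbitrary assumption, and D is computed with the commonly used Hunter-Gaston form of Simpson's Index of Diversity, D = 1 - sum(n_j(n_j - 1)) / (N(N - 1)), which is assumed here rather than taken from the study.

```python
from collections import Counter

# Hypothetical toy data: each variant is an array of present (1) / absent (0)
# binary elements. Names and values are illustrative only.
ELEMENTS = ["elemA", "elemB", "elemC"]
VARIANTS = {
    "variant1": [1, 0, 0],
    "variant2": [1, 1, 0],
    "variant3": [0, 1, 1],
    "variant4": [0, 1, 1],
}


def to_pseudo_sequence(bits, present="A", absent="T"):
    """Encode a presence/absence array as a pseudo-DNA string."""
    return "".join(present if b else absent for b in bits)


def simpsons_d(profiles):
    """Simpson's Index of Diversity (Hunter-Gaston form, an assumption here)."""
    n = len(profiles)
    if n < 2:
        return 0.0
    counts = Counter(profiles)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def d_for_markers(alignment, marker_idx):
    """D obtained when variants are typed using only the chosen positions."""
    profiles = [tuple(seq[i] for i in marker_idx) for seq in alignment.values()]
    return simpsons_d(profiles)


# The pseudo-sequence alignment: one equal-length string per variant.
alignment = {name: to_pseudo_sequence(bits) for name, bits in VARIANTS.items()}
```

With this toy alignment, `d_for_markers(alignment, [0, 1])` scores the first two elements as a marker set. Adding the third element cannot raise D further, because variant3 and variant4 have identical profiles.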
Results
The sets of binary markers obtained possessed greater resolving power than the conventional marker sets. For example, six binary targets can resolve 21 SCCmec variants. This emphasises the homoplastic nature of SCCmec variation and the consequent difficulty of generating a strictly hierarchical typing scheme. Nevertheless, the types defined by this marker set were broadly consistent with accepted SCCmec nomenclature, and it was possible to modify the marker set slightly so as to make concordance with accepted nomenclature essentially complete. A third marker set was developed that was biased towards resolving the most commonly encountered variants. A key for conveniently working backwards from binary marker genotypes to SCCmec types was assembled. Finally, it was demonstrated that mecA single nucleotide polymorphisms (SNPs) could be added to the pseudo-sequence alignment so as to make an alignment that contains both binary marker and SNP information. Analysis of this using Minimum SNPs provided a D-maximised marker set that is a mixture of binary markers and SNPs.
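The following sketch shows one way a D-maximised marker set and a genotype-to-type key could be derived from a pseudo-sequence alignment. It uses a simple greedy forward search, which is an assumption made for illustration and may differ from the search actually performed by Minimum SNPs. The alignment is invented toy data, and the function names `greedy_markers` and `build_key` are hypothetical. Because the search works on characters, the same code would apply unchanged to an alignment whose columns mix binary markers and real SNP bases.

```python
from collections import Counter, defaultdict

# Hypothetical pseudo-sequence alignment (toy data, not real SCCmec variants).
ALIGNMENT = {
    "v1": "ATTA",
    "v2": "AATA",
    "v3": "TAAA",
    "v4": "TATA",
    "v5": "TTTA",
}


def simpsons_d(profiles):
    """Simpson's Index of Diversity (Hunter-Gaston form, assumed)."""
    n = len(profiles)
    if n < 2:
        return 0.0
    counts = Counter(profiles)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def greedy_markers(alignment, max_markers):
    """Greedy forward selection: repeatedly add the position that raises D most."""
    seqs = list(alignment.values())
    n_sites = len(seqs[0])
    chosen, best_d = [], 0.0
    while len(chosen) < max_markers:
        scored = [
            (simpsons_d([tuple(s[j] for j in chosen + [i]) for s in seqs]), i)
            for i in range(n_sites)
            if i not in chosen
        ]
        if not scored:
            break
        # Ties are broken in favour of the lowest position index.
        d, i = max(scored, key=lambda t: (t[0], -t[1]))
        if d <= best_d:  # no remaining position improves resolution
            break
        chosen.append(i)
        best_d = d
    return chosen, best_d


def build_key(alignment, markers):
    """Map each marker genotype back to the variants that share it."""
    key = defaultdict(list)
    for name, seq in alignment.items():
        key["".join(seq[i] for i in markers)].append(name)
    return dict(key)


markers, d_value = greedy_markers(ALIGNMENT, max_markers=3)
key = build_key(ALIGNMENT, markers)
```

In this toy case three positions are enough to give every variant a unique genotype, so `key` maps each genotype to exactly one variant. A greedy search is not guaranteed to find the globally optimal set; it is used here only because it is short and easy to follow.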
Conclusions
Computerized derivation of markers for SCCmec typing is conceptually simple and highly effective.