The continuing evolution of seasonal influenza viruses causes recurring epidemics and significant mortality worldwide, and requires vaccines to be updated annually. Co-occurring mutations in the surface glycoproteins haemagglutinin (HA) and neuraminidase (NA) have been suggested to interact synergistically, either conferring a combined fitness benefit or allowing one mutation to stabilize another. Antigenic drift is a major contributor to changes in the HA and NA glycoproteins and often results in immune escape. In this study, we analysed the HA and NA proteins of influenza viruses A/H3N2 and A/H1N1 to identify co-occurring mutations and to understand their relationships to one another, their temporal dynamics, and their role in antigenic evolution.
Using association rule mining, our tool detected 64 clusters in subtype H3N2, comprising both well-known key mutations responsible for the antigenic drift of this subtype and previously undescribed groups. Similarly, 39 clusters were uncovered in subtype H1N1. The majority of the identified clusters were associated with known antigenic sites and involved mutations in both HA and NA, indicating synergistic functions of HA and NA. In addition, we identified emerging and disappearing N-glycosylation sites, which are crucial in post-translational processes that influence protein stability and function (e.g., the emergence of a site at amino acid position 339 in NA and the disappearance of the site at position 187 in HA in A/H3N2), suggesting the importance of the HA-NA balance.
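To illustrate the general idea behind association rule mining on co-occurring mutations, the following minimal sketch computes support and confidence for pairs of mutations across a set of strains. It is not the tool described here: the function name `pair_rules`, the thresholds, and the mutation labels (`HA:m1`, `NA:m2`, `HA:m3`) are hypothetical placeholders, and the restriction to pairwise rules is a simplifying assumption.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for item pairs.

    Each transaction is the set of mutations observed in one strain.
    Thresholds are illustrative assumptions, not values from the study.
    """
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        # Count every unordered pair of mutations seen together in a strain.
        pair_counts.update(combinations(items, 2))

    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of strains carrying both mutations
        if support < min_support:
            continue
        # Evaluate the rule in both directions: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]  # estimate of P(y | x)
            if confidence >= min_confidence:
                rules.append((x, y, support, confidence))
    return sorted(rules)


# Toy input with placeholder labels (not real mutations).
strains = [
    {"HA:m1", "NA:m2"},
    {"HA:m1", "NA:m2", "HA:m3"},
    {"HA:m1", "NA:m2"},
    {"HA:m3"},
]
rules = pair_rules(strains)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy input, the HA and NA placeholder mutations appear together in three of four strains, so the rule between them passes both thresholds in either direction, whereas pairs seen together only once are filtered out by the support threshold.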
Our study offers an alternative to the existing mutual-information and phylogenetic methods for identifying co-occurring mutations, and enables faster processing of large datasets. Accurately characterizing mutation patterns across multiple functional proteins is critical for understanding how viruses evolve and for monitoring viral changes in preparation for future outbreaks.