Human T-cell leukemia virus 1 (HTLV-1) is a retrovirus that is endemic in Central Australia, with a prevalence of ~40% reported in some remote First Nations Communities. HTLV-1c is the divergent molecular variant of HTLV-1 found in Australia. Upon infection, HTLV-1 integrates its reverse-transcribed 9 kb proviral DNA genome into the host cell nuclear DNA. The integrated provirus and host genome become enmeshed and exert an ongoing reciprocal influence on one another, impacting both host function and viral fitness. The integrated provirus thus has the potential to archive host-virus interactions.
HTLV-1 integration into the human genome has profound consequences for its target cell. However, the consequences of integration for HTLV-1 fitness are less well understood. We have used the Oxford Nanopore Technologies (ONT) long-read sequencing platform to perform the first in-depth, nucleotide-resolution characterisation of the continuous HTLV-1c genome, as well as integration site selection and genome structure in a mouse model of HTLV-1c-associated disease.
ONT long-read sequencing has enabled an unbiased analysis of integrated proviral DNA for 249 full-length integrants of HTLV-1c and the globally distributed HTLV-1a in their native genetic and epigenetic context. The HTLV-1 landscape in the humanised NSG mouse is dominated by defective proviruses, including proviruses with deletions of sequences encoding structural retroviral proteins, and chimeric HTLV-1c-human proviral sequences. We have identified, to our knowledge, the first HTLV-1c-human chimeric proviral genomes, which could encode novel protein products, render the proviral genome defective, and impact both host and viral gene expression.