It has been over 30 years since the emergence of HIV/AIDS, yet the disease continues to kill over one million people worldwide per year [UNAIDS report]. One reason this epidemic has been so difficult to control is that HIV evolves quickly: it has a short replication time and a high mutation rate, so viruses harboring new mutations that confer drug resistance arise often and spread quickly.
However, the likelihood of one of these beneficial mutations popping up and subsequently “sweeping” through the viral population—i.e., becoming more common because of the survival advantage—also depends on the underlying population genetics, much of which is still poorly understood. In a paper just published in PLoS Genetics, Pleuni Pennings, postdoc in the Petrov lab, and colleagues Sergey Kryazhimskiy and John Wakeley from Harvard tracked the genetic diversity in adapting populations of HIV to better understand how and when new mutations arise.
Mutations and populations
Mutations are usually caused either by DNA damage (e.g., from environmental factors like UV radiation) or by a mistake during DNA replication. Because HIV is a retrovirus, meaning it must copy its RNA genome into DNA before it can be reproduced in the host cell, it is especially prone to errors during replication. The rate at which these errors occur, also called the mutation rate, is constant on a per-virus basis; for example, a specific mutation might happen in one virus in a million. As a consequence, the overall number of viruses in the population determines how many new mutations will be present, with a larger population harboring more mutations at any given time.
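As a rough illustration of the arithmetic in this paragraph, the expected number of viruses that pick up one specific mutation is simply the per-virus mutation rate multiplied by the population size. The sketch below uses the one-in-a-million rate from the example above; the population sizes and the function name are illustrative assumptions, not measurements.

```python
def expected_new_mutants(mutation_rate: float, population_size: int) -> float:
    """Expected number of viruses that newly acquire one specific mutation
    in a single round of replication (rate per virus x number of viruses)."""
    return mutation_rate * population_size


# One-in-a-million rate from the text; the population sizes are hypothetical.
RATE = 1e-6
for size in (10_000, 1_000_000, 100_000_000):
    mutants = expected_new_mutants(RATE, size)
    print(f"population {size:>11,}: ~{mutants:g} new mutants per generation")
```

With the same per-virus rate, the larger hypothetical populations produce proportionally more copies of the mutation in each generation.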
Whether these mutations will survive, however, is related to what population geneticists call the “effective population size” (also known as Ne), which takes into account genetic diversity. Due to a combination of factors, including the purely random destruction of some viruses, not all mutations will be preserved in the population, regardless of how beneficial they are. Ne is a purely theoretical measure that can tell us how easily and quickly a new mutation can spread throughout a population. Because it accounts for factors that reduce diversity, it is usually smaller than the actual (or “census”) population size.
Pennings and colleagues wanted to determine the Ne for HIV in a typical patient undergoing drug treatment. This is a contentious area: previous researchers examining this question using different methods, including simply summing up overall mutation numbers, came up with estimates of Ne ranging from one thousand to one million (in contrast, the actual number of virus-producing cells in the body is closer to one hundred million, but more on that later). To get a more exact estimate, Pennings took a new approach. Using previously published DNA sequences of HIV sampled from patients over the course of a drug treatment regimen, she looked at the actual dynamics of the development of drug-resistant virus populations over time.
Specifically, Pennings focused on selective sweeps, wherein an advantageous mutation appears and then rises in frequency in the population. Features of these sweeps can give estimates of Ne because they reveal information about the diversity present in the initial population. Pennings sought to distinguish between “hard” and “soft” selective sweeps occurring as the viruses became drug resistant. A hard sweep occurs when a mutation appears in one virus and then rises in frequency, whereas a soft sweep happens when multiple viruses independently gain different mutations, which again rise in frequency over time (see Figure 1). These two types of sweeps have distinct fingerprints, and their relative frequencies depend on the underlying effective population size: soft sweeps are more likely when a population is larger, because it then becomes more likely for different beneficial mutations to arise independently in two different viruses. Soft sweeps also leave more diversity in the adapted population compared to hard sweeps (Figure 1).
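To make the link between population size and sweep type concrete, here is a minimal sketch. It relies on a standard population-genetics approximation (the Ewens sampling formula), which is brought in here as an assumption for illustration and is not necessarily the calculation used in the paper. The parameter theta stands for the population-scaled supply of beneficial mutations, which grows in proportion to Ne times the mutation rate; the exact scaling constant depends on the model convention, and the theta values below are hypothetical.

```python
def prob_soft_sweep(theta: float, sample_size: int) -> float:
    """Approximate probability that a sample taken after a sweep contains
    resistance mutations of more than one independent origin (a soft sweep).

    Under the Ewens sampling formula, the chance that all sampled sequences
    trace back to a single origin is the product of i / (i + theta)
    for i = 1 .. sample_size - 1.
    """
    if theta < 0 or sample_size < 1:
        raise ValueError("theta must be >= 0 and sample_size >= 1")
    prob_single_origin = 1.0
    for i in range(1, sample_size):
        prob_single_origin *= i / (i + theta)
    return 1.0 - prob_single_origin


# Hypothetical theta values: a larger theta (larger Ne) favors soft sweeps.
for theta in (0.01, 0.1, 1.0, 10.0):
    p_soft = prob_soft_sweep(theta, sample_size=10)
    print(f"theta={theta:>5}: P(soft) ~ {p_soft:.3f}")
```

Under this approximation, hard sweeps dominate when theta is well below 1 and soft sweeps dominate when it is well above 1, which is why the observed mix of the two carries information about Ne.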
To tell these types of sweeps apart, Pennings took advantage of a specific amino acid change in the HIV gene that encodes reverse transcriptase (RT). This change can result from two different nucleotide changes, either one of which will change the amino acid from lysine to asparagine and confer resistance to drugs that target the RT protein. Pennings used this handy feature to identify hard and soft sweeps: if she observed both mutations in the same drug-resistant population, then the sweep was soft. If only one mutation was observed, the sweep could be soft or hard, so she also factored in diversity levels to tell these apart. Pennings found evidence of both hard and soft sweeps in her study populations. Based on the frequencies of each, she estimated the Ne of HIV in the patients. Her estimate was 150,000, which is higher than some previous estimates but still lower than the actual number of virus-infected cells in the body. Pennings suggests that this discrepancy could be due to the background effects of other mutations in the viruses that gain the drug-resistance mutation—that is, even if a virus gets the valuable resistance mutation, it might still end up disappearing from the population because it happened to harbor some other damaging mutation as well. This would reduce the effective population size as measured by selective sweeps.
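The classification rule described above (both nucleotide changes observed means a soft sweep, only one means the call is ambiguous) can be sketched in a few lines. The codons are an assumption: the article does not name them, but in the standard genetic code AAA encodes lysine and both AAC and AAT encode asparagine. The function name and the sample data are hypothetical, and the diversity-based step used to resolve ambiguous cases is not modeled here.

```python
# Assumed codons: in the standard genetic code AAA encodes lysine, and both
# AAC and AAT encode asparagine. The article itself does not name the codons.
RESISTANT_CODONS = {"AAC", "AAT"}


def classify_sweep(codons: list[str]) -> str:
    """Classify one patient sample from the codons observed at the resistance site."""
    seen = {codon.upper() for codon in codons} & RESISTANT_CODONS
    if len(seen) == 2:
        # Both nucleotide changes are present, so resistance arose more than once.
        return "soft"
    if len(seen) == 1:
        # Only one change is present; diversity data would be needed to decide.
        return "hard or soft"
    return "no resistance observed"


# Hypothetical samples, one list of codons per patient.
print(classify_sweep(["AAC", "AAT", "AAC"]))
print(classify_sweep(["AAT", "AAT", "AAT"]))
print(classify_sweep(["AAA", "AAA"]))
```

A sample showing a single resistant codon stays ambiguous on purpose, since the same nucleotide change can also arise independently in two viruses.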
Implications and future work
Pennings’ findings have several implications. The first is that HIV populations have a limited supply of resistance mutations, as evidenced by the presence of hard sweeps (which, remember, occur when a sweep starts from a single mutation). This means that even small reductions in Ne, such as those produced by combination drug therapies, could have a big impact on preventing drug resistance. The second relates to the fact that, as described above, the likelihood that a mutation will sweep the population may be affected by background mutations in the virus in which it appears. This finding suggests that mutagenic drugs, given in combination with standard antiretrovirals, could be particularly useful for reducing drug resistance. Now, Pennings is using larger datasets to determine whether some types of drugs lead to fewer soft sweeps (presumably because they reduce Ne). She is also trying to understand why drug resistance in HIV evolves in a stepwise fashion (one mutation at a time), even if three drugs are used in combination.