Efficient Modeling and Inference in Population Genetics
DNA sequences from a sample of present-day individuals are a record of the evolutionary history of the population. The availability of molecular sequence data from organisms living today and from ancient DNA samples has enabled the reconstruction of past population size trajectories for human populations over the past 150,000 years, for the 2014 Ebola virus epidemic in Sierra Leone, and for the Hepatitis C virus epidemic in Egypt. However, current statistical methods for reconstructing population size history are hampered by the computational complexity of simulating and analyzing the genealogies of large samples. This project explores the use of theoretically grounded, optimized models of ancestral relationships that exploit the most informative aspects of a large dataset of molecular sequences. The statistical methods developed will be implemented in publicly available software, ensuring fast dissemination of our methodology among practitioners.