Whole-genome technologies, such as high-density genotyping arrays and next-generation sequencing (NGS), can identify sequence variation, particularly single nucleotide polymorphisms (SNPs), in a given individual or species. Current methods, however, are unable to determine which of those sequence variants occur together on the same DNA molecule. The assignment of sequence variants to the same DNA molecule is called "phase," and the specific combination of sequence variants on a single DNA molecule is called a "haplotype." Phase matters because human individuals are diploid: each somatic cell contains two sets of autosomes, one inherited from each parent, so a given variant may lie on either the maternal or the paternal copy. Characterizing the haplotype status of a given individual is important for mapping disease genes, elucidating population histories, and studying the balance of cis- and trans-acting variants in phenotypic expression.
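As a minimal illustration of why phase cannot be recovered from genotype calls alone, the Python sketch below enumerates the haplotype pairs consistent with an unphased diploid genotype. The function name and the input encoding (one tuple of two alleles per site) are assumptions made for this example only and are not part of any method described here.

```python
from itertools import product


def consistent_haplotype_pairs(genotypes):
    """List every haplotype pair compatible with an unphased genotype.

    genotypes: one (allele, allele) tuple per site, e.g. [("A", "G"), ("C", "T")].
    This encoding is hypothetical and chosen only for illustration.
    """
    pairs = set()
    # At each site, choose which of the two alleles goes on the first haplotype;
    # the other allele necessarily goes on the second haplotype.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(site[c] for site, c in zip(genotypes, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(genotypes, choice))
        # The two haplotypes are unordered, so store each pair in sorted form.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Two heterozygous SNPs: the genotype alone cannot distinguish these two phasings.
print(consistent_haplotype_pairs([("A", "G"), ("C", "T")]))
```

For two heterozygous sites the sketch returns two candidate pairs, AC/GT and AT/GC. In general, k heterozygous sites admit 2^(k-1) candidate pairs, which is why additional information (population data, parental genotypes, or long single-molecule templates) is needed to resolve phase.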
There are three general approaches to determining haplotype information: population inference, parental inference, and molecular haplotyping. The most common approach to phasing haplotypes is statistical inference from population or parental genotype data. These computational approaches, along with molecular haplotyping, require a high level of technical expertise and the creation of large numbers of individual template libraries (on the order of hundreds) to phase the haplotypes of a given biological sample.
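To make the idea of parental inference concrete, the following Python sketch phases a child's genotype site by site using simple Mendelian rules. It is a deliberately simplified illustration: the function names and the tuple-based genotype encoding are hypothetical, and practical tools rely on statistical models rather than this rule-based logic.

```python
def phase_site_by_trio(child, mother, father):
    """Return (maternal_allele, paternal_allele) for one site, or None.

    Each argument is a tuple of two alleles, e.g. ("A", "G").
    None is returned when the site is ambiguous (both assignments fit)
    or when the genotypes are not Mendelian-consistent.
    """
    a, b = child
    options = set()
    # Try both ways of assigning the child's alleles to the two parents.
    for maternal, paternal in ((a, b), (b, a)):
        if maternal in mother and paternal in father:
            options.add((maternal, paternal))
    return options.pop() if len(options) == 1 else None


def phase_by_trio(child_sites, mother_sites, father_sites):
    """Build maternal and paternal haplotype strings; '?' marks unresolved sites."""
    maternal_hap, paternal_hap = [], []
    for child, mother, father in zip(child_sites, mother_sites, father_sites):
        resolved = phase_site_by_trio(child, mother, father)
        maternal_hap.append(resolved[0] if resolved else "?")
        paternal_hap.append(resolved[1] if resolved else "?")
    return "".join(maternal_hap), "".join(paternal_hap)


child = [("A", "G"), ("C", "T"), ("G", "T")]
mother = [("A", "A"), ("C", "T"), ("G", "T")]
father = [("G", "G"), ("T", "T"), ("G", "T")]
print(phase_by_trio(child, mother, father))
```

The third site in this example stays unresolved because the child and both parents are heterozygous for the same alleles. This is one reason inference-based phasing can leave gaps that molecular approaches aim to fill.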
Current methods are also hampered by small fragment sizes, which limit the ability to assemble the human genome de novo. Most current technologies require short, fragmented templates and must therefore rely on "mapping" sequence reads to a reference genome. While alignment-based experiments can capture a significant fraction of SNPs, large templates on the order of 10 kb to 100 kb are needed to resolve a large portion of structural variants and/or to provide the phase of haplotypes across the human genome.
The RedVault lab team has progressed through several key phases of feasibility experiments on a proprietary front-end technology designed to resolve some of the issues described above.