Interdisciplinary Initiatives Program Round 7 - 2014

Gill Bejerano, Developmental Biology
Bill Dally, Electrical Engineering, Computer Science

Two genomic loci are called homologous when they share a common evolutionary origin. Like species taxonomy, sequence homology is a fundamental tenet of molecular biology. Homologous loci often perform similar functions, and thus much can be learned by observing and probing what they have in common, as well as how they differ. Sequence homology is discovered via sequence search. When two sufficiently long DNA substrings are much more similar to each other than comparable unrelated sequences are, they are declared homologous. Homology detection is fundamental to many important questions in biomedicine, developmental and evolutionary developmental biology, and genome evolution. The vast majority of homology sequence searches are currently performed using low-sensitivity heuristic algorithms, especially in the important context of gene regulation. This choice of algorithm was made over twenty years ago, when more powerful search algorithms were (correctly) deemed far too slow to run on the computer architectures of the time.

Exciting developments in computer architecture, particularly in field-programmable gate arrays (FPGAs), now allow the acceleration of sequence searches a thousandfold. FPGA technology lets us, for the first time, unleash much more powerful homology search algorithms genome-wide. A key reason this has not yet happened is that FPGA programming is complex. It requires hardware expertise rarely found in the field of computational biology. Our proposal teams up a computer hardware lab with a computational/experimental genomics lab to bridge this gap. Our goals are to build a flexible FPGA-based sequence search platform that will (through user-friendly interfaces) appeal to both computational and experimental biologists. Moreover, we will perform a first high-impact biological study using this new platform, making sure the impact of our work extends into both computer architecture and biomedicine. We expect our proposal to yield valuable new insights into the human genome, and to demonstrate a collaboration that liberates end-user communities to wield powerful computer architectures. Our work will also provide a timely reminder to both communities of the great bounties that lie at the intersection of computer architecture and the exploding field of genomics.