Publications

During my Ph.D., I conducted research into the combinatorics of genome rearrangements.

As a teaching-track faculty member, I still seek opportunities to publish surrounding computational biology education, whether this is a bestselling textbook in bioinformatics algorithms or articles about experiences in education.

Filter by type:

Sort by year:

Bacterial Computing: Using E. coli to Solve the Burnt Pancake Problem

Journal paper

Laurie J Heyer, Jeffrey L Poet, Marian L Broderick, P Compeau, James O Dickson, W Lance Harden

Math Horizons, Volume 17, Issue 3, Pages 5-10

Publication year: 2010

Genome Reconstruction: A Puzzle with a Billion Pieces

Book Chapter

P Compeau, P Pevzner

Bioinformatics for Biologists

Publication year: 2010

While we can read a book one letter at a time, biologists still lack the ability to read a DNA sequence one nucleotide at a time. Instead, they can identify short fragments (approximately 100 nucleotides long) called reads; however, they do not know where these reads are located within the genome. Thus, assembling a genome from reads is like putting together a giant puzzle with a billion pieces, a formidable mathematical problem. We introduce some of the fascinating history underlying both the mathematical and the biological sides of DNA sequencing.

Girth of pancake graphs

Journal paper

P Compeau

Discrete applied mathematics, volume 159, issue 15, pages 1641-1645

Publication year: 2011

We consider four families of pancake graphs, which are Cayley graphs, whose vertex sets are either the symmetric group on n objects or the hyperoctahedral group on n objects and whose generating sets are either all reversals or all reversals inverting the first k elements (called prefix reversals). We find that the girth of each family of pancake graphs remains constant after some small threshold value of n.

A simplified view of DCJ-Indel distance

Conference paper

P Compeau

International Workshop on Algorithms in Bioinformatics, pages 365-377

Publication year: 2012

The introduction of the double cut and join (DCJ) operation and the derivation of its associated distance caused a flurry of research into the study of multichromosomal rearrangements. However, little of this work has incorporated indels (i.e., insertions and deletions) into the calculation of genomic distance functions, with a particular exception of Braga et al., who provided a linear time algorithm ([1]) for computing the DCJ-indel distance. Although this algorithm only takes linear time, its derivation is lengthy and depends on a large number of possible cases. In this paper, we provide a simplified indel model that solves the problem of DCJ-indel sorting in linear time directly from the classical breakpoint graph, an approach that allows us to describe the solution space of DCJ-indel sorting, thus resolving an existing open problem.

DCJ-Indel Sorting Revisited

Journal paper

P Compeau

Algorithms for Molecular Biology, Volume 8, Issue 1

Publication year: 2013

Background

The introduction of the double cut and join operation (DCJ) caused a flurry of research into the study of multichromosomal rearrangements. However, little of this work has incorporated indels (i.e., insertions and deletions of chromosomes and chromosomal intervals) into the calculation of genomic distance functions, with the exception of Braga et al., who provided a linear time algorithm for the problem of DCJ-indel sorting. Although their algorithm only takes linear time, its derivation is lengthy and depends on a large number of possible cases.

Results

We note the simple idea that a deletion of a chromosomal interval can be viewed as a DCJ that creates a new circular chromosome. This framework will allow us to amortize indels as DCJs, which in turn permits the application of the classical breakpoint graph to obtain a simplified indel model that still solves the problem of DCJ-indel sorting in linear time via a more concise formulation that relies on the simpler problem of DCJ sorting. Furthermore, we can extend this result to fully characterize the solution space of DCJ-indel sorting.

Conclusions

Encoding indels as DCJ operations offers a new insight into why the problem of DCJ-indel sorting is not ultimately any more difficult than that of sorting by DCJs alone. There is still room for research in this area, most notably the problem of sorting when the cost of indels is allowed to vary with respect to the cost of a DCJ and we demand a minimum cost transformation of one genome into another.

A generalized cost model for DCJ-indel sorting

Conference paper

P Compeau

International Workshop on Algorithms in Bioinformatics, pages 38-51

Publication year: 2014

The double-cut-and-join operation (DCJ) is a fundamental graph operation that is used to model a variety of genome rearrangements. However, DCJs are only useful when comparing genomes with equal (or nearly equal) gene content. One obvious extension of the DCJ framework supplements DCJs with insertions and deletions of chromosomes and chromosomal intervals, which implies a model in which DCJs receive unit cost, whereas insertions and deletions receive a nonnegative cost of ω. This paper proposes a unified model finding a minimum-cost transformation of one genome (with circular chromosomes) into another genome for any value of ω. In the process, it resolves the open case ω > 1.

Life After MOOCs

Journal paper

Communications of the ACM, Volume 58, Issue 10, pages 41-44

Publication year: 2015

Online science education needs a new revolution.

Establishing a computational biology flipped classroom

Journal paper

P Compeau

PLoS computational biology 15 (5)

Publication year: 2019

In a flipped classroom, students complete automated modules to replace a traditional lecture, allowing the time devoted for the lecture to be devoted to constructive tasks reinforcing student knowledge. Yet although the flipped classroom offers a compelling approach for fostering a constructivist, student-centric learning environment, research on the efficacy of flipped classes has been mixed. For that matter, is it possible to successfully flip a classroom in an advanced, heavily specialized course like a bioinformatics algorithms course? Over the past several years, the author has implemented a flipped version of such a course and will discuss some of the successes and pitfalls encountered.

Bioinformatics Algorithms: An Active Learning Approach (3rd Edition)

Textbook

P Compeau, P Pevzner

Publication year: 2018

This textbook has been adopted by over 100 institutions around the world. An interactive version powers the authors’ popular online courses on Coursera.

How to apply de Bruijn graphs to genome assembly

Journal paper

P Compeau, PA Pevzner, GP Tesler

Nature biotechnology

Publication year: 2011

Much of the fundamental work on the problem of assembling genomes from short reads is based upon a structure called a de Bruijn graph and finding an Eulerian path in this graph. In this review paper, we describe this approach and detail how it has been applied to genome assembly algorithms.

Phillip Compeau, Ph.D.

Teaching Professor
Asst Dean for Innovation in Computing Education

Computational Biology Department

Carnegie Mellon University

Publications

Filter by type:

Bacterial Computing: Using E. coli to Solve the Burnt Pancake Problem

Genome Reconstruction: A Puzzle with a Billion Pieces

Girth of pancake graphs

A simplified view of DCJ-Indel distance

DCJ-Indel Sorting Revisited

Background

Results

Conclusions

A generalized cost model for DCJ-indel sorting

Life After MOOCs

Establishing a computational biology flipped classroom

Bioinformatics Algorithms: An Active Learning Approach (3rd Edition)

How to apply de Bruijn graphs to genome assembly

Phillip Compeau, Ph.D.

Teaching ProfessorAsst Dean for Innovation in Computing EducationComputational Biology DepartmentCarnegie Mellon University

Publications

Filter by type:

Bacterial Computing: Using E. coli to Solve the Burnt Pancake Problem

Genome Reconstruction: A Puzzle with a Billion Pieces

Girth of pancake graphs

A simplified view of DCJ-Indel distance

DCJ-Indel Sorting Revisited

Background

Results

Conclusions

A generalized cost model for DCJ-indel sorting

Life After MOOCs

Establishing a computational biology flipped classroom

Bioinformatics Algorithms: An Active Learning Approach (3rd Edition)

How to apply de Bruijn graphs to genome assembly

Teaching Professor
Asst Dean for Innovation in Computing Education

Computational Biology Department

Carnegie Mellon University