I am a VP of Data Engineering at insitro, where we develop data pipelines and infrastructure for machine learning and drug discovery. We are particularly interested in how to express complex scientific pipelines for heterogeneous data types (images, genomics, chemical informatics, etc.), while easily tracking and sharing data provenance in a mixed research and production environment.
Previously, I was a VP of Software Engineering at Myriad Genetics, working with engineering teams spanning LIMS, variant interpretation and reporting, medical billing, SRE, and technical program management.
From 2013 to 2018, I was an engineer and later a Director of Software Engineering at Counsyl. Our team focused on various computational challenges in genomics: variant interpretation, disease risk calculations, automated genomic data pipelines, and applications of machine learning.
In 2010, I finished my Ph.D. in Computer Science at MIT. My Ph.D. advisor was Manolis Kellis and I was a part of the Compbio lab. During my Ph.D., I worked on comparative genomics and phylogenomic algorithms for accurately reconstructing gene trees.
A comparative encyclopedia of DNA elements in the mouse genome.
Yue, Cheng, Breschi, Vierstra, Wu, Ryba, Sandstrom, Ma, Davis, Pope, Shen, Djebali, Thurman, Kaul, Rynes, Kirilusha, Marinov, Williams, Trout, Amrhein, Fisher-Aylor, Antoshechkin, DeSalvo, See, Fastuca, Drenkow, Zaleski, Dobin, Prieto, Lagarde, Bussotti, Tanzer, Denas, Li, Bender, Zhang, Byron, Groudine, McCleary, Pham, Ye, Kuan, Edsall, Wu, Rasmussen, Bansal, Kellis, Keller, Morrissey, Mishra, Jain, Harris, Cayting, Kawli, Boyle, Euskirchen, Kundaje, Lin, Lin, Jansen, Malladi, Erickson, Kirkup, Learned, Sloan, Rosenbloom, Sousa, Beal, Pignatelli, Flicek, Lian, Kahveci, Lee, Kent, Santos, Herrero, Notredame, Johnson, Vong, Lee, Bates, Diegel, Canfield, Sabo, Wilken, Reh, Giste, Shafer, Kutyavin, Haugen, Dunn, Neph, Humbert, Hansen, Bruijn, Selleri, Rudensky, Josefowicz, Samstein, Eichler, Orkin, Levasseur, Papayannopoulou, Chang, Skoultchi, Gosh, Disteche, Treuting, Wang, Weiss, Blobel, Cao, Zhong, Wang, Good, Lowdon, Adams, Zhou, Pazin, Wold, Taylor, Mortazavi, Weissman, Stamatoyannopoulos, Snyder, Guigo, Gingeras, Gilbert, Hardison, Beer, Ren, and Mouse ENCODE Consortium.
Phylogenetic Identification and Functional Characterization of Orthologs and Paralogs across Human, Mouse, Fly, and Worm.
Wu, Bansal, Rasmussen, Herrero, Kellis.
Genome-wide inference of ancestral recombination graphs.
Rasmussen, Hubisz, Gronau, Siepel.
PLoS Genetics. 2014. [arXiv] [github] [data] [blog post]
Most parsimonious reconciliation in the presence of gene duplication, loss, and deep coalescence using labeled coalescent trees.
Wu, Rasmussen, Bansal, Kellis.
Genome Research. 2013. [website]
TreeFix: statistically informed gene tree error correction using species trees.
Wu, Rasmussen, Bansal, Kellis.
Systematic Biology. 2012. [website]
Unified modeling of gene duplication, loss, and coalescence using a locus tree.
Genome Research. 2012. [website]
Replacing and additive horizontal gene transfer in Streptococcus.
Choi, Rasmussen, Hubisz, Gronau, Stanhope, Siepel.
Molecular Biology and Evolution. 2012. [website]
Roles of major facilitator superfamily transporters in phosphate response
Bergwitz, Rasmussen, DeRobertis, Wee, Sinha, Chen, Huang, Perrimon.
PLoS One. 2012.
Evolution at the Sub-gene Level: Domain Rearrangements in the Drosophila Phylogeny.
Wu, Rasmussen, Kellis.
Molecular Biology and Evolution. 2011. [website]
A high-resolution map of human evolutionary constraint using 29 mammals.
Lindblad-Toh, Garber, Zuk, Lin, Parker, Washietl, Kheradpour, Ernst, Jordan, Mauceli, Ward, Lowe, Holloway, Clamp, Gnerre, Alfoldi, Beal, Chang, Clawson, Cuff, Di Palma, Fitzgerald, Flicek, Guttman, Hubisz, Jaffe, Jungreis, Kent, Kostka, Lara, Martins, Massingham, Moltke, Raney, Rasmussen, Robinson, Stark, Vilella, Wen, Xie, Zody, Broad Institute Sequencing Platform and Whole Genome Assembly Team, Worley, Kovar, Muzny, Gibbs, Baylor College of Medicine Human Genome Sequencing Center, Warren, Mardis, Weinstock, Wilson, Genome Institute at Washington University, Birney, Margulies, Herrero, Green, Haussler, Siepel, Goldman, Pollard, Pedersen, Lander, Kellis.
A Bayesian approach for fast and accurate gene tree reconstruction.
Molecular Biology and Evolution. 2010. [website]
A phylogenomic approach to the evolutionary dynamics of gene duplication in birds.
Organ*, Rasmussen*, Baldwin, Kellis, and Edwards.
In Evolution After Gene Duplication. (Eds.) K. Dittmar and D. Liberles. Wiley & Sons. 2010.
Evolution of pathogenicity and sexual reproduction in eight Candida genomes.
Butler, Rasmussen, Lin, Santos, Sakthikumar, Munro, Rheinbay, Grabherr, Forche, Reedy, Agrafioti, Arnaud, Bates, Brown, Brunke, Costanzo, Fitzpatrick, de Groot, Harris, Hoyer, Hube, Klis, Kodira, Lennard, Logue, Martin, Neiman, Nikolaou, Quail, Quinn, Santos, Schmitzberger, Sherlock, Shah, Silverstein, Skrzypek, Soll, Staggs, Stansfield, Stumpf, Sudbery, Srikantha, Zeng, Berman, Berriman, Heitman, Gow, Lorenz, Birren, Kellis, and Cuomo.
Nature. 2009. [website]
Performance and scalability of discriminative metrics for comparative gene identification in 12 Drosophila genomes.
Lin, Deoras, Rasmussen, Kellis.
PLoS Computational Biology. 2008.
Accurate gene-tree reconstruction by learning gene- and species-specific substitution rates across multiple complete genomes.
Genome Research. 2007. [website]
Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures.
Stark*, Lin*, Kheradpour*, Pedersen*, Parts, Carlson, Crosby, Rasmussen, Roy, Deoras, Ruby, Brennecke, FlyBase curators, Berkeley Drosophila Genome Project, Hodges, Hinrichs, Caspi, Paten, Park, Han, Maeder, Polansky, Robson, Aerts, van Helden, Hassan, Gilbert, Eastman, Rice, Weir, Hahn, Park, Dewey, Pachter, Kent, Haussler, Lai, Bartel, Hannon, Kaufman, Eisen, Clark, Smith, Celniker, Gelbart, Kellis.
Nature. 2007. [website]
Evolution of genes and genomes on the Drosophila phylogeny.
Drosophila 12 Genomes Consortium.
I maintain several open source software projects related (some more than others) to my research. My development interests are in phylogenetic software (DLCoal, SPIMAP, SPIDIR), scientific visualization (SUMMON), and note-taking for research settings (KeepNote).