

Associate Professor of Computer Science
Associate Professor of Statistics
Courtesy Appointment
LWSN 1207
Phone: 765-496-9370

I am interested in machine learning, computational biology, and, more broadly, computational science and engineering. My research draws on concepts and methods from multiple disciplines, including Bayesian statistics, algorithm design, genomics, molecular biology, nonlinear optimization, and information theory. Some of my research projects are described below.

Approximate Bayesian inference Bayesian inference has become increasingly important in statistical machine learning. It has been successfully applied in a number of domains, such as computational biology and computer vision. Exact Bayesian calculations, however, are often computationally infeasible. A focus of my research has been designing efficient, principled approximate inference methods, especially for large-scale problems. (Joint work with T.P. Minka and T.S. Jaakkola)

Computational biology We are interested in identifying genes associated with developmental processes, deciphering regulatory programs at both the transcriptional and post-transcriptional levels, and understanding the conservation and reconfiguration of biological networks in development and evolution. As more high-throughput biological data becomes available, it becomes possible to study these problems systematically, which can yield specific biological knowledge and help us discover general principles governing biological systems. Answers to these problems can lead to important biomedical applications. (Joint work with D.K. Gifford, T.S. Jaakkola, R. Young, and Ge's labs)

Bayesian conditional random fields Many data sources, such as web pages, images, genomic sequences, and proteins, contain structural relationships among their elements. The task of analyzing these data sources can be effectively formalized as joint classification of multiple elements (e.g., web pages, pixels, or nucleotides). Joint classification enables modeling of dependence between elements, allowing structure and context to be taken into account. Conditional random fields (CRFs) provide a compelling model for joint classification of structured data. We developed effective Bayesian approaches for training, inference, and feature selection with CRFs. (Joint work with T.P. Minka and M. Szummer)

Nonparametric Bayesian models When we use a flexible model to represent a complicated system, an important question is how to set the model complexity/size given observed data. Nonparametric models can automatically infer model complexity from the data, without explicitly performing Bayesian model selection. We are working on extending classical nonparametric Bayesian approaches to cluster interdependent variables.

Semi-supervised learning Semi-supervised learning uses both labeled and unlabeled data to improve learning. For many applications, e.g., bioinformatics, where labeled data is scarce and costly to obtain, semi-supervised learning provides a valuable tool for obtaining new knowledge. We have developed algorithms to learn hyperparameters for semi-supervised classification. (Joint work with A. Kapoor and R.W. Picard)

Feature selection In many real-world classification and regression problems, the input consists of a large number of features or variables, only some of which are relevant. Inferring which inputs are relevant is an important problem. We have developed novel Bayesian approaches to determining the relevance of input features. (Joint work with T.P. Minka, R.W. Picard, and Z. Ghahramani)

Applications of machine learning in computer vision, neuroscience, communications and signal processing We have developed probabilistic graphical models and machine learning methods for a variety of applications, including hand-written diagram parsing, human cortical surface modeling, wireless signal detection and channel estimation, and spectrum estimation for unevenly sampled signals. (Joint work with researchers at MIT, Microsoft Research, and Massachusetts General Hospital)


PhD, MIT Media Lab (2005)

Professional Faculty Research

Machine learning, computational biology, and Bayesian statistics

