Current:
Advances in machine learning over the past decade, particularly hardware developments enabling the effective implementation of deep neural networks, have driven recent state-of-the-art results on benchmark tasks in computer vision and speech recognition. Combined with the availability of increasingly large datasets, these technical advances have facilitated the successful application of deep neural networks to diverse problems, including identifying exotic particles in high-energy physics experiments, predicting the chemical properties of small molecules, and predicting protein structure from amino acid sequence. My current work focuses on understanding deep neural networks and on applying machine learning techniques to tasks in the natural sciences.
Publications:
ClusterCAD: a computational platform for type I modular polyketide synthase design.
C.H. Eng, T.W.H. Backman, C.B. Bailey, C. Magnan, H.G. Martin, L. Katz, P. Baldi, and J.D. Keasling.
Nucleic Acids Research, 2017.
[article]
[abstract]
ClusterCAD is a web-based toolkit designed to leverage the collinear structure and deterministic logic of type I modular polyketide synthases (PKSs) for synthetic biology applications. The unique organization of these megasynthases, combined with the diversity of their catalytic domain building blocks, has fueled an interest in harnessing the biosynthetic potential of PKSs for the microbial production of both novel natural product analogs and industrially relevant small molecules. However, a limited theoretical understanding of the determinants of PKS fold and function poses a substantial barrier to the design of active variants, and identifying strategies to reliably construct functional PKS chimeras remains an active area of research. In this work, we formalize a paradigm for the design of PKS chimeras and introduce ClusterCAD as a computational platform to streamline and simplify the process of designing experiments to test strategies for engineering PKS variants. ClusterCAD provides chemical structures with stereochemistry for the intermediates generated by each PKS module, as well as sequence- and structure-based search tools that allow users to identify modules based either on amino acid sequence or on the chemical structure of the cognate polyketide intermediate. ClusterCAD can be accessed at https://clustercad.jbei.org and at http://clustercad.igb.uci.edu.
Jet substructure classification in high-energy physics with deep neural networks.
P. Baldi, K. Bauer, C. Eng, P. Sadowski, and D. Whiteson.
Physical Review D, 2016.
[article]
[arxiv]
[abstract]
At the extreme energies of the Large Hadron Collider, massive particles can be produced at such high velocities that their hadronic decays are collimated and the resulting jets overlap. Deducing whether the substructure of an observed jet is due to a low-mass single particle or due to multiple decay objects of a massive particle is an important problem in the analysis of collider data. Traditional approaches have relied on expert features designed to detect energy deposition patterns in the calorimeter, but the complexity of the data makes this task an excellent candidate for the application of machine learning tools. The data collected by the detector can be treated as a two-dimensional image, lending itself to the natural application of image classification techniques. In this work, we apply deep neural networks with a mixture of locally connected and fully connected nodes. Our experiments demonstrate that without the aid of expert features, such networks match or modestly outperform the current state-of-the-art approach for discriminating between jets from single hadronic particles and overlapping jets from pairs of collimated hadronic particles, and that such performance gains persist in the presence of pileup interactions.
Alteration of Polyketide Stereochemistry from anti to syn by a Ketoreductase Domain Exchange in a Type I Modular Polyketide Synthase Subunit.
C.H. Eng, S. Yuzawa, G. Wang, E.E.K. Baidoo, L. Katz, and J.D. Keasling.
Biochemistry, 2016.
[article]
[abstract]
Polyketide natural products have broad applications in medicine. Exploiting the modular nature of polyketide synthases to alter stereospecificity is an attractive strategy for obtaining natural product analogues with altered pharmaceutical properties. We demonstrate that by retaining a dimerization element present in LipPks1+TE, we are able to use a ketoreductase domain exchange to alter α-methyl group stereochemistry with unprecedented retention of activity and simultaneously achieve a novel alteration of polyketide product stereochemistry from anti to syn. The substrate promiscuity of LipPks1+TE further provided a unique opportunity to investigate the substrate dependence of ketoreductase activity in a polyketide synthase module context.
Enzyme analysis of the polyketide synthase leads to the discovery of a novel analog of the antibiotic α-lipomycin.
S. Yuzawa, C.H. Eng, L. Katz, and J.D. Keasling.
Journal of Antibiotics, 2014.
[article]
Broad Substrate Specificity of the Loading Didomain of the Lipomycin Polyketide Synthase.
S. Yuzawa, C.H. Eng, L. Katz, and J.D. Keasling.
Biochemistry, 2013.
[article]
[abstract]
LipPks1, a polyketide synthase subunit of the lipomycin synthase, is believed to catalyze the polyketide chain initiation reaction using isobutyryl-CoA as a substrate, followed by an elongation reaction with methylmalonyl-CoA to start the biosynthesis of antibiotic α-lipomycin in Streptomyces aureofaciens Tü117. Recombinant LipPks1, containing the thioesterase domain from the 6-deoxyerythronolide B synthase, was produced in Escherichia coli, and its substrate specificity was investigated in vitro. Surprisingly, several different acyl-CoAs, including isobutyryl-CoA, were accepted as the starter substrates, while no product was observed with acetyl-CoA. These results demonstrate the broad substrate specificity of LipPks1 and may be applied to producing new antibiotics.