Seminars and Talks

Simpler, Faster, Stronger: Supercharging Contrastive Learning with Novel Mutual Information Bounds
by Chenyang Tao
Date: Friday, Oct. 1
Time: 15:00
Location: Online Call via Zoom

Our guest speaker is Professor Chenyang Tao from Amazon Research (previously at Duke University). You are all cordially invited to the CVG Seminar on Oct. 1 at 3:00 p.m. CEST on Zoom (passcode: 145086).


Contrastive representation learners, such as MoCo and SimCLR, have been tremendously successful in recent years. These successes are empowered by recent advances in contrastive variational mutual information (MI) estimators, in particular InfoNCE and its variants. A major drawback of InfoNCE-type estimators is that they not only depend crucially on costly large-batch training but also sacrifice bound tightness for variance reduction. In this talk, we show how insights from unnormalized statistical modeling and convex optimization can overcome these limitations. We present a novel, simple, and powerful contrastive MI estimation framework based on Fenchel-Legendre Optimization (FLO). The resulting FLO estimator is theoretically tight and provably converges under stochastic gradient descent. We show how an important variant, named FlatNCE, overcomes the notorious log-K curse suffered by InfoNCE and excels at self-supervised learning with just a one-line change of code. We also introduce new tools to monitor and diagnose contrastive training and demonstrate extended applications in fields such as causal machine learning. We will conclude the talk with some ideas for future work.


Chenyang Tao is a senior research associate affiliated with Duke Statistics and Electrical & Computer Engineering, where he leads a small research group of Ph.D. students and postdocs with multi-disciplinary backgrounds working on both fundamental AI research and applications. Tao obtained his Ph.D. in Applied Mathematics from Fudan University (2011-2016) and was a visiting scholar at the University of Warwick (2014-2016) and a visiting scientist at RTI International (2017-2019). He joined Duke in 2017 and has worked enthusiastically to develop novel machine learning algorithms using new theoretical insights, work reflected in a strong publication record at top-tier machine learning conferences (more than 10 papers in ICML, NeurIPS, ICLR, etc.). He has worked in depth on the following three topics: (i) probabilistic inference (e.g., variational Bayes, adversarial learning, optimal transport, energy models); (ii) representation optimization (e.g., contrastive learning, mutual information estimation, fairness, self-supervised learning); (iii) causal machine learning (e.g., counterfactual reasoning, causal representation transfer). The techniques he developed have proven widely useful in applications such as NLP, imbalanced data learning, and time-to-event analysis. Outside work, he likes hiking, camping, fishing, kayaking, and binge-watching sci-fi franchises. ^_^