Seminars and Talks

Simpler, Faster, Stronger: Supercharging Contrastive Learning with Novel Mutual Information Bounds
by Chenyang Tao
Date: Friday, Oct. 1
Time: 15:00
Location: Online Call via Zoom

Our guest speaker is Chenyang Tao from Amazon Research (previously at Duke University), and you are all cordially invited to the CVG Seminar on Oct. 1st at 3:00 p.m. CEST on Zoom (passcode is 145086), where Chenyang will give a talk titled “Simpler, Faster, Stronger: Supercharging Contrastive Learning with Novel Mutual Information Bounds”.

Abstract

Contrastive representation learners, such as MoCo and SimCLR, have been tremendously successful in recent years. These successes are enabled by recent advances in contrastive variational mutual information (MI) estimators, in particular InfoNCE and its variants. A major drawback of InfoNCE-type estimators is that they not only depend crucially on costly large-batch training but also sacrifice bound tightness for variance reduction. In this talk, we show how insights from unnormalized statistical modeling and convex optimization can overcome these limitations. We present a novel, simple, and powerful contrastive MI estimation framework based on Fenchel-Legendre Optimization (FLO). The resulting FLO estimator is theoretically tight and provably converges under stochastic gradient descent. We show how an important variant, named FlatNCE, overcomes the notorious log-K curse suffered by InfoNCE and excels at self-supervised learning, with just a one-line change of code. We also introduce new tools to monitor and diagnose contrastive training and demonstrate extended applications in fields such as causal machine learning. We will conclude the talk with some ideas for future work.
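
For readers less familiar with the estimator the talk builds on, below is a minimal PyTorch sketch of the standard InfoNCE objective (background only, not the speaker's FLO/FlatNCE code); the MI lower bound it yields can never exceed log K for a batch of K pairs, which is the "log-K curse" the talk addresses. All names are illustrative.

# Minimal sketch of the standard InfoNCE objective (background only; this is
# not the FLO/FlatNCE implementation from the talk).
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (K, d) embeddings of K positive pairs; returns the InfoNCE loss.

    The associated MI lower bound is log K minus this loss, so it is capped at
    log K -- the reason InfoNCE-type estimators lean on large batches.
    """
    z1 = F.normalize(z1, dim=-1)
    z2 = F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature      # (K, K) similarity matrix
    labels = torch.arange(z1.size(0))       # positives sit on the diagonal
    return F.cross_entropy(logits, labels)

# Tiny usage example with random embeddings:
if __name__ == "__main__":
    z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
    print(info_nce(z1, z2).item())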

Bio

Chenyang Tao is a senior research associate affiliated with Duke Statistics and Electrical & Computer Engineering, where he leads a small research group of Ph.D. students and postdocs with multi-disciplinary backgrounds working on both fundamental AI research and applications. Tao obtained his Ph.D. in Applied Mathematics from Fudan University (2011-2016) and was a visiting scholar at the University of Warwick (2014-2016) and a visiting scientist at RTI International (2017-2019). He joined Duke in 2017 and has worked enthusiastically to develop novel machine learning algorithms using new theoretical insights, as evidenced by a strong publication record in top-tier machine learning conferences (more than 10 papers in ICML, NeurIPS, ICLR, etc.). He has dived deep into three topics: (i) probabilistic inference (e.g., variational Bayes, adversarial learning, optimal transport, energy models, etc.); (ii) representation optimization (e.g., contrastive learning, mutual information estimation, fairness, self-supervised learning, etc.); (iii) causal machine learning (e.g., counterfactual reasoning, causal representation transfer, etc.). The techniques he developed have proven widely useful for various applications such as NLP, imbalanced data learning, time-to-event analysis, etc. Outside work, he likes hiking, camping, fishing, kayaking, and binge-watching sci-fi franchises. ^_^
 

Light Field Networks
by Vincent Sitzmann
Date: Thursday, Jul. 22
Time: 16:00
Location: Online Call via Zoom

Our guest speaker is Vincent Sitzmann from MIT CSAIL, and you are all cordially invited to the CVG Seminar on July 22nd at 4:00 p.m. CET on Zoom (passcode is 566141), where Vincent will give a talk titled “Light Field Networks”.

Abstract

Given only a single picture, people are capable of inferring a mental representation that encodes rich information about the underlying 3D scene. We acquire this skill not through massive labeled datasets of 3D scenes, but through self-supervised observation and interaction. Building machines that can infer similarly rich neural scene representations is critical if they are to one day parallel people’s ability to understand, navigate, and interact with their surroundings. This poses a unique set of challenges that sets neural scene representations apart from conventional representations of 3D scenes: Rendering and processing operations need to be differentiable, and the type of information they encode is unknown a priori, requiring them to be extraordinarily flexible. At the same time, training them without ground-truth 3D supervision is a highly underdetermined problem, highlighting the need for structure and inductive biases without which models converge to spurious explanations. 
Focusing on 3D structure, a fundamental feature of natural scenes, I will demonstrate how we can equip neural networks with inductive biases that enable them to learn 3D geometry, appearance, and even semantic information, self-supervised only from posed images. I will then discuss our recent work overcoming a key limitation of existing 3D-structured neural scene representations, differentiable ray marching, by directly parameterizing the 360-degree, 4D light field of 3D scenes.
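
To make the contrast with ray marching concrete, here is a minimal, hedged sketch of the core idea: a network that maps an oriented ray (parameterized here with Plücker coordinates, one common choice) directly to a color, so rendering a pixel costs a single forward pass rather than many samples along the ray. The architecture and names are illustrative, not the authors' implementation.

# Minimal sketch of the light-field-network idea (illustrative, not the
# authors' code): map an oriented ray directly to a color in one evaluation.
import torch
import torch.nn as nn

class TinyLightFieldNet(nn.Module):
    def __init__(self, hidden: int = 256):
        super().__init__()
        # Input: 6-D Plücker ray coordinates (direction, origin x direction).
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3),  # RGB
        )

    def forward(self, origins: torch.Tensor, dirs: torch.Tensor) -> torch.Tensor:
        dirs = dirs / dirs.norm(dim=-1, keepdim=True)
        moment = torch.cross(origins, dirs, dim=-1)  # Plücker moment
        rays = torch.cat([dirs, moment], dim=-1)     # (N, 6) ray parameterization
        return self.mlp(rays)                        # one query per ray, no ray marching

# A volumetric renderer (e.g., NeRF) would instead evaluate a network at many
# points along each ray and alpha-composite; here the whole ray is one input.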

Bio

Vincent Sitzmann is a Postdoc with Joshua Tenenbaum, William Freeman, and Fredo Durand at MIT CSAIL, and an incoming Assistant Professor, also at MIT. Previously, he finished his PhD at Stanford University. His primary research interests lie in the self-supervised learning of neural representations of 3D scenes, and their applications in computer graphics, computer vision, and robotics.

Generative Modeling by Estimating Gradients of the Data Distribution
by Yang Song
Date: Thursday, Jun. 24
Time: 17:30
Location: Online Call via Zoom

Our guest speaker is Yang Song from Stanford University, and you are all cordially invited to the CVG Seminar on June 24th at 5:30 p.m. CET on Zoom (passcode is 299064), where Yang will give a talk titled “Generative Modeling by Estimating Gradients of the Data Distribution”.

Abstract

Existing generative methods are typically based on training explicit probability representations with maximum likelihood (e.g., VAEs) or learning implicit sampling procedures with adversarial training (e.g., GANs). The former requires variational inference or special model architectures for tractable training, while the latter can be unstable. To address these difficulties, we explore an alternative approach based on estimating gradients of probability densities. We can estimate gradients of distributions by training flexible neural network models with denoising score matching, and use these models for sample generation, exact likelihood computation, posterior inference, and data manipulation by leveraging techniques from MCMC and stochastic differential equations. Our framework enables free-form model architectures, requires no adversarial optimization, and achieves state-of-the-art performance in many applications such as image and audio generation.
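
As background on the framework the talk describes, below is a minimal, hedged sketch of denoising score matching and one Langevin sampling step (illustrative of the general idea, not the speaker's released code); the function and variable names are mine.

# Hedged sketch of denoising score matching and a Langevin update
# (illustrative of the general framework, not the speaker's code).
import torch

def dsm_loss(score_net, x, sigma):
    """Denoising score matching at noise level sigma for x of shape (batch, dim).

    The optimal score_net(x_noisy, sigma) is -(x_noisy - x) / sigma**2, i.e.
    the gradient of the log-density of the noise-perturbed data.
    """
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    score = score_net(x_noisy, sigma)
    # The sigma weighting balances the objective across noise levels.
    return 0.5 * ((sigma * score + noise) ** 2).sum(dim=-1).mean()

def langevin_step(score_net, x, sigma, step_size):
    """One Langevin dynamics step that nudges samples toward higher density."""
    z = torch.randn_like(x)
    return x + 0.5 * step_size * score_net(x, sigma) + (step_size ** 0.5) * z

# In practice one trains over a decreasing sequence of sigmas and runs several
# Langevin steps per noise level (annealed Langevin dynamics).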

Bio

Yang Song is a fifth-year PhD student in Computer Science at Stanford University, advised by Stefano Ermon. His research focuses on deep generative models, with applications in robust machine learning and inverse problems. He is a recipient of the inaugural Apple PhD Fellowship in AI/ML and the J.P. Morgan PhD Fellowship. His research on score-based generative models has been recognized with an oral presentation at NeurIPS 2019 and an Outstanding Paper Award at ICLR 2021.
 

Causality and Distribution Generalization
by Professor Jonas Peters
Date: Thursday, May. 27
Time: 14:30
Location: Online Call via Zoom

Our guest speaker is Professor Jonas Peters from the Department of Mathematical Sciences at the University of Copenhagen, and you are all cordially invited to the CVG Seminar on May 27th at 2:30 p.m. CET on Zoom (passcode is 486210), where Jonas will give a talk titled “Causality and Distribution Generalization”.

Abstract

Purely predictive methods do not perform well when the test distribution changes too much from the training distribution. Causal models are known to be stable with respect to distributional shifts such as arbitrarily strong interventions on the covariates, but they do not perform well when the test distribution differs only mildly from the training distribution. We discuss methods such as Anchor Regression, Stabilized Regression, and CausalKinetiX that trade off between causal and predictive models to obtain favorable generalization properties. We discuss possible extensions to nonlinear models and the theoretical limitations of such methodology.
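
For a concrete sense of how such a trade-off can look, here is a hedged NumPy sketch of anchor regression as I understand it (my reading of the method, not the speaker's implementation); the anchor matrix A holds exogenous variables, and gamma interpolates between ordinary least squares (gamma = 1) and a more interventionally robust solution (gamma -> infinity).

# Hedged NumPy sketch of anchor regression (illustrative, not the speaker's code).
import numpy as np

def anchor_regression(X, Y, A, gamma):
    """Solve argmin_b ||(I - P_A)(Y - X b)||^2 + gamma * ||P_A (Y - X b)||^2."""
    P_A = A @ np.linalg.pinv(A)                    # projection onto the anchor space
    # Because P_A and (I - P_A) are orthogonal, the objective equals ordinary
    # least squares on data transformed by W = (I - P_A) + sqrt(gamma) * P_A.
    W = np.eye(len(Y)) - P_A + np.sqrt(gamma) * P_A
    b, *_ = np.linalg.lstsq(W @ X, W @ Y, rcond=None)
    return b

# Tiny synthetic example with a single binary anchor:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    A = rng.binomial(1, 0.5, (n, 1)).astype(float)
    X = A + rng.standard_normal((n, 2))
    Y = X @ np.array([1.0, -0.5]) + rng.standard_normal(n)
    print(anchor_regression(X, Y, A, gamma=5.0))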

Bio

Jonas is a professor of statistics in the Department of Mathematical Sciences at the University of Copenhagen. Previously, he was a group leader at the Max Planck Institute for Intelligent Systems in Tuebingen and a Marie Curie fellow at the Seminar for Statistics, ETH Zurich. He studied Mathematics at the University of Heidelberg and the University of Cambridge and obtained his Ph.D. jointly from MPI and ETH. He is interested in inferring causal relationships from different types of data and in building statistical methods that are robust with respect to distributional shifts. In his research, Jonas seeks to combine theory, methodology, and applications. His work relates to areas such as computational statistics, causal inference, graphical models, independence testing, and high-dimensional statistics.

Uncovering the Intrinsic Structures: Representation Learning and Its Applications
by Dr. Shuai Zhang
Date: Friday, Apr. 30
Time: 14:30
Location: Online Call via Zoom

Our guest speaker is Dr. Shuai Zhang from the Department of Computer Science at ETH Zurich, and you are all cordially invited to the CVG Seminar on April 30th at 2:30 p.m. CET on Zoom (passcode is 765585), where Shuai will give a talk titled “Uncovering the Intrinsic Structures: Representation Learning and Its Applications”.

Abstract

Learning suitable data representations lies at the heart of many intelligent applications. The quality of the learned representations is determined by how well the model uncovers the intrinsic structures of the data.
In this talk, I will first describe our recent work on geometry-oriented representation learning and demonstrate how applications that heavily rely on representation learning can benefit from it. In particular, I will present a data-driven approach, switch spaces, a novel way of combining spherical, Euclidean, and hyperbolic spaces in a single model with specialization. Using switch spaces, we obtain state-of-the-art performance on knowledge graph completion and recommender systems. Then, I will introduce our ICLR 2021 work on learning representations in hypercomplex space, including the parameterized hypercomplex multiplication layer and its applications to LSTMs and Transformers.
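
To give a flavor of the hypercomplex part, below is a hedged PyTorch sketch of a parameterized hypercomplex multiplication (PHM) layer as I understand it: the weight matrix is a learned sum of Kronecker products, using roughly 1/n of the parameters of a standard linear layer. It is illustrative only, not the authors' implementation.

# Hedged sketch of a parameterized hypercomplex multiplication (PHM) layer
# (illustrative, not the authors' code): the weight is a sum of Kronecker
# products, giving roughly a 1/n parameter reduction over nn.Linear.
import torch
import torch.nn as nn

class PHMLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, n: int = 4):
        super().__init__()
        assert in_features % n == 0 and out_features % n == 0
        # n small "rule" matrices (n x n) and n block matrices (in/n x out/n).
        self.A = nn.Parameter(torch.randn(n, n, n) * 0.1)
        self.S = nn.Parameter(torch.randn(n, in_features // n, out_features // n) * 0.1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W = sum_i kron(A_i, S_i) has shape (in_features, out_features).
        W = sum(torch.kron(self.A[i], self.S[i]) for i in range(self.A.size(0)))
        return x @ W

# With n = 4 the layer mimics quaternion-style weight sharing; larger n trades
# expressiveness for fewer parameters.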

Bio

Shuai Zhang is a postdoctoral researcher in the Department of Computer Science at ETH Zurich, where he works with Prof. Ce Zhang. He received his Ph.D. in computer science from the University of New South Wales under the supervision of Prof. Lina Yao. His current research lies in geometry-oriented representation learning and its applications in information filtering, knowledge graph completion, and reasoning. He is a recipient of the Outstanding Paper Award at ICLR 2021 and the Best Paper Award runner-up at WSDM 2020.