Seminars and Talks

Advanced Diffusion Models: Accelerated Sampling, Smooth Diffusion, and 3D Shape Generation

by Karsten Kreis

Date:	Thursday, Dec. 22
Time:	17:30
Location:	Online Call via Zoom

Details

Our guest speaker is Karsten Kreis from NVIDIA’s Toronto AI Lab.

You are all cordially invited to the CVG Seminar on the 22nd of December at 5:30 p.m. CET via Zoom (passcode is 052316).

Abstract

Denoising diffusion-based generative models have led to multiple breakthroughs in deep generative learning. In this talk, I will discuss recent works by the NVIDIA Toronto AI Lab on diffusion models. In the first part, I will present GENIE: Higher-Order Denoising Diffusion Solvers, a novel method for accelerated sampling from diffusion models, leveraging higher-order methods together with an efficient model distillation technique to solve the generative differential equations of diffusion models. Next, I will discuss our work on Critically-Damped Langevin Diffusion. Taking inspiration from statistical mechanics and Markov chain Monte Carlo, we introduce an auxiliary velocity variable into the diffusion process, which allows the diffusion to converge to the Gaussian prior more smoothly and quickly. This makes critically-damped Langevin diffusion ideally suited for diffusion-based generative modeling. Finally, I will briefly recapitulate Latent Score-based Generative Models and then present LION: Latent Point Diffusion Models for 3D Shape Generation, which achieves state-of-the-art 3D shape synthesis and enables various artistic applications, such as voxel-guided shape generation.

Bio

Karsten Kreis is a senior research scientist at NVIDIA’s Toronto AI Lab. Prior to joining NVIDIA, he worked on deep generative modeling at D-Wave Systems and co-founded Variational AI, a startup utilizing generative models for drug discovery. Before switching to deep learning, Karsten did his M.Sc. in quantum information theory at the Max Planck Institute for the Science of Light and his Ph.D. in computational and statistical physics at the Max Planck Institute for Polymer Research. Currently, Karsten's research focuses on developing novel generative learning methods and on applying deep generative models to problems in areas such as computer vision, graphics and digital artistry, as well as in the natural sciences.

Self-supervised Learning from Images, and Augmentations

by Yuki Asano

Date:	Friday, Dec. 9
Time:	14:30
Location:	Online Call via Zoom

Details

Our guest speaker is Yuki Asano from the University of Amsterdam.

You are all cordially invited to the CVG Seminar on the 9th of December at 2:30 p.m. CET via Zoom (passcode is 303207).

Abstract

This talk is about pushing the limits of what can be learnt without using any human annotations. After an overview of what self-supervised learning is, we will dive into how clustering can be combined with representation learning using optimal transport and how this can be leveraged to segment objects in images without supervision [1]. Finally, as augmentations are crucial for all of self-supervised learning, we will analyze these in more detail, based on a recent preprint [2]. Here, we show that it is possible to extrapolate to semantic classes such as those of ImageNet using just a single datum as visual input when combined with strong augmentations.

[1] Self-Supervised Learning of Object Parts for Semantic Segmentation [arxiv]

[2] Extrapolating from a Single Image to a Thousand Classes using Distillation [arxiv]

Bio

Yuki Asano is an assistant professor for computer vision and machine learning at the Qualcomm-UvA lab at the University of Amsterdam, where he works with Cees Snoek, Max Welling and Efstratios Gavves. His current research interests are multi-modal and self-supervised learning and ethics in computer vision. Prior to his current appointment, he finished his PhD at the Visual Geometry Group (VGG) at the University of Oxford, working with Andrea Vedaldi and Christian Rupprecht. During his time as a PhD student, he also interned at Facebook AI Research and worked at TransferWise. Prior to his PhD, he studied physics at the University of Munich (LMU) and economics in Hagen, and completed an MSc in Mathematical Modelling and Scientific Computing at the Mathematical Institute in Oxford.

Machine Learning Based Prediction of Mental Health Using Wearable Measured Time Series

by Seyedeh Sharareh Mirzargar

Date:	Thursday, Oct. 27
Time:	11:00
Location:	N10_302, Institute of Computer Science

Details

You are all cordially invited to the Master Thesis defence on the 27th of October at 11 a.m. CEST

in person at the Institute of Computer Science: room 302, Neubrückstrasse 10, 3012 Bern
via Zoom (passcode is 327699).

Abstract

Depression is the second leading cause of years spent in disability and has a growing prevalence in adolescents. The recent Covid-19 pandemic has intensified the situation and limited in-person patient monitoring due to distancing measures. Recent advances in wearable devices have made it possible to record the rest/activity cycle remotely with high precision and in real-world contexts. We aim to use machine learning methods to predict an individual's mental health based on wearable-measured sleep and physical activity. Predicting an impending mental health crisis of an adolescent allows for prompt intervention, detection of depression onset or its recurrence, and remote monitoring. To achieve this goal, we train three primary forecasting models (linear regression, random forest, and LightGBM, a light gradient boosted machine) and two deep learning models (a block recurrent neural network, or block RNN, and a temporal convolutional network, or TCN) on Actigraph measurements to forecast mental health in terms of depression, anxiety, sleepiness, stress, sleep quality, and behavioural problems. Our models achieve high forecasting performance, with the random forest performing best at an accuracy of 98% for forecasting trait anxiety. We perform extensive experiments to evaluate the models' performance in accuracy, generalization, and feature utilization, using a naive forecaster as the baseline. Our analysis shows minimal mental health changes over two months, which makes the prediction task easy to achieve. Due to these minimal changes in mental health, the models tend to rely primarily on the historical values of the mental health evaluation instead of the Actigraph features. At the time of writing this master thesis, the data acquisition step is still in progress. In future work, we plan to train the models on the complete dataset using a longer forecasting horizon to increase the level of mental health changes, and to perform transfer learning to compensate for the small dataset size.
This interdisciplinary project demonstrates the opportunities and challenges in machine learning-based prediction of mental health, paving the way toward using the same techniques to forecast other disorders, such as internalizing disorder, Parkinson's disease and Alzheimer's disease, and toward improving the quality of life of individuals affected by them.

New Variables of Brain Morphometry: the Potential and Limitations of CNN Regression

by Timo Blattner

Date:	Friday, Sep. 23
Time:	14:30
Location:	N10_302, Institute of Computer Science

Details

You are all cordially invited to the Bachelor Thesis defence on the 23rd of September at 2:30 p.m. CEST

in person at the Institute of Computer Science: room 302, Neubrückstrasse 10, 3012 Bern
via Zoom (passcode is 871438).

Abstract

The calculation of variables of brain morphology is computationally very expensive and time-consuming. Previous work showed the feasibility of extracting the variables directly from T1-weighted brain MRI images using a convolutional neural network. We used significantly more data and extended their model to a new set of neuromorphological variables, which could become interesting biomarkers in the future for the diagnosis of brain diseases. For nearly all subjects, the model achieves a mean relative absolute error below 5%. This high relative accuracy can be attributed to the low morphological variance between subjects and the ability of the model to predict the cortical atrophy age trend. The model, however, fails to capture all the variance in the data and shows large regional differences. We attribute these limitations in part to the moderate to poor reliability of the ground truth generated by FreeSurfer. We further investigated the effects of training data size and model complexity on this regression task and found that the size of the dataset had a significant impact on performance, while deeper models did not perform better. Lack of interpretability and dependence on a silver ground truth are the main drawbacks of this direct regression approach.

Assessment of Movement and Pose in a Hospital Bed by Ambient and Wearable Sensor Technology in Healthy Subjects

by Tony Licata

Date:	Friday, Sep. 9
Time:	14:30
Location:	N10_302, Institute of Computer Science

Details

You are all cordially invited to the Master Thesis defence on the 9th of September at 2:30 p.m. CEST

in person at the Institute of Computer Science: room 302, Neubrückstrasse 10, 3012 Bern
via Zoom (passcode is 678646).

Abstract

The use of automated systems describing human motion has become possible in various domains. Most of the proposed systems are designed to work with people moving around in a standing position. Because such a system could be useful in a medical environment, we propose in this work a pipeline that can effectively predict the motion of people lying in bed. The proposed pipeline is tested with a data set composed of 41 participants executing 7 predefined tasks in a bed. The motion of the participants is measured with video cameras, accelerometers and a pressure mat. Various experiments are carried out with the information retrieved from the data set. Two approaches combining the data from the different measurement technologies are explored. The performance of each experiment is measured, and the proposed pipeline is assembled from the components that provide the best results. Finally, we show that the proposed pipeline only needs to use video cameras, which makes the proposed environment easier to implement in real-life situations.
