Seminars and Talks

Computer Vision Group Seminar
Date: Thursday, Oct. 24
Time: 13:00
Location: 2nd floor, room 210

In this weekly meeting, the CVG members come together to discuss recent topics in the computer vision and machine learning community. In addition, there are typically two presentations of selected papers or student projects.


InteractiveDeepSleepNet: An interactive automatic sleep scoring system using raw single-channel EEG
by Luigi Fiorillo
Date: Thursday, Sep. 26
Time: 13:45
Location: Seminar room 306, Neubrückstrasse 10

Luigi Fiorillo, the newest member of our group, will introduce his current research, titled "InteractiveDeepSleepNet: An interactive automatic sleep scoring system using raw single-channel EEG", in our next lab meeting. Luigi joins our lab as an external PhD student, co-supervised by Prof. Paolo Favaro and Prof. Dr. Francesca Faraci. Please find the abstract of his talk below.


Clinical sleep scoring involves a tedious visual review of overnight polysomnograms by a human expert, according to official standards. Automatic sleep scoring reduces to a simple classification problem: predicting, for each 30-second epoch of the polysomnogram, the correct sleep stage label. It may therefore appear to be a suitable task for artificial intelligence algorithms. Indeed, machine learning (ML) algorithms have been applied to sleep scoring for many years, and several software products nowadays offer automated or semi-automated scoring services. However, the vast majority of sleep physicians do not use them. Very recently, thanks to increased computational power, deep learning (DL) has also been employed, with promising results. ML and DL algorithms can undoubtedly reach high accuracy in specific situations, but many difficulties hinder their introduction into the daily routine. We believe that ML and DL scoring systems remain unintegrated in the hospital routine because the knowledge of the physician fails to be integrated into the scoring process. A user-centric approach that includes physicians in the learning phase, directly interacting with the algorithm, could be successful in sleep scoring. Combining a deep learning architecture operating on raw single-channel EEG data with a human-centered approach has the potential to lead to widely accepted software.
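The classification problem described in the abstract can be made concrete with a minimal sketch. Assuming (purely for illustration) a 100 Hz single-channel EEG recording, scoring operates on non-overlapping 30-second epochs, each of which receives one sleep-stage label; the function name and sampling rate below are hypothetical, not part of the presented system.

```python
import numpy as np

def segment_epochs(eeg: np.ndarray, fs: int = 100, epoch_sec: int = 30) -> np.ndarray:
    """Slice a raw single-channel EEG signal into non-overlapping 30-second
    epochs, the unit on which a sleep-stage classifier predicts one label.

    eeg: 1-D array of samples; fs: sampling rate in Hz (illustrative value).
    Returns an array of shape (n_epochs, fs * epoch_sec); trailing samples
    that do not fill a whole epoch are dropped.
    """
    samples_per_epoch = fs * epoch_sec
    n_epochs = len(eeg) // samples_per_epoch
    return eeg[: n_epochs * samples_per_epoch].reshape(n_epochs, samples_per_epoch)

# One hour of recording at 100 Hz -> 120 epochs of 3000 samples each
eeg = np.random.randn(3600 * 100)
epochs = segment_epochs(eeg)
print(epochs.shape)  # (120, 3000)
```

A classifier, whatever its architecture, then maps each row of this matrix to one of the standard sleep stages.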

BSc Thesis: StitchNet - Image Stitching using Autoencoders and Deep Convolutional Neural Networks
by Maurice Rupp
Date: Thursday, Sep. 26
Time: 13:00
Location: Seminar room 306, Neubrückstrasse 10


Until now, the task of stitching multiple overlapping images into a bigger, panoramic picture has been approached solely with "classical", hand-coded algorithms, while deep learning is at most used for specific subtasks. This talk introduces StitchNet, a novel end-to-end neural network approach to image stitching that uses a (pretrained) autoencoder and deep convolutional networks. In addition to presenting several new datasets for supervised image stitching, each with 120'000 training and 5'000 validation samples, the talk also presents various experiments in which existing networks designed for image super-resolution and image segmentation are adapted to the task of image stitching.

Seminar Talk: Dynamic Scene Deblurring
by Seungjun Nah
Date: Thursday, Sep. 19
Time: 14:00
Location: Room 302, Neubrückstrasse 10

Motion blur is one of the most common artifacts in photographs and videos. Hand-shaken mobile cameras and objects moving during the exposure are the main causes of blur. While sharp scenes can be captured with a fast shutter speed, an aligned pair of blurry and sharp images is hard to capture at the same time. To enable supervised learning for deblurring, we propose a way to synthesize dynamic motion-blurred images from high-speed camera footage to construct a large-scale dataset. We show that deep neural networks trained on this data generalize to real blurry images and videos. Finally, we present the high-quality REDS dataset for video deblurring and super-resolution. REDS is high-quality in terms of both the reference frames and the realism of the quality degradation, and it was employed in the NTIRE 2019 challenges on video deblurring and super-resolution.
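The blur-synthesis idea in the abstract can be illustrated with a minimal sketch: a long exposure is approximated by averaging several consecutive sharp frames from a high-speed capture, mimicking the sensor's temporal integration. The frame count and shapes below are illustrative, and the published pipeline additionally inverts the camera response function before averaging, a step this sketch omits.

```python
import numpy as np

def synthesize_blur(frames: np.ndarray) -> np.ndarray:
    """Approximate a motion-blurred frame by averaging consecutive sharp
    frames from a high-speed capture.

    frames: array of shape (n, H, W, C) with intensities in [0, 1].
    Returns a single (H, W, C) frame. Note: averaging in intensity space
    is a simplification; inverting the camera response function first
    (as in the original work) makes the synthesized blur more realistic.
    """
    return frames.mean(axis=0)

# e.g. average 7 consecutive 240-fps frames to simulate a ~1/34 s exposure
highspeed = np.random.rand(7, 64, 64, 3)
blurry = synthesize_blur(highspeed)
print(blurry.shape)  # (64, 64, 3)
```

Pairing each synthesized blurry frame with its central sharp frame yields the aligned (blurry, sharp) training pairs that supervised deblurring requires.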
About Seungjun Nah

Seungjun Nah is a Ph.D. student at Seoul National University, advised by Prof. Kyoung Mu Lee. He received his BS degree from Seoul National University in 2014. He has worked on computer vision research topics including deblurring, super-resolution, and neural network acceleration. He won the 1st place award in the NTIRE 2017 super-resolution challenge and workshop, and he co-organized the NTIRE 2019 and AIM 2019 workshops and challenges on video quality restoration. He has reviewed conference (ICCV 2019, CVPR 2018, SIGGRAPH Asia 2018) and journal (IJCV, TNNLS, TMM, TIP) paper submissions and was selected as one of the best reviewers at ICCV 2019. His research interests include visual quality enhancement, low-level computer vision, and efficient deep learning. He is currently a guest scientist at the Max Planck Institute for Intelligent Systems.


[1] Seungjun Nah, Tae Hyun Kim, and Kyoung Mu Lee, "Deep Multi-scale Convolutional Neural Network for Dynamic Scene Deblurring," CVPR 2017.

[2] Seungjun Nah, Sanghyun Son, and Kyoung Mu Lee, "Recurrent Neural Networks with Intra-Frame Iterations for Video Deblurring," CVPR 2019.

[3] Seungjun Nah, Sungyong Baik, Seokil Hong, Gyeongsik Moon, Sanghyun Son, Radu Timofte, and Kyoung Mu Lee, "NTIRE 2019 Challenge on Video Deblurring and Super-Resolution: Dataset and Study," CVPRW 2019.

BSc Thesis: A Study of 3D Reconstruction of Varying Objects with Deformable Parts Models
by Raoul Grossenbacher
Date: Thursday, Sep. 19
Time: 10:15
Location: Seminar room 302, Neubrückstrasse 10


This work covers a new approach to 3D reconstruction. In traditional 3D reconstruction, one uses multiple images of the same object and exploits the differences between them, such as camera position, illumination, and rotation of the object, to compute a point cloud representing the object. The characteristic trait shared by all these approaches is that one can change almost everything about the images, but not the object itself, because one needs to find correspondences between the images. To be able to use different instances of the same object, we used a 3D DPM model that can find the different parts of an object in an image, thereby detecting correspondences between the different pictures, which we can then use to calculate the 3D model. To put this theory into practice, we fed a 3D DPM model trained to detect cars with pictures of different car brands, where no pair of images showed the same vehicle, and used the detected correspondences and the Factorization Method to compute the 3D point cloud. This technique leads to a completely new approach to 3D reconstruction, because varying the object itself had not been done before.
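The Factorization Method mentioned above can be sketched as follows, assuming orthographic projection (the classical Tomasi-Kanade setting): the 2D coordinates of P corresponding part detections over F views are stacked into a 2F x P measurement matrix, its rows are centered, and a rank-3 SVD yields camera motion and 3D shape up to an affine ambiguity. This is a generic sketch of the method, not the thesis's exact implementation, and the metric upgrade step is omitted.

```python
import numpy as np

def factorize(W: np.ndarray):
    """Tomasi-Kanade-style factorization under orthographic projection.

    W: 2F x P measurement matrix stacking the x- and y-coordinates of P
    corresponding points (here, DPM part detections) over F views.
    Returns (M, S): motion matrix (2F x 3) and 3-D shape (3 x P), defined
    up to an affine ambiguity (the metric upgrade is omitted).
    """
    # Center each row: removes the per-view 2D translation.
    Wc = W - W.mean(axis=1, keepdims=True)
    # The centered matrix has rank <= 3; a truncated SVD splits it
    # into motion and structure factors.
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    M = U[:, :3] * np.sqrt(s[:3])
    S = np.sqrt(s[:3])[:, None] * Vt[:3]
    return M, S

# Example: 5 views (10 rows) of 8 points from a random rank-3 model
rng = np.random.default_rng(0)
W = rng.standard_normal((10, 3)) @ rng.standard_normal((3, 8))
M, S = factorize(W)
print(np.allclose(M @ S, W - W.mean(axis=1, keepdims=True)))  # True
```

Because the factorization only needs point correspondences, detections of the same semantic part across images of *different* car instances can stand in for the usual multi-view correspondences of a single object.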