I am a scientist based in Tübingen, Germany. Working at the intersection of computational neuroscience, machine learning and computer vision, I would like to understand how neural systems – both biological and artificial – perform visual perception.
Finding optimal stimuli for neurons has long been central to understanding information processing in the brain. However, the search is hard because the stimulus space is high-dimensional and sensory information processing is fundamentally nonlinear. With our collaborators Fabian Sinz, Andreas Tolias, Jake Reimer and Xaq Pitkow, we developed Inception Loops: a closed-loop optimization method that combines in vivo recordings with in silico nonlinear modeling to find Most Exciting Images (MEIs), which we then show back to the brain. MEIs drove cells better than control stimuli, revealing fascinating properties of mouse V1 cells. MEIs had sharp corners, curved strokes and pointillist textures, deviating strikingly from the standard V1 Gabor model.
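To give a flavor of the in silico optimization step, here is a minimal sketch. It is not the actual Inception Loops pipeline, which optimizes images through a deep network fit to recorded responses. As a stand-in, it assumes a toy energy-model neuron (sum of squared outputs of two random filters) and runs gradient ascent on the image under a fixed norm budget. The names `model_response`, `response_gradient` and `optimize_image`, the filter shapes and the step size are all hypothetical choices for illustration.

```python
import numpy as np

def model_response(image, filters):
    """Toy in silico neuron (assumed energy model): sum of squared filter outputs."""
    return float(np.sum((filters @ image.ravel()) ** 2))

def response_gradient(image, filters):
    """Analytic gradient of model_response with respect to the image pixels."""
    drive = filters @ image.ravel()
    return (2.0 * drive @ filters).reshape(image.shape)

def optimize_image(start, filters, steps=200, lr=0.1):
    """Gradient ascent on the image, renormalized each step to a fixed norm."""
    budget = np.linalg.norm(start)
    image = start.copy()
    for _ in range(steps):
        image = image + lr * response_gradient(image, filters)
        image *= budget / np.linalg.norm(image)  # project back onto the norm budget
    return image

rng = np.random.default_rng(0)
filters = rng.standard_normal((2, 64))   # hypothetical filters of the toy neuron
start = rng.standard_normal((8, 8))      # random starting image
start /= np.linalg.norm(start)

mei = optimize_image(start, filters)
start_response = model_response(start, filters)
mei_response = model_response(mei, filters)
print(f"response to random image: {start_response:.2f}")
print(f"response to optimized image: {mei_response:.2f}")
```

The norm constraint matters: without it, the response of this toy model could be increased indefinitely just by scaling up the contrast of the image.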
The preprint of our work on one-shot object detection and instance segmentation is on arXiv. In this work, we learn to detect and segment instances of previously unseen object categories based on a single visual instruction example. For example, given the image below and either a person (left) or a car (right) as the reference, the goal of the system is to detect all persons (far left) and cars (far right), respectively. Note that in this case, neither persons nor cars were annotated in the training set.
Together with Fabian Sinz, we developed a deep recurrent neural network for predicting the activity of thousands of mouse V1 neurons simultaneously recorded with two-photon microscopy, while accounting for confounding factors such as the animal’s gaze position and brain state changes related to running state and pupil dilation. We investigated how well this large-scale model generalizes to stimulus statistics it was not trained on. While our model trained on natural movies can correctly predict some neural tuning properties in responses to artificial noise stimuli, unadapted transfer is not perfect. However, it can fully generalize from movies to noise and maintain high predictive performance on both stimulus domains by fine-tuning only the final layer’s weights. Check out the preprint on bioRxiv.
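The idea of adapting only the final layer can be sketched with a toy example. This is not our recurrent network: it assumes a frozen random feature extractor (`core`) and a linear readout that is refit by least squares on the new stimulus domain. The synthetic data are constructed so that the two domains differ only in the readout, which is an assumption of the sketch and not a result from the preprint. All names and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_inputs, n_features, n_trials = 30, 20, 500

# Frozen "core": a fixed nonlinear feature extractor shared by both domains.
core_weights = rng.standard_normal((n_inputs, n_features)) / np.sqrt(n_inputs)

def core(stimuli):
    return np.tanh(stimuli @ core_weights)

def simulate(true_readout, n):
    """Synthetic responses: core features, linear readout, additive noise."""
    features = core(rng.standard_normal((n, n_inputs)))
    responses = features @ true_readout + 0.1 * rng.standard_normal(n)
    return features, responses

def fit_readout(features, responses):
    """Least-squares fit of the final linear layer only; the core stays frozen."""
    return np.linalg.lstsq(features, responses, rcond=None)[0]

def mse(features, readout, responses):
    return float(np.mean((features @ readout - responses) ** 2))

# Assumed ground truth: same core, domain-specific readouts.
readout_movies_true = rng.standard_normal(n_features)
readout_noise_true = readout_movies_true + 0.5 * rng.standard_normal(n_features)

movies_train = simulate(readout_movies_true, n_trials)
noise_train = simulate(readout_noise_true, n_trials)
noise_test = simulate(readout_noise_true, n_trials)

readout_movies = fit_readout(*movies_train)    # trained on "movies" only
readout_finetuned = fit_readout(*noise_train)  # final layer refit on "noise"

mse_before = mse(noise_test[0], readout_movies, noise_test[1])
mse_after = mse(noise_test[0], readout_finetuned, noise_test[1])
print(f"noise-domain test MSE, unadapted readout: {mse_before:.3f}")
print(f"noise-domain test MSE, refit readout: {mse_after:.3f}")
```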
I developed an approach to organize and classify neurons in V1 according to their nonlinear computation, ignoring receptive field location and preferred orientation. We use a rotation-equivariant convolutional network to perform weight sharing not only across space, but also across orientation. Our preprint describes the approach and some early results we obtained using recordings of around 6000 neurons in mouse V1.
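Weight sharing across orientation can be illustrated with a small sketch. It makes a strong simplification: instead of the filter parameterization used in the preprint, it assumes a single base kernel that is reused at the four rotations by multiples of 90 degrees. The check at the end shows the equivariance property: rotating the input rotates each feature map and cyclically shifts the orientation channels. The function names are hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def cross_correlate(image, kernel):
    """Valid-mode 2D cross-correlation (what deep learning calls convolution)."""
    windows = sliding_window_view(image, kernel.shape)
    return np.einsum("ijab,ab->ij", windows, kernel)

def rotation_shared_features(image, base_kernel):
    """One learned kernel, applied at four orientations (0, 90, 180, 270 degrees)."""
    return np.stack(
        [cross_correlate(image, np.rot90(base_kernel, k)) for k in range(4)]
    )

rng = np.random.default_rng(0)
image = rng.standard_normal((12, 12))
base_kernel = rng.standard_normal((3, 3))

features = rotation_shared_features(image, base_kernel)
features_rotated = rotation_shared_features(np.rot90(image), base_kernel)

# Equivariance: channel k of the rotated input equals the rotated channel k-1
# of the original input.
expected = np.stack([np.rot90(features[(k - 1) % 4]) for k in range(4)])
is_equivariant = bool(np.allclose(features_rotated, expected))
print("rotation-equivariant:", is_equivariant)
```

Because all four orientation channels share one set of weights, two neurons that compute the same nonlinear function at different preferred orientations can be described by the same features, up to a shift along the orientation axis.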
The final version of our ECCV 2018 paper on visualizing invariances in convolutional neural networks is available. We find that early and mid-level convolutional layers in VGG-19 exhibit various forms of response invariance: near-perfect phase invariance in some units and invariance to local diffeomorphic transformations in others. At the same time, we uncover representational differences with ResNet-50 in its corresponding layers.
Variability in neuronal responses to identical stimuli is frequently correlated across a population. Attention is thought to reduce these correlations by suppressing noisy inputs shared by the population. However, even with precise control of the visual stimulus, the subject’s attentional state varies across trials. In 2016, we put forward the hypothesis that such fluctuations in attentional state could be a cause of some of the correlated variability observed in cortical areas. To address this question empirically, we designed a novel paradigm that allows us to manipulate the strength of attentional fluctuations.
In the new paper just published in Nature Communications, we recorded from monkeys’ primary visual cortex (V1) while they were performing this task. We found both a pronounced effect of attentional fluctuations on correlated variability at long timescales and attention-dependent reductions in correlations at short timescales. These effects predominate in layers 2/3, as expected from a feedback signal such as attention.
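The intuition behind the hypothesis, that trial-to-trial fluctuations in a shared attentional gain induce correlated variability, can be reproduced in a toy simulation. This is not the model or the analysis from the paper. It assumes two neurons that are conditionally independent Poisson spikers whose firing rates are scaled by a common gain that either stays fixed or fluctuates across trials. The function name and all parameter values are hypothetical.

```python
import numpy as np

def simulate_count_correlation(gain_sd, n_trials=20000, base_rate=20.0, seed=0):
    """Spike count correlation of two neurons sharing a multiplicative gain."""
    rng = np.random.default_rng(seed)
    # Shared gain per trial; gain_sd = 0 means a perfectly stable attentional state.
    gain = np.clip(1.0 + gain_sd * rng.standard_normal(n_trials), 0.0, None)
    rates = base_rate * gain[:, None] * np.ones((1, 2))
    counts = rng.poisson(rates)  # independent Poisson counts given the gain
    return float(np.corrcoef(counts[:, 0], counts[:, 1])[0, 1])

corr_stable = simulate_count_correlation(gain_sd=0.0)
corr_fluctuating = simulate_count_correlation(gain_sd=0.2)
print(f"count correlation, stable gain: {corr_stable:.3f}")
print(f"count correlation, fluctuating gain: {corr_fluctuating:.3f}")
```

With a stable gain, the two neurons are uncorrelated up to sampling error. Once the gain fluctuates, the shared rate changes add variance common to both neurons, and the spike counts become positively correlated even though the stimulus is identical on every trial.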
Our paper on one-shot segmentation in clutter has been accepted to ICML. In this paper, we tackle a one-shot visual search task: based on a single instruction example (the red Φ in the image below), the goal is to find the same letter in a cluttered image that consists of many letters (left) and segment it. This task is pretty hard for computer vision systems, because the image clutter consists of other letters (i.e. very similar statistics), the letters can have arbitrary colors, are drawn by different people, transformed by affine transformations, and have not been seen during training.
Marissa Weis and Max Günthner started their Master’s thesis projects on March 1st. Mara will be working on image processing using foveated image representations. Max will be investigating nonlinearities in neural responses in primary visual cortex using techniques for visualizing convolutional neural networks.
In the review, written by Leon Gatys, Matthias Bethge and me, we discuss recent advances in texture synthesis using Convolutional Neural Networks (CNNs) that were motivated by visual neuroscience and have led to substantial progress in image synthesis and manipulation in computer vision. We also discuss how these advances can in turn inspire new research in visual perception and computational neuroscience.