Paper on one-shot segmentation at ICML

Our paper on one-shot segmentation in clutter has been accepted to ICML. In this paper, we tackle a one-shot visual search task: given a single instruction example (the red Φ in the image below), the goal is to find the same letter in a cluttered image consisting of many letters (left) and segment it. This task is hard for computer vision systems for several reasons: the clutter consists of other letters and therefore has very similar statistics to the target, the letters can have arbitrary colors, they are drawn by different people and distorted by affine transformations, and none of them have been seen during training.


Using oracle models with access to various amounts of ground-truth information, we show that in this kind of visual search task, detection and segmentation are two intertwined problems: solving either one helps solve the other. We therefore introduce MaskNet, an improved model that sequentially attends to different locations, generates segmentation proposals to mask out background clutter, and selects among the segmented objects. Our findings suggest that image recognition models based on iterative refinement of object detection and foreground segmentation may help improve both detection and segmentation in highly cluttered scenes.
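To make the attend, segment, select loop more concrete, here is a minimal toy sketch in Python. It is not MaskNet itself, which is a learned model described in the paper. The function names (`propose_mask`, `similarity`, `search_and_segment`), the fixed intensity threshold used as a segmentation proposal, and the cosine similarity used for selection are all assumptions made for illustration only.

```python
import numpy as np

def propose_mask(patch, threshold=0.5):
    """Toy segmentation proposal (assumption): foreground = pixels above a threshold."""
    return patch > threshold

def similarity(a, b):
    """Cosine similarity between two arrays, 0.0 if either one is all zeros."""
    a = a.ravel().astype(float)
    b = b.ravel().astype(float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def search_and_segment(scene, target, locations):
    """Visit each candidate location in turn, mask the patch, and keep the best match."""
    h, w = target.shape
    best_score, best_loc, best_mask = -1.0, None, None
    for r, c in locations:
        patch = scene[r:r + h, c:c + w]      # attend to one location
        mask = propose_mask(patch)           # segmentation proposal
        masked = patch * mask                # suppress background clutter
        score = similarity(masked, target)   # compare with the one-shot example
        if score > best_score:
            best_score, best_loc, best_mask = score, (r, c), mask
    # Paste the winning proposal into a full-size segmentation mask.
    full_mask = np.zeros(scene.shape, dtype=bool)
    if best_loc is not None:
        r, c = best_loc
        full_mask[r:r + h, c:c + w] = best_mask
    return best_loc, full_mask

# Hypothetical toy data: a plus-shaped target and a filled-square distractor.
target = np.array([[0, 1, 0],
                   [1, 1, 1],
                   [0, 1, 0]], dtype=float)
scene = np.zeros((8, 8))
scene[1:4, 1:4] = target   # the target appears here
scene[4:7, 4:7] = 1.0      # distractor
loc, mask = search_and_segment(scene, target, [(1, 1), (4, 4)])
```

In this toy scene the plus shape at (1, 1) matches the example exactly, so it is selected over the square distractor. In a realistic setting the candidate locations, the proposals, and the matching score would all have to be learned rather than hand-coded as they are here.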
