Semantic-Guided Novel Category Discovery

Weishuai Wang1  Ting Lei1  Qingchao Chen2  Yang Liu1 
1. Wangxuan Institute of Computer Technology, Peking University
2. National Institute of Health Data Science, Peking University
wangweishuai@pku.edu.cn      ting_lei@pku.edu.cn      qingchao.chen@pku.edu.cn
yangliu@pku.edu.cn

A visual comparison of our Semantic-Guided Novel Category Discovery (SNCD) with previous work.

Abstract


The Novel Category Discovery problem aims to cluster an unlabeled set with the help of a labeled set consisting of disjoint but related classes. However, existing models treat class names as discrete one-hot labels and ignore the semantics of these classes. In this paper, we propose a new setting, Semantic-Guided Novel Category Discovery (SNCD), which requires the model not only to cluster the unlabeled images but also to recognize them semantically, given the set of their class names. The first challenge is to leverage the class names of unlabeled images effectively despite the inherent gap between the visual and linguistic domains. To address it, we incorporate a semantic-aware recognition mechanism, which constructs dynamic class-wise visual prototypes together with a semantic similarity matrix that enables the projection of visual features into the semantic space. The second challenge stems from the granularity disparity between the classification and clustering tasks. To deal with it, we develop a semantic-aware clustering process that facilitates the exchange of knowledge between the two tasks. Extensive experiments demonstrate that the recognition and clustering tasks benefit each other and can be jointly optimized. Results on CIFAR-10, CIFAR-100, and ImageNet confirm the effectiveness of the proposed method.
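The recognition mechanism described in the abstract can be illustrated with a small NumPy sketch. This is only one possible reading, not the authors' implementation: the exponential-moving-average prototype update, the cosine similarity between class-name embeddings, the softmax temperature, and the function names `update_prototypes` and `semantic_scores` are all assumptions introduced here for illustration. The class-name embeddings are assumed to come from some text encoder that this page does not specify.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def update_prototypes(prototypes, features, assignments, momentum=0.9):
    """Move each class prototype toward the mean feature assigned to it.

    The EMA rule is a hypothetical choice; the page only says the
    prototypes are "dynamic" and class-wise.
    """
    updated = prototypes.copy()
    for c in np.unique(assignments):
        class_mean = features[assignments == c].mean(axis=0)
        updated[c] = momentum * prototypes[c] + (1.0 - momentum) * class_mean
    return updated


def semantic_scores(features, prototypes, name_embeddings, temperature=0.1):
    """Score each feature against every class name (assumed formulation).

    1. Compare visual features with the visual prototypes (cosine).
    2. Turn the similarities into soft class weights (softmax).
    3. Mix those weights through a class-by-class semantic similarity
       matrix built from the class-name embeddings.
    """
    visual_sim = l2_normalize(features) @ l2_normalize(prototypes).T  # (N, C)
    logits = (visual_sim - visual_sim.max(axis=1, keepdims=True)) / temperature
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    names = l2_normalize(name_embeddings)
    semantic_sim = names @ names.T  # (C, C)
    return weights @ semantic_sim  # (N, C)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))       # 8 image features, 16 dims
    protos = rng.normal(size=(3, 16))      # 3 classes
    names = rng.normal(size=(3, 32))       # placeholder class-name embeddings
    labels = rng.integers(0, 3, size=8)    # placeholder (pseudo-)labels
    protos = update_prototypes(protos, feats, labels)
    print(semantic_scores(feats, protos, names).argmax(axis=1))
```

In this sketch, classes whose names are semantically close share score mass through the similarity matrix, which is one way class names could inform recognition of unlabeled images.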

Method Overview



Overview of the proposed architecture.

Results



Comparison with state-of-the-art methods on CIFAR-10, CIFAR-100, and ImageNet in terms of classification and clustering metrics, using the task-aware evaluation protocol. ‘-’ indicates that a method treats class names as discrete one-hot labels, lacks a semantic understanding of the classes, and therefore cannot perform the classification task.

Comparison with state-of-the-art methods on CIFAR-10 and CIFAR-100 on both labeled and unlabeled classes, using the task-agnostic evaluation protocol.

t-SNE visualization of all classes on CIFAR-10 for UNO and our method.

Materials



Paper


Code