Self-supervised Online Metric Learning with Low Rank Constraint for Scene Categorization - Shenyang Institute of Automation, Chinese Academy of Sciences

Machine learning plays a central role in many practical systems with visual cognitive ability. Usually, the model is trained offline with labeled data and is not updated once deployed online, as in the computer vision system for scene categorization considered here. Unfortunately, in a practical online vision system, the model's performance may deteriorate over time because newly arriving data can deviate from the initial training data. To handle this, the model has to be retrained offline in batch mode on both the existing and the new data, which is time-consuming. Moreover, if the training dataset is too large, it is difficult for a batch training model to process all the data at once.

To overcome these problems, researchers from the Shenyang Institute of Automation, Chinese Academy of Sciences (CAS), propose Online Metric Learning via Low Rank (OMLLR), which learns a low-dimensional representation of the data in a discriminative way. Because a low-rank matrix has far fewer free parameters than a full-rank dense matrix, low-rank models can scale to substantially more features and classes.
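The article does not reproduce the OMLLR formulation itself. As a generic illustration of why a low-rank metric scales better, the sketch below assumes a metric of the form M = L^T L, where L is an r x d factor with r much smaller than d. The function names and the example factor are hypothetical and are not taken from the paper.

```python
def project(L, x):
    """Map a d-dimensional vector x to r dimensions using the r x d factor L."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def low_rank_distance(L, x, y):
    """Squared distance (x - y)^T L^T L (x - y), computed in the r-dim space."""
    diff = [a - b for a, b in zip(x, y)]
    z = project(L, diff)
    return sum(v * v for v in z)


def parameter_counts(d, r):
    """Number of entries in a full d x d metric versus an r x d factor."""
    return d * d, r * d


# Hypothetical 2 x 4 factor: keeps the first two coordinates, ignores the rest.
L = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
x = [1.0, 2.0, 5.0, 5.0]
y = [0.0, 0.0, -3.0, 9.0]

dist = low_rank_distance(L, x, y)            # (1 - 0)^2 + (2 - 0)^2
full, low = parameter_counts(d=1000, r=20)   # dense metric vs. low-rank factor
```

With d = 1000 features and rank r = 20, the dense metric stores 1,000,000 entries while the factor stores 20,000, which is the scaling argument in miniature.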

For classification with the online metric learning model, the researchers define a bi-linear graph model that predicts the label of a new incoming test sample and fuses information from both labeled and unlabeled data in a semi-supervised fashion. A unified framework is then designed to self-update the models online for online scene categorization, as shown in Fig. 1.
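The bi-linear graph model is not specified in this article. As a rough illustration of how labeled and unlabeled samples can be fused on a graph, the sketch below uses a standard iterative label propagation scheme with clamped labels. It is a stand-in technique, not the authors' model. The similarity weights are assumed to be given (for example, derived from a learned metric), and all names are hypothetical.

```python
def propagate_labels(weights, labels, n_classes, n_iter=50):
    """Iterative label propagation with clamped labeled nodes.

    weights: symmetric n x n list of non-negative similarities.
    labels:  list of length n; a class index for labeled nodes, None otherwise.
    Returns the predicted class index for every node.
    """
    n = len(weights)
    # One-hot scores for labeled nodes, uniform scores for unlabeled nodes.
    scores = []
    for lab in labels:
        if lab is None:
            scores.append([1.0 / n_classes] * n_classes)
        else:
            scores.append([1.0 if c == lab else 0.0 for c in range(n_classes)])

    for _ in range(n_iter):
        new_scores = []
        for i in range(n):
            total = sum(weights[i][j] for j in range(n) if j != i)
            if labels[i] is not None or total == 0:
                new_scores.append(scores[i])  # clamp labels, skip isolated nodes
                continue
            # Weighted average of the neighbors' current class scores.
            new_scores.append([
                sum(weights[i][j] * scores[j][c] for j in range(n) if j != i) / total
                for c in range(n_classes)
            ])
        scores = new_scores

    return [max(range(n_classes), key=lambda c: s[c]) for s in scores]


# Nodes 0 and 3 are labeled; nodes 1 and 2 are unlabeled.
weights = [[0.0, 0.9, 0.1, 0.0],
           [0.9, 0.0, 0.2, 0.1],
           [0.1, 0.2, 0.0, 0.9],
           [0.0, 0.1, 0.9, 0.0]]
predicted = propagate_labels(weights, [0, None, None, 1], n_classes=2)
```

In this toy graph, node 1 is strongly tied to the labeled node 0 and node 2 to the labeled node 3, so each unlabeled node inherits the class of its closest labeled neighbor.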

Fig. 1. Illustration of the online metric learning procedure: the researchers first collect labeled data and train an initial model. Then, as video data arrive sequentially, features are extracted and online metric learning and label propagation are used to make a prediction. Confident samples are inserted into the training-set queue to update the model incrementally online. (Image provided by CONG Yang et al.)
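The loop described in the caption can be sketched as follows. This is a simplified illustration under stated assumptions: a nearest-centroid classifier stands in for the metric learning and label propagation steps, the confidence score and its threshold are invented for the example, and the scene labels are hypothetical.

```python
from collections import deque


def nearest_centroid(train, x):
    """Return (label, confidence) for x using class centroids of `train`."""
    sums, counts = {}, {}
    for feat, lab in train:
        acc = sums.setdefault(lab, [0.0] * len(feat))
        for i, v in enumerate(feat):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1

    dists = {}
    for lab, acc in sums.items():
        centroid = [v / counts[lab] for v in acc]
        dists[lab] = sum((a - b) ** 2 for a, b in zip(x, centroid)) ** 0.5

    ranked = sorted(dists, key=dists.get)
    best = ranked[0]
    if len(ranked) == 1:
        return best, 1.0
    runner_up = dists[ranked[1]]
    # Assumed confidence: 1 when x sits on the best centroid, 0 on a tie.
    conf = 1.0 - dists[best] / runner_up if runner_up > 0 else 0.0
    return best, conf


def self_training_stream(labeled, stream, threshold=0.5, max_queue=100):
    """Predict each incoming sample; keep only confident ones for training."""
    queue = deque(labeled, maxlen=max_queue)  # oldest samples drop out when full
    predictions = []
    for x in stream:
        label, conf = nearest_centroid(list(queue), x)
        predictions.append(label)
        if conf >= threshold:
            queue.append((x, label))  # self-labeled sample joins the training set
    return predictions, list(queue)


labeled = [([0.0, 0.0], "kitchen"), ([10.0, 10.0], "corridor")]
stream = [[1.0, 1.0], [9.0, 9.0], [5.0, 5.2]]
predictions, queue = self_training_stream(labeled, stream)
```

The first two samples lie close to one centroid and are added to the queue with their predicted labels. The third is nearly equidistant from both centroids, so it is classified but not used for updating. A bounded queue can eventually evict the initial labeled samples, which a real system might want to protect.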

As shown in Fig. 2 and Fig. 3, the researchers evaluate the model on online scene categorization. Experiments on various benchmark datasets and comparisons with state-of-the-art methods demonstrate the effectiveness and efficiency of the proposed algorithm.

Fig. 2. Comparison of accuracy between OMLLR and OASIS [2], [35] for home1-6. In each plot, the x-axis corresponds to the iteration steps (10 k for each) and the y-axis is the current accuracy; the accuracy of “Ours1,” “Ours2” and OASIS is denoted by the solid green line, dashed red line and dashed blue line, respectively. (Image provided by CONG Yang et al.)

Fig. 3. Comparison of the performance of OMLLR, OASIS, LMNN, MCML, LEGO and the Euclidean metric in feature space. Each curve shows the precision at top k as a function of the number of neighbors k. The results are averaged across 5 train/test partitions (40 training images, 25 test images); error bars show the standard error of the mean, and the black dashed line denotes chance performance. (a) 10 classes. (b) 20 classes. (c) 50 classes. (Image provided by CONG Yang et al.)
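For readers unfamiliar with the quantities plotted in Fig. 3, the sketch below shows one common way to compute precision at top k, the mean and standard error across partitions, and the chance level. The helper names and sample values are hypothetical, and the chance level of 1 / (number of classes) assumes balanced classes.

```python
import math
import statistics


def precision_at_k(ranked_labels, query_label, k):
    """Fraction of the k nearest neighbors that share the query's label."""
    top = ranked_labels[:k]
    if not top:
        return 0.0
    return sum(1 for lab in top if lab == query_label) / len(top)


def mean_and_sem(values):
    """Mean and standard error of the mean across train/test partitions."""
    mean = statistics.mean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


def chance_precision(n_classes):
    """Expected precision of a random ranking, assuming balanced classes."""
    return 1.0 / n_classes


# Labels of a query's neighbors, sorted from nearest to farthest.
ranked = ["a", "a", "b", "a", "c"]
p1 = precision_at_k(ranked, "a", 1)
p3 = precision_at_k(ranked, "a", 3)
p5 = precision_at_k(ranked, "a", 5)

# Hypothetical per-partition scores, only to show the aggregation step.
mean, sem = mean_and_sem([0.5, 0.6, 0.7])
```

Under the balanced-class assumption, the chance level would be 0.1, 0.05 and 0.02 for the 10-, 20- and 50-class panels respectively.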

This work was published in IEEE Transactions on Image Processing, vol. 22, no. 8, August 2013, pp. 3179-3191. It was partly supported by the Natural Science Foundation of China (61105013).