Accession Number:



Dataset Curation through Renders and Ontology Matching

Descriptive Note:

Doctoral thesis

Corporate Author:


Personal Author(s):

Report Date:


Pagination or Media Count:



In this thesis we demonstrate the benefits of automated labeled dataset creation for fine-grained visual learning tasks. Specifically, we show that utilizing real-world, non-image information can significantly reduce the human effort needed to build large-scale datasets. Computer vision has seen great advances in recent years in a number of complex tasks, such as scene classification, object detection, and image segmentation. A key ingredient in such success stories is the use of large amounts of labeled data. In many cases, the limiting factor is the ability to create these training sets. Issues arise in three forms: (1) the act of labeling the data can be hard for human annotators, (2) in some cases it is hard to get a representative sample of the feature space, and (3) data for infrequent yet potentially important instances can be completely absent from the training set. Business storefront classification is an example of (1). The number of possible labels is large, and assigning all relevant labels to an image is a time-consuming task for annotators. Moreover, when the image contains a business from a country other than their own, annotators can be confused by the foreign language and produce erroneous labels. Annotators are also inconsistent in how they assign businesses to categories. In vehicle viewpoint estimation, the images themselves are hard to come by. Getting sample images of all viewpoints is hard due to bias in the way people photograph cars, and current datasets for this task lack data for many viewpoints. In addition, the labeling task is hard for the annotators. We address these issues by adding automation to the dataset creation process. Our approach is to utilize external information by matching the images to real-world concepts. In the case of businesses, when images are mapped to an ontology of geographical entities, we are able to extract multiple relevant labels per image.

Subject Categories:

  • Information Science
  • Cybernetics

Distribution Statement: