Datasets and Other Project Artifacts
Ground Truth Land Use Map
This is the ground truth land use map used for evaluation in our 2019 IEEE Transactions on Multimedia paper "Fine-grained land use classification at the city scale using ground-level images." ([pdf]) That work investigates mapping land use through the automated analysis of large collections of ground-level images. Since no ground truth is available at the fine level of classes we want, we create our own using points of interest (POIs) indexed by Google Places. The map is therefore in some ways a surrogate ground truth since it is not created manually. There are 33 land use classes arranged in a three-level hierarchy.
There are three files associated with this dataset:
- readme.txt: Contains details about the dataset including the file formats.
- LandUseClasses.txt: Lists the land use classes.
- LandUseFiles.zip: Contains the .shp and .dbf shapefile components with the class labels. Please see the readme.txt file for more details.
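To give a sense of how the dataset can be read, here is a minimal loading sketch using geopandas. The shapefile name inside LandUseFiles.zip and the attribute column holding the class labels are assumptions for illustration; please consult readme.txt for the actual file and field names.

```python
# Minimal sketch of loading the land use polygons with geopandas.
import geopandas as gpd

# NOTE: "landuse.shp" is a placeholder; use the actual .shp file extracted
# from LandUseFiles.zip (the matching .dbf attributes are read alongside it).
gdf = gpd.read_file("LandUseFiles/landuse.shp")

print(gdf.columns)   # inspect the attribute fields described in readme.txt
print(gdf.head())    # first few polygons with their land use labels

# Hypothetical: once the label column is identified from readme.txt
# (called "class" here only for illustration), count polygons per class.
# print(gdf["class"].value_counts())
```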
Please cite our 2019 IEEE Transactions on Multimedia paper if you use this dataset:
Y. Zhu, X. Deng, and S. Newsam, "Fine-grained land use classification at the city scale using ground-level images,"
IEEE Transactions on Multimedia, 21(9), pp. 1825-1838, 2019.
Activity Recognition Models
This project investigates georeferenced videos for geographic knowledge discovery. For example, our 2017 ACM SIGSPATIAL paper "Large-scale mapping of human activity using geo-tagged videos" ([pdf]) performs activity detection in a large collection of georeferenced videos and then maps the results. We have developed several novel action recognition models in the context of this problem. Below, we provide two trained models as either Caffe or PyTorch implementations. They are described in the context of the papers that introduced them.
- Model/paper name: Guided Optical Flow Learning
- Model download: code
- Paper: Y. Zhu, Z. Lan, S. Newsam, and A. Hauptmann, "Guided optical flow learning,"
IEEE Conference on Computer Vision and Pattern Recognition (CVPR): Workshop on Brave New Motion Representations (BNMR), 2017.
[pdf]
- Description: This paper aims to learn optical flow from video in a semi-supervised manner. By combining the benefits of supervised training on labeled data and unsupervised learning on unlabeled data, we achieve lower end-point error than other unsupervised methods. The improved motion estimates are also shown to benefit human action classification. A minimal sketch of such a combined loss follows this entry.
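As a rough illustration of the semi-supervised idea behind Guided Optical Flow Learning, the sketch below combines a supervised end-point-error term on labeled frame pairs with an unsupervised photometric reconstruction term on unlabeled pairs. This is not the paper's exact formulation; the warping helper, loss weighting, and function names are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def warp(img, flow):
    """Backward-warp img (B,C,H,W) by flow (B,2,H,W) using grid_sample."""
    b, _, h, w = img.shape
    ys, xs = torch.meshgrid(
        torch.arange(h, device=img.device, dtype=img.dtype),
        torch.arange(w, device=img.device, dtype=img.dtype),
        indexing="ij",
    )
    grid_x = xs + flow[:, 0]                      # displaced x coordinates
    grid_y = ys + flow[:, 1]                      # displaced y coordinates
    grid = torch.stack(
        (2.0 * grid_x / (w - 1) - 1.0,            # normalize to [-1, 1]
         2.0 * grid_y / (h - 1) - 1.0), dim=-1)   # shape (B, H, W, 2)
    return F.grid_sample(img, grid, align_corners=True)

def semi_supervised_flow_loss(pred_flow, frame1, frame2, gt_flow=None, lam=1.0):
    # Unsupervised photometric term: warping frame2 by the predicted flow
    # should reconstruct frame1 (usable on unlabeled video).
    photometric = (frame1 - warp(frame2, pred_flow)).abs().mean()
    loss = photometric
    if gt_flow is not None:
        # Supervised end-point-error term, applied only when labels exist.
        epe = torch.norm(pred_flow - gt_flow, p=2, dim=1).mean()
        loss = loss + lam * epe
    return loss
```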
- Model/paper name: Towards Universal Representation for Unseen Action Recognition
- Model download: code
- Paper: Y. Zhu, Y. Long, Y. Guan, S. Newsam, and L. Shao, "Towards universal representation for unseen action recognition,"
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
[pdf]
- Description: This paper aims to recognize unseen actions without training examples. Due to the ambiguity often present in class definitions and the large cost of labeling data, we propose a universal representation learning algorithm to improve domain adaptation. We demonstrate that our method can recognize unseen actions without additional fine-tuning.
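To illustrate the general zero-shot setting this paper addresses, a video feature can be projected into a shared semantic space (e.g. word-vector embeddings of class names) and matched to unseen classes by similarity. This is a generic sketch, not the paper's universal representation method; the linear projection, feature dimensions, and placeholder data below are all hypothetical.

```python
import torch
import torch.nn.functional as F

def zero_shot_classify(video_feat, class_embeddings, projection):
    """Project a video feature into the semantic space and return the index
    of the unseen class whose embedding is closest by cosine similarity."""
    z = projection(video_feat)                                   # semantic-space embedding
    sims = F.cosine_similarity(z.unsqueeze(0), class_embeddings, dim=1)
    return sims.argmax().item()

# Hypothetical usage with placeholder dimensions:
projection = torch.nn.Linear(2048, 300)        # video feature -> word-vector space
video_feat = torch.randn(2048)                 # e.g. pooled CNN feature of a clip
class_embeddings = torch.randn(10, 300)        # embeddings of 10 unseen class names
pred = zero_shot_classify(video_feat, class_embeddings, projection)
```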