End-to-End People Detection in Crowded Scenes

Stewart, Russell J.; Andriluka, Mykhaylo; Ng, Andrew Y.

doi:10.1109/cvpr.2016.255

Article · Jun 1, 2016 · Closed access

Russell J. Stewart · Mykhaylo Andriluka · Andrew Y. Ng

Stanford University · Max Planck Institute for Informatics

Indexed in Crossref

Abstract

Current people detectors operate either by scanning an image in a sliding-window fashion or by classifying a discrete set of proposals. We propose a model that is based on decoding an image into a set of people detections. Our system takes an image as input and directly outputs a set of distinct detection hypotheses. Because we generate predictions jointly, common post-processing steps such as non-maximum suppression are unnecessary. We use a recurrent LSTM layer for sequence generation and train our model end-to-end with a new loss function that operates on sets of detections. We demonstrate the effectiveness of our approach on the challenging task of detecting people in crowded scenes.
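
The abstract does not spell out how a loss "on sets of detections" is computed. As a rough illustration only, the sketch below assumes one plausible form: predictions and ground-truth boxes are paired one-to-one so that the total L1 distance is minimal, and every unpaired box costs a fixed penalty. The names `l1_distance`, `set_loss`, and `miss_penalty` are hypothetical, and the brute-force search over permutations is a stand-in chosen for brevity, not the paper's procedure.

```python
from itertools import permutations


def l1_distance(box_a, box_b):
    """Sum of absolute coordinate differences between two (x1, y1, x2, y2) boxes."""
    return sum(abs(a - b) for a, b in zip(box_a, box_b))


def set_loss(predicted, ground_truth, miss_penalty=1.0):
    """Order-invariant loss between two sets of boxes (illustrative assumption).

    Finds the cheapest one-to-one pairing between the two sets and adds
    `miss_penalty` for each box left unpaired. Brute force over permutations,
    so it is only practical for a handful of boxes.
    """
    # Treat the shorter list as the one whose elements must all be paired.
    small, large = sorted((predicted, ground_truth), key=len)
    unmatched = len(large) - len(small)
    # Try every way of assigning distinct boxes in `large` to boxes in `small`.
    best = min(
        sum(l1_distance(s, large[i]) for s, i in zip(small, perm))
        for perm in permutations(range(len(large)), len(small))
    )
    return best + miss_penalty * unmatched


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(21, 20, 30, 30), (0, 0, 10, 11)]  # same people, different order

print(set_loss(predicted, ground_truth))
print(set_loss(list(reversed(predicted)), ground_truth))
```

Because the pairing is chosen by cost rather than by position, reordering the predictions leaves the loss unchanged, which is the property a loss over sets needs. For realistic set sizes, a polynomial-time assignment solver would replace the permutation search.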

Citation impact

Total citations: 546

FWCI: 29.23
Percentile: 100%
References: 42

Authors: 3

Topics & keywords

Keywords

Computer science
End-to-end principle
Decoding methods
Set (abstract data type)
Artificial intelligence
Detector
Task (project management)
Image (mathematics)
