Crowd and vehicle density estimation: Towards perspective-free object counting with deep learning

Towards perspective-free object counting with deep learning
ECCV 2016
https://github.com/gramuah/ccnn

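This paper addresses crowd and vehicle density estimation and makes two main contributions: 1) a novel convolutional neural network, the Counting CNN (CCNN), which regresses image patches to density maps; 2) a scale-aware counting model, the Hydra CNN, which learns a multiscale non-linear regression model.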

Here, crowd and vehicle density estimation is cast as a regression problem.

3 Deep learning to count objects
3.1 Counting objects model
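The ground truth density map D is obtained by convolving the annotated person and vehicle positions with a Gaussian kernel. Integrating the density map then gives the total number of people or vehicles in the image.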
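To make the relation between the density map and the count concrete, here is a minimal NumPy sketch. The function names, the kernel size (15) and sigma (3.0) are illustrative assumptions, not values taken from the paper or its repository. Each annotated position contributes one unit-mass Gaussian, so summing the map (the discrete version of integrating it) recovers the number of objects.

```python
import numpy as np

def gaussian_kernel(size=15, sigma=3.0):
    """Return a size x size Gaussian kernel normalized to sum to 1 (size must be odd)."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()

def density_map(points, height, width, size=15, sigma=3.0):
    """Build a density map by placing one Gaussian at each (row, col) annotation.

    This is equivalent to convolving a map of unit impulses with the kernel.
    Hypothetical helper for illustration only.
    """
    kernel = gaussian_kernel(size, sigma)
    r = size // 2
    # Work on a padded canvas so kernels near the border fit; the padding is
    # cropped at the end, so objects very close to the border lose some mass.
    padded = np.zeros((height + 2 * r, width + 2 * r))
    for row, col in points:
        padded[row:row + size, col:col + size] += kernel
    return padded[r:r + height, r:r + width]

# Three annotated objects, all far enough from the image border.
points = [(20, 30), (40, 40), (50, 10)]
D = density_map(points, height=64, width=64)
count = D.sum()  # discrete "integral" of the density map
```

With the three annotations above, `count` comes out as 3 up to floating-point error, because every kernel lies fully inside the image.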

3.2 The Counting CNN

The network uses two max-pooling layers. The input is 72×72 and the output density map is 18×18, i.e. 1/4 of the input size along each dimension.
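The size reduction can be checked with a small sketch. It assumes each max-pooling layer uses a 2×2 window with stride 2 (an assumption, but the usual setting and the one consistent with 72 → 18) and ignores the convolution layers, which are taken to preserve spatial size.

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max-pooling with stride 2 on a 2-D array with even height and width."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

patch = np.random.rand(72, 72)           # stand-in for a 72x72 feature map
out = max_pool_2x2(max_pool_2x2(patch))  # two pooling stages: 72 -> 36 -> 18
print(out.shape)
```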

Given a test image, we first densely extract image patches
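Given a test image, we extract many overlapping patches from it, estimate a density map for each patch, and then assemble these patch density maps into the density map of the full image.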
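The test-time procedure can be sketched as follows. Several details here are assumptions made for illustration: the stride of 10 pixels, the hypothetical callback `predict_patch` (standing in for the trained network, and assumed to return a density map already resized to the patch size), and the choice to average overlapping predictions by the number of patches covering each pixel.

```python
import numpy as np

def predict_full_density(image, predict_patch, patch=72, stride=10):
    """Slide a window over the image, predict a density map for each patch,
    and average the overlapping predictions into one full-image density map."""
    h, w = image.shape[:2]
    acc = np.zeros((h, w))   # sum of all patch predictions per pixel
    hits = np.zeros((h, w))  # number of patches covering each pixel
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            d = predict_patch(image[top:top + patch, left:left + patch])
            acc[top:top + patch, left:left + patch] += d
            hits[top:top + patch, left:left + patch] += 1
    # Pixels never covered by a patch keep density 0.
    return acc / np.maximum(hits, 1)

# Sanity check with a dummy "network" that returns its input unchanged:
# averaging identical overlapping predictions must reproduce the input.
image = np.random.rand(92, 92)
full = predict_full_density(image, predict_patch=lambda p: p)
```

Without the per-pixel normalization, pixels covered by many patches would be counted several times and the total count would be inflated.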

3.3 The Hydra CNN
Regression-based counting models usually require a geometric correction of the input features, using an annotated perspective map of the scene.
Why is this correction needed? Mainly because of perspective distortion:
Technically, the perspective distortion exhibited by an image, causes that features extracted from the same object but at different scene depths would have
a huge difference in values. As a consequence, erroneous results are expected by models which uses a single regression function

To address this, the paper proposes a network that combines CNNs operating at multiple scales.
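One way to give such a network several scales of the same patch is to build a small pyramid of centered crops and resize each level to a common input size. The sketch below is only an illustration of that idea: the crop fractions (100%, 66%, 33%), the nearest-neighbor resizing, and both function names are assumptions, not details confirmed by the summary above.

```python
import numpy as np

def resize_nearest(x, out_size):
    """Resize a 2-D array to out_size x out_size with nearest-neighbor sampling."""
    h, w = x.shape[:2]
    rows = np.arange(out_size) * h // out_size
    cols = np.arange(out_size) * w // out_size
    return x[rows][:, cols]

def scale_pyramid(patch, fractions=(1.0, 0.66, 0.33), out_size=72):
    """Return one centered crop per fraction, each resized to out_size x out_size.

    Each level would feed one branch of a multi-scale network (hypothetical setup).
    """
    h, w = patch.shape[:2]
    levels = []
    for f in fractions:
        ch = max(1, int(round(h * f)))
        cw = max(1, int(round(w * f)))
        top, left = (h - ch) // 2, (w - cw) // 2
        crop = patch[top:top + ch, left:left + cw]
        levels.append(resize_nearest(crop, out_size))
    return levels

patch = np.arange(72 * 72, dtype=float).reshape(72, 72)
levels = scale_pyramid(patch)
```

Every level has the same shape, so each branch can share the same architecture while seeing the objects at a different apparent size.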