PaperSummary19: CornerNet

Poonam Saini
Jan 19, 2025


The paper introduces CornerNet, an object detection framework that was novel when it was published in 2018. It detects each object as a pair of keypoints, the top-left and bottom-right corners of its bounding box, using a single convolutional neural network (ConvNet). Unlike traditional approaches, it eliminates the need for anchor boxes, a common but computationally expensive component of object detectors, by formulating detection as a keypoint detection problem. CornerNet also proposes a new pooling method, corner pooling, which helps the network localize corners by incorporating contextual information.

The methodology is as follows:

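  1. Keypoint Detection: The network predicts two sets of heatmaps, one for top-left corners and one for bottom-right corners, with one channel per object category. Following the associative embedding approach, it also predicts an embedding vector for each detected corner. Training minimizes the distance between the embeddings of the two corners of the same object, so corners can be grouped by embedding distance.
  2. Corner Pooling: A pooling layer designed to capture corner-specific features. A corner often lies outside the object itself, so there is little local evidence for it. For a top-left corner, the layer looks horizontally to the right for the topmost boundary of the object and vertically downward for the leftmost boundary, max-pools along both directions, and adds the two results.
  3. Network Architecture: An hourglass network serves as the backbone, followed by prediction modules that output heatmaps, embeddings and offsets. The offsets compensate for the precision lost when corner locations are mapped from the input image to the lower-resolution heatmaps. The network is trained with a combined loss: a focal loss variant for the heatmaps, an offset regression loss, and two embedding losses (a pull loss and a push loss).
  4. Post-Processing: Non-maximum suppression (NMS) is applied to the corner heatmaps to keep only the strongest corner detections. The remaining top-left and bottom-right corners are then paired by embedding distance, and each pair yields a bounding box.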
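
To make the corner pooling step concrete, here is a minimal NumPy sketch of top-left corner pooling on single-channel feature maps. The function name `top_left_corner_pool` and the toy inputs are illustrative assumptions, not code from the paper. The real layer operates on multi-channel feature tensors inside the network.

```python
import numpy as np

def top_left_corner_pool(f_t, f_l):
    """Sketch of top-left corner pooling on two H x W feature maps.

    f_t and f_l are hypothetical inputs standing in for the two feature
    maps that feed the pooling layer.
    """
    f_t = np.asarray(f_t, dtype=float)
    f_l = np.asarray(f_l, dtype=float)
    # Look downward: at each location, take the max over all values at
    # or below it in the same column (running max from the bottom row up).
    top = np.maximum.accumulate(f_t[::-1, :], axis=0)[::-1, :]
    # Look rightward: at each location, take the max over all values at
    # or to the right of it in the same row (running max from the right).
    left = np.maximum.accumulate(f_l[:, ::-1], axis=1)[:, ::-1]
    # The two pooled maps are added to form the corner feature.
    return top + left

# Toy example: evidence for a top edge in column 1, a left edge in row 1.
f_t = [[0, 0, 0], [0, 0, 0], [0, 5, 0]]
f_l = [[0, 0, 0], [0, 0, 4], [0, 0, 0]]
pooled = top_left_corner_pool(f_t, f_l)
```

In this toy input the pooled map peaks at row 1, column 1, the only location that sees both the vertical and the horizontal evidence. This is the intuition behind using the pooled features to localize a corner.
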
Image src: https://arxiv.org/pdf/1808.01244
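
The pull and push embedding losses can be sketched in a few lines of NumPy. This is a simplified illustration that assumes one-dimensional embeddings and a margin of 1. The function name `pull_push_loss` and the argument layout (one embedding per corner, aligned by object index) are assumptions made for the example.

```python
import numpy as np

def pull_push_loss(e_tl, e_br, delta=1.0):
    """Sketch of the pull and push losses for N ground-truth objects.

    e_tl[k] and e_br[k] are the 1-D embeddings of the top-left and
    bottom-right corners of object k (hypothetical inputs).
    """
    e_tl = np.asarray(e_tl, dtype=float)
    e_br = np.asarray(e_br, dtype=float)
    n = len(e_tl)
    # Per-object mean of the two corner embeddings.
    mean = (e_tl + e_br) / 2.0
    # Pull: draw both corners of an object toward their shared mean.
    pull = float(np.mean((e_tl - mean) ** 2 + (e_br - mean) ** 2))
    if n < 2:
        return pull, 0.0  # nothing to push apart with a single object
    # Push: penalize pairs of different objects whose means are closer
    # than the margin delta (hinge on the pairwise distance).
    gap = np.abs(mean[:, None] - mean[None, :])
    hinge = np.maximum(0.0, delta - gap)
    np.fill_diagonal(hinge, 0.0)  # ignore each object paired with itself
    push = float(hinge.sum() / (n * (n - 1)))
    return pull, push

# Two objects whose corner embeddings agree but sit only 0.5 apart.
pull, push = pull_push_loss([0.0, 0.5], [0.0, 0.5])
```

Here the pull term is zero because the two corners of each object already share an embedding, while the push term is positive because the two objects are closer than the margin.
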

On the MS COCO dataset, CornerNet achieves an Average Precision (AP) of 42.2%, outperforming the one-stage detectors available at the time and making it competitive with two-stage detectors.

Written by Poonam Saini

PhD Student, Research Associate @ Ulm University
