Hyperparameter Search on FasterRCNN

Object detection is a common computer vision problem: the goal is to locate and classify instances of given object classes in an image. FasterRCNN is a popular deep learning architecture for object detection. It takes a two-step approach, first proposing bounding boxes that may contain objects and then classifying each proposed box. Today, we’ll walk through how to train FasterRCNN to perform object detection using Determined and PyTorch.

[Figure: Segmentation example]

Getting Started Locally

For this example, we’ll be training FasterRCNN on the Penn-Fudan Database for Pedestrian Detection and Segmentation. Thanks to the PyTorch FasterRCNN tutorial, it’s easy to get started. We will adapt code from this tutorial to run on Determined so that we can easily scale up training, run a hyperparameter search, and achieve a better final validation IOU (intersection over union).
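Since validation IOU is the metric we optimize throughout this post, here is a minimal sketch of how IOU between two bounding boxes can be computed. The function name `box_iou` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; the example code's `val_avg_iou` metric may be computed differently (for instance, averaged over matched boxes in each image).

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). This layout is an
    assumption for illustration, not taken from the example code.
    """
    # Corners of the overlapping rectangle (if the boxes overlap at all).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes get an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two 2x2 boxes that overlap in a 1x1 square: IOU = 1 / (4 + 4 - 1).
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

An IOU of 1.0 means the predicted box matches the ground truth exactly, and 0.0 means the boxes do not overlap, which is why the searcher configurations below set smaller_is_better to false.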

Do More With Determined

We’ve already organized the tutorial code to use the Determined PyTorch Trial interface. By organizing the model this way, we can use Determined to track our experiments, scale to distributed training, and do hyperparameter tuning. To get started, you’ll need to install Determined and configure the Determined CLI. The code for this example can be found here.

To run this example, first install Determined either locally or on the cloud. Since we will be running a hyperparameter search consisting of many training runs, we recommend running on the cloud.

Once you have Determined installed, you can train FasterRCNN and track the progress of training with:

det experiment create const.yaml .

The configuration of this experiment is defined in const.yaml:

description: fasterrcnn_coco_pytorch_const
data:
  url: https://determined-ai-public-datasets.s3-us-west-2.amazonaws.com/PennFudanPed/PennFudanPed.zip
hyperparameters:
  learning_rate: 0.005
  momentum: 0.9
  weight_decay: 0.0005
  global_batch_size: 2
searcher:
  name: single
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 800
entrypoint: model_def:ObjectDetectionTrial

For full documentation about how to configure experiments, check out the Determined experiment configuration documentation. Today, we will modify this configuration to run a hyperparameter search.

In our new configuration, called adaptive.yaml, we will add sweeps of the learning_rate and momentum hyperparameters:

hyperparameters:
  learning_rate:
    type: double
    minval: 0.0001
    maxval: 0.001
  momentum:
    type: double
    minval: 0.2
    maxval: 1.0
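To make the ranges above concrete, here is a rough sketch of what drawing one trial's hyperparameters from them could look like. It assumes each double range is sampled uniformly between minval and maxval; the names `SEARCH_SPACE`, `FIXED`, and `sample_hyperparameters` are hypothetical and are not part of Determined, which performs its own sampling inside the searcher.

```python
import random

# Ranges copied from the hyperparameters section above.
SEARCH_SPACE = {
    "learning_rate": (0.0001, 0.001),
    "momentum": (0.2, 1.0),
}

# Values we keep constant across all trials.
FIXED = {"weight_decay": 0.0005, "global_batch_size": 2}


def sample_hyperparameters(rng):
    """Draw one configuration, assuming uniform sampling within each range."""
    hparams = dict(FIXED)
    for name, (minval, maxval) in SEARCH_SPACE.items():
        hparams[name] = rng.uniform(minval, maxval)
    return hparams


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the draws are repeatable
    for _ in range(3):
        print(sample_hyperparameters(rng))
```

Each trial in the search trains the same model with one such configuration, so widening or narrowing minval and maxval directly changes which part of the space the search explores.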

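We will then configure the searcher with the search algorithm name, the optimization metric, and the size of the hyperparameter search. We will use the adaptive_asha searcher, which is based on the state-of-the-art ASHA (Asynchronous Successive Halving Algorithm) method: it stops poorly performing trials early and spends more of the training budget on promising ones. We’ll start with a small experiment of 30 trials, each trained for at most 8 batches.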

searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 30
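For intuition about how an early-stopping search spends its budget, here is a simplified, synchronous successive-halving sketch. It is not Determined's adaptive_asha implementation (ASHA promotes trials asynchronously, and Determined manages rungs and budgets itself); `successive_halving`, `toy_score`, and the `eta` reduction factor are hypothetical names chosen for illustration, and the scoring function is a stand-in for actually training a model.

```python
def successive_halving(configs, evaluate, min_budget=1, max_budget=8, eta=2):
    """Return the best config, keeping the top 1/eta of configs at each round.

    evaluate(config, budget) returns a score where larger is better,
    mirroring smaller_is_better: false in the searcher config.
    """
    survivors = list(configs)
    budget = min_budget
    while True:
        # Score every surviving config at the current training budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        if budget >= max_budget or len(scored) == 1:
            return scored[0]
        # Stop the weakest configs and give the rest a larger budget.
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget = min(budget * eta, max_budget)


def toy_score(config, budget):
    """Hypothetical stand-in for validation IOU after `budget` batches."""
    # Pretend the ideal learning rate is 0.0005 and that more training helps.
    return 0.01 * budget - abs(config["learning_rate"] - 0.0005)


if __name__ == "__main__":
    candidates = [{"learning_rate": lr} for lr in (0.0001, 0.0004, 0.0009, 0.00055)]
    print(successive_halving(candidates, toy_score))
```

In this sketch, max_budget plays a role similar to max_length (8 batches) and the number of candidates plays a role similar to max_trials; most candidates only ever receive a fraction of the full budget.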

The final configuration looks like:

description: fasterrcnn_coco_pytorch_adaptive_search
data:
  url: https://determined-ai-public-datasets.s3-us-west-2.amazonaws.com/PennFudanPed/PennFudanPed.zip
hyperparameters:
  learning_rate:
    type: double
    minval: 0.0001
    maxval: 0.001
  momentum:
    type: double
    minval: 0.2
    maxval: 1.0
  weight_decay: 0.0005
  global_batch_size: 2
searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 30
entrypoint: model_def:ObjectDetectionTrial

This can be run from the command line:

det experiment create adaptive.yaml .

When training has completed, your model should obtain a validation IOU score of ~52.

[Figure: Faster R-CNN results]

Next, to further improve the IOU score, we’ll increase the size of the hyperparameter search to run for 300 trials.

searcher:
  name: adaptive_asha
  metric: val_avg_iou
  smaller_is_better: false
  max_length:
    batches: 8
  max_trials: 300

Determined automatically parallelizes our hyperparameter search across multiple machines. Our Determined cluster is configured to spin up as many as 40 agents during training, so even though we’re running 300 trials, training only takes minutes.

When training is complete, your model should obtain a validation IOU score of ~67.

[Figure: Faster R-CNN better results]

We encourage you to give Determined a spin by trying this example or any of the others available in the Determined repository. If you have any questions along the way, hop on our community Slack or reach out on GitHub. We’d love to help!