Ultrafast Tracking with Hardware ROI on CMOS Sensor
Manipulating the readout of a CMOS image sensor in real time to enable tracking at 2,500 positions per second.
When tracking objects in video with low latency and limited computational horsepower, a common technique is to search for the next position only in the region near the current tracked position. Assuming the object does not jump drastically between consecutive frames, there is no need to search the rest of the image for the new location. This region is called the region of interest (ROI) or the windowed region.
This windowing is typically done in software on frame data produced by the image sensor. Here, instead, the windowing was done in hardware: only a specific subsection of the CMOS image sensor's pixel rows was integrated and read. The hardware used was a LUPA0300 CMOS image sensor (ON Semi, NOIL1SM0300A) connected through a custom shield to a BeagleBone Black. Note that all components are off-the-shelf (the custom shield had only passives for signal integrity and power conditioning). The chip does differ from typical CMOS sensors in two ways: it has a global shutter rather than the more traditional rolling shutter, and it exposes hardware registers for programmatically modifying the ROI. The global shutter helps avoid the tearing that would occur when the target object is moving at high speed. The sensor's capture sequence was controlled entirely manually using the BeagleBone Black's two built-in Programmable Real-Time Unit (PRU) 32-bit microcontrollers, programmed in PRU assembly. One served as a clock, while the other handled frame grabs. This level of manual control gave us the ability to dynamically modify the x and y start positions and the number of columns and rows to be read. A higher-level program, written in C, handled image processing and returned the updated ROI parameters to the assembly program running the image sensor hardware control.
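The ROI update step described above might look roughly like the following on the C side. This is a minimal sketch, not the project's code: the 640 x 480 array size is an assumption about the LUPA0300, the `roi` struct and `roi_center_on` are hypothetical names, and actually writing the values into the sensor's registers (handled by the PRU program in the original system) is left out. Any alignment or granularity constraints the sensor may place on the start positions are not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sensor dimensions (640 x 480); adjust for the actual part. */
enum { SENSOR_COLS = 640, SENSOR_ROWS = 480 };

/* Hypothetical ROI descriptor; field names are illustrative,
 * not the sensor's register names. */
struct roi {
    uint16_t x_start;
    uint16_t y_start;
    uint16_t cols;
    uint16_t rows;
};

/* Clamp v into the inclusive range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* Build a cols x rows window centered on the tracked position (cx, cy),
 * given in full-sensor coordinates. Near the edges the window is shifted
 * inward so that it always lies entirely on the pixel array. */
static struct roi roi_center_on(int cx, int cy, int cols, int rows)
{
    assert(cols > 0 && cols <= SENSOR_COLS);
    assert(rows > 0 && rows <= SENSOR_ROWS);

    struct roi r;
    r.x_start = (uint16_t)clamp_int(cx - cols / 2, 0, SENSOR_COLS - cols);
    r.y_start = (uint16_t)clamp_int(cy - rows / 2, 0, SENSOR_ROWS - rows);
    r.cols = (uint16_t)cols;
    r.rows = (uint16_t)rows;
    return r;
}
```

Calling `roi_center_on(x, y, 64, 64)` after each frame, with `(x, y)` being the newly found position, keeps the window following the target while never requesting pixels outside the array.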
A lens was placed on the sensor to bring items approximately a meter away into focus. The system was able to robustly track a bright light pointed at the lens and sensor at up to 2500 positions per second using an ROI of 64 x 64 pixels. The bright spot locator simply identified the pixel with the greatest intensity within the image. The hope was to track objects in the frame rather than just a bright spot, but it quickly became clear that the image processing running on the BeagleBone’s processor was the bottleneck. Anything more complex than finding a max value caused the frame rate to plummet. While 2500 positions per second of bright spot tracking is quite good for off-the-shelf hardware (multi-thousand dollar IR camera motion capture systems operate at 120 Hz), this hardware ROI technique could yield much higher tracking speeds when paired with more powerful processing hardware.
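For illustration, a bright spot locator of the kind described above can be as simple as a single pass over the ROI buffer. The sketch below is an assumption-laden example rather than the original code: it assumes the ROI arrives as a row-major buffer of 8-bit intensities, and the `peak` struct and `find_brightest` function are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: ROI-local coordinates of the brightest
 * pixel and its intensity. */
struct peak {
    int x;
    int y;
    uint8_t value;
};

/* Scan a row-major buffer of cols x rows 8-bit pixels (assumed format)
 * and return the first pixel that holds the maximum intensity. */
static struct peak find_brightest(const uint8_t *pixels, int cols, int rows)
{
    assert(pixels != NULL && cols > 0 && rows > 0);

    struct peak best = { 0, 0, pixels[0] };
    for (int y = 0; y < rows; ++y) {
        const uint8_t *row = pixels + (size_t)y * (size_t)cols;
        for (int x = 0; x < cols; ++x) {
            if (row[x] > best.value) {
                best.x = x;
                best.y = y;
                best.value = row[x];
            }
        }
    }
    return best;
}
```

Adding the ROI's x and y start positions to the returned coordinates gives the position on the full sensor. At 2500 positions per second each frame has a budget of 400 microseconds, and a 64 x 64 ROI holds 4096 pixels, which leaves under 100 nanoseconds per pixel for readout and processing combined.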
Work conducted at Future Interfaces Group at Carnegie Mellon University with Robert Xiao
© Ishan Chatterjee 2020