Improved Ground-Based Monocular Visual Odometry Estimation using Inertially-Aided Convolutional Neural Networks
AIR FORCE INSTITUTE OF TECHNOLOGY, WRIGHT-PATTERSON AFB, OH, United States
While Convolutional Neural Networks (CNNs) can estimate frame-to-frame (F2F) motion even from monocular images, additional inputs can improve Visual Odometry (VO) predictions. In this thesis, a FlowNetS-based [1] CNN architecture estimates VO using sequential images from the KITTI Odometry dataset [2]. For each of three output types (full six degrees of freedom (6-DoF), Cartesian translation, and translational scale), a baseline network with only image-pair input is compared with a nearly identical architecture that is also given an additional rotation estimate, such as one from an Inertial Navigation System (INS). The inertially-aided networks show an order-of-magnitude improvement over the baseline when predicting rotation, but the aided rotation predictions are still worse than the input rotations. Translation predictions are not necessarily helped either. A full-trajectory analysis gives similar results. The INS-aided neural networks are also tested for sensitivity to angular random walk (ARW) and bias errors in the sensor measurements.
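The ARW and bias sensitivity test mentioned above can be illustrated with a minimal sketch of how frame-to-frame rotation inputs might be corrupted before being fed to an aided network. This is not the thesis's error model, which the abstract does not specify. It assumes a small-angle, additive model on per-frame Euler-angle increments, and the function name `corrupt_rotation_increments` and all of its parameters are hypothetical.

```python
import numpy as np


def corrupt_rotation_increments(true_increments_rad, dt_s,
                                arw_rad_per_sqrt_s, bias_rad_per_s, rng):
    """Add ARW and constant-bias errors to per-frame rotation increments.

    Assumption (not from the source): errors are additive on small
    Euler-angle increments, one increment per image pair.
    """
    true_increments_rad = np.asarray(true_increments_rad, dtype=float)
    # White rate noise integrated over dt_s yields an angle error whose
    # standard deviation is ARW * sqrt(dt_s).
    arw_error = rng.normal(0.0, arw_rad_per_sqrt_s * np.sqrt(dt_s),
                           size=true_increments_rad.shape)
    # A constant rate bias integrates to bias * dt_s on every frame.
    bias_error = np.asarray(bias_rad_per_s, dtype=float) * dt_s
    return true_increments_rad + arw_error + bias_error


# Hypothetical usage: 100 frame pairs, 3 rotation axes, 0.1 s between frames.
rng = np.random.default_rng(seed=0)
clean = np.zeros((100, 3))
noisy = corrupt_rotation_increments(clean, dt_s=0.1,
                                    arw_rad_per_sqrt_s=1e-3,
                                    bias_rad_per_s=1e-4, rng=rng)
```

Sweeping `arw_rad_per_sqrt_s` and `bias_rad_per_s` over a range of values and re-evaluating a trained network on the corrupted inputs is one way such a sensitivity study could be organized. The numeric values here are placeholders rather than sensor specifications.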