University of Tasmania

File(s) under permanent embargo

Deep learning for 6D pose estimation of objects — A case study for autonomous driving

journal contribution
posted on 2023-05-21, 17:12 authored by Sabera Hoque, Shuxiang Xu, Ananda Maiti, Yuchen Wei, Arafat, MY
The potential benefits and implementation of autonomous driving have attracted widespread attention from both industry and academia. This study addresses view-invariant object detection and semantic key-point pose estimation from a single RGB image. Estimating the absolute pose of an on-road vehicle from monocular vision alone, without the help of additional sensors, is a complex machine learning task. The main purpose of this work is to identify other vehicles on the road and estimate their angular position from a single image with improved accuracy. The study focuses on a new algorithm that applies a deep convolutional neural network followed by a recurrent neural structure for more accurate 6D pose inference. It presents a 6D pose hypothesis for individual vehicles, based on a deep hybrid, end-to-end architecture consisting of a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN). The work uses ApolloCar3D, a large-scale dataset for 3D car instance understanding. The dataset contains 5,277 real-life street scenes with about 60K car instances. By comparison, ApolloCar3D is twenty times larger than the PASCAL3D+ and KITTI datasets. Ultimately, the idea is to efficiently eliminate motionless cars and predict the next pose given the speed context, allowing a comprehensive evaluation, by passing the output through an LSTM (long short-term memory) network with an additional filter layer. The new filter added to the LSTM isolates stationary or parked vehicles so that the model can focus on on-road vehicles. Because the LSTM has a non-linear, high-dimensional hidden memory state, it can preserve the continuity of the data history and pay more attention to on-road vehicles than to parked or stationary ones. For each new vehicle, the pose estimator can therefore use the LSTM memory to compare the historical pose with the newly filtered data. Successful implementation of this concept would lead to significant improvements in handling real-life traffic situations in computer vision and autonomous driving.
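
The idea of separating stationary vehicles from on-road ones can be illustrated with a minimal sketch. It is not the paper's learned filter layer: it replaces that layer with a plain speed threshold over each vehicle's pose history. The names `Pose6D`, `mean_speed`, and `filter_moving`, the sampling interval `dt`, and the threshold `min_speed` are all hypothetical, and the poses are assumed to be expressed in a fixed world frame (that is, ego-motion has already been compensated).

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Pose6D:
    """A 6D pose: 3D translation (metres) plus 3D rotation (radians)."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float


def mean_speed(track, dt):
    """Average translational speed (m/s) of a track sampled every dt seconds."""
    if len(track) < 2:
        return 0.0
    travelled = sum(
        dist((a.x, a.y, a.z), (b.x, b.y, b.z))
        for a, b in zip(track, track[1:])
    )
    return travelled / (dt * (len(track) - 1))


def filter_moving(tracks, dt=0.1, min_speed=0.5):
    """Keep only the tracks whose mean speed reaches min_speed (assumed threshold)."""
    return {
        vehicle_id: track
        for vehicle_id, track in tracks.items()
        if mean_speed(track, dt) >= min_speed
    }


if __name__ == "__main__":
    parked = [Pose6D(5.0, 0.0, 20.0, 0.0, 0.0, 0.0) for _ in range(5)]
    moving = [Pose6D(0.0, 0.0, 10.0 + i, 0.0, 0.0, 0.0) for i in range(5)]
    kept = filter_moving({"parked": parked, "moving": moving})
    print(sorted(kept))
```

In this sketch only the translational part of the pose decides whether a vehicle counts as moving. A learned filter, as the abstract describes, could also take rotation and longer temporal context into account.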

History

Publication title

Expert Systems With Applications: An International Journal

Volume

223

Article number

119838

Pagination

1-15

ISSN

0957-4174

Department/School

School of Information and Communication Technology

Publisher

Pergamon-Elsevier Science Ltd

Place of publication

The Boulevard, Langford Lane, Kidlington, Oxford, England, OX5 1GB

Repository Status

  • Restricted

Socio-economic Objectives

Artificial intelligence
