ENHANCING OBJECT DETECTION FOR AUTONOMOUS DRIVING THROUGH CONTEXT-AWARE SCENE CLASSIFICATION, MULTIMODAL SENSOR FUSION, AND OPTIMIZED MASK R-CNN ARCHITECTURES
<p dir="ltr">This project presents an enhanced Mask R-CNN framework designed to improve real-time object detection for autonomous vehicles (AVs) operating in complex and variable environments. To address the computational challenges and performance degradation under adverse conditions common in AV deployments, the model integrates context-aware scene classification, multimodal sensor fusion, and advanced computational optimizations. Scene classification enables the system to adapt detection strategies dynamically based on driving context—urban, suburban, or highway—improving both relevance and efficiency. Sensor fusion combines data from cameras, LiDAR, and radar to enhance detection robustness in fog, rain, and low-light conditions. To support real-time performance, lightweight backbone networks such as MobileNetV2 and EfficientNet-B0 were used, along with pruning techniques and GPU acceleration to reduce latency and increase processing speed.</p><p dir="ltr">The enhanced framework was evaluated on benchmark datasets including KITTI, Cityscapes, and BDD100K, showing significant performance gains: a 5.6% average mAP improvement over Faster R-CNN, a 4.2% improvement over YOLOX, a 4.8% increase in IoU scores, and an FPS boost from 12.4 to 18.7 using the optimized model. These improvements translate into greater perception reliability, faster reaction times, and increased safety—critical for achieving Level 4 and 5 AV autonomy. Beyond autonomous driving, the framework has broader relevance to real-time perception tasks in fields such as robotics, smart cities, industrial automation, and medical imaging. By combining theoretical advancements in deep learning with practical deployment strategies, this research contributes to the development of safer, more intelligent, and environmentally sustainable transportation systems.</p>