Proactive Beam Selection in Connected Vehicles Using AI

TLDR: A new framework uses multi-modal sensing (camera images and GPS data) and Transformer models to proactively predict optimal millimeter-wave (mmWave) beams for vehicle-to-vehicle (V2V) communication. The approach reduces beam training overhead, reaches up to 77.58% accuracy in predicting the top-15 beams, and lowers average power loss compared with single-modality methods, making V2V communication more efficient and reliable.

Future transportation systems are rapidly evolving with connected and autonomous vehicles, promising significant improvements in traffic efficiency, mobility, and safety. A crucial element for this revolution is seamless, high-throughput, and low-latency communication, especially for sharing vast amounts of sensor data between vehicles, infrastructure, and their surroundings. Millimeter-wave (mmWave) communication, operating at frequencies above 24 GHz, is an ideal candidate for these demands due to its abundant spectrum resources.

However, mmWave signals face a significant challenge: severe path loss due to their short wavelengths. To overcome this, beamforming techniques are employed, using phased-antenna arrays to direct narrow beams of RF energy so that the link keeps sufficient signal strength. The current 5G New Radio (5G-NR) standard defines a codebook-based beam training approach in which communicating nodes exchange pilot signals and measurements to select the best beam direction. While effective, this method introduces high beam training overhead and reduces available airtime, particularly in highly dynamic vehicular environments where beam directions change constantly.
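
To make the overhead concrete, here is a minimal sketch of an exhaustive beam sweep. It is a toy model, not the 5G-NR procedure itself: the Gaussian-shaped beam pattern, the 64-beam codebook, the angular range, and the function names are all assumptions chosen for illustration. The point it shows is that the number of pilot measurements grows with the size of the codebook.

```python
import math

def steering_gain(beam_angle_deg, target_angle_deg, beamwidth_deg=10.0):
    """Toy beam pattern (assumption): gain decays with angular error."""
    error = beam_angle_deg - target_angle_deg
    return math.exp(-(error / beamwidth_deg) ** 2)

def exhaustive_beam_search(codebook_angles, target_angle_deg):
    """Measure every beam in the codebook; return (best_index, measurements)."""
    best_index, best_power, measurements = 0, -1.0, 0
    for index, angle in enumerate(codebook_angles):
        power = steering_gain(angle, target_angle_deg)  # one pilot measurement
        measurements += 1
        if power > best_power:
            best_index, best_power = index, power
    return best_index, measurements

# Hypothetical 64-beam codebook spanning -60 to +60 degrees.
codebook = [-60 + 120 * i / 63 for i in range(64)]
best, cost = exhaustive_beam_search(codebook, target_angle_deg=12.0)
print(f"best beam index: {best}, measurements needed: {cost}")
```

Every sweep costs one measurement per beam, and in a vehicular link the sweep has to be repeated whenever the geometry changes. That is the cost a proactive predictor tries to avoid.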

To address these limitations, a new research paper, “Multi-Modal Sensing Aided mmWave Beamforming for V2V Communications with Transformers”, proposes an innovative multi-modal sensing and fusion learning framework. This framework aims to significantly reduce beam training overheads by proactively predicting the best line-of-sight communication links between vehicles.

How the Framework Works

The core idea is to leverage out-of-band contextual information from multiple sensors. The proposed framework extracts features separately from two sensing modalities, camera images and GPS coordinates, using a specialized encoder for each. These modality-specific features are then fused to predict a limited set of top-k beams, narrowing the search space for the optimal beam.

The system model considers two moving vehicles in a downlink vehicular communication system, where one acts as a receiver and the other as a transmitter. The receiver vehicle is equipped with a camera sensor, while the transmitter carries a GPS receiver. The collected multi-modal data (GPS latitude/longitude and visual images) is preprocessed and fed into two distinct encoders:

  • Position Encoder: This component uses an embedding layer and a multi-layered Transformer encoder to process GPS coordinates, transforming them into a higher-dimensional feature space and capturing rich contextual relationships.

  • Visual Encoder: A variant of the vision transformer, named multi-axis vision transformer (MaxViT), is employed. This model combines the benefits of both convolution and transformer architectures to extract rich visual features from the camera images, adapting well to high-resolution and dense prediction tasks.

After individual feature extraction, the outputs from both encoders are concatenated (fused) to create a unified representation. This fused output is then passed through a multi-layer perceptron network, which uses a Softmax function in its final layer to make a probabilistic prediction, determining the top-k best beams.
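
The fusion and prediction step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the paper's implementation: the feature sizes (8 and 16), the single hidden layer of 32 units, the 64-beam output, and the random untrained weights are all hypothetical, and the encoder outputs are replaced by random vectors.

```python
import math
import random

def linear(inputs, weights, biases):
    """Dense layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def relu(values):
    return [max(0.0, v) for v in values]

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def random_layer(rng, n_out, n_in):
    """Untrained placeholder weights (assumption, for illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def predict_top_k(position_features, visual_features, layers, k):
    fused = position_features + visual_features  # concatenation = fusion
    (w1, b1), (w2, b2) = layers
    hidden = relu(linear(fused, w1, b1))
    probs = softmax(linear(hidden, w2, b2))
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k], probs

rng = random.Random(0)
NUM_BEAMS = 64  # assumed codebook size
position_features = [rng.random() for _ in range(8)]   # stand-in for the position encoder output
visual_features = [rng.random() for _ in range(16)]    # stand-in for the visual encoder output
layers = [random_layer(rng, 32, 24), random_layer(rng, NUM_BEAMS, 32)]
top_beams, probs = predict_top_k(position_features, visual_features, layers, k=15)
print("candidate beams to search:", top_beams)
```

Only the beams in `top_beams` would then be measured over the air, instead of the full codebook.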

Experimental Validation and Results

To demonstrate the generalizability and effectiveness of the proposed framework, a comprehensive experiment was conducted using the DeepSense 6G dataset, which contains real-world multi-modal sensor observations. The study focused on four different vehicle-to-vehicle (V2V) communication scenarios.

The results were promising. The proposed framework achieved up to 77.58% accuracy in predicting the top-15 beams, outperforming approaches that rely on a single modality (GPS or visual data alone). It also incurred an average power loss as low as roughly 2.32 dB and reduced the beam search space by 76.56% for top-15 beams compared to standard beam training.
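
For readers who want to see how such numbers are typically computed, the sketch below shows one common way to calculate top-k accuracy and search-space reduction. The helper names and the tiny sample data are made up for illustration. The 64-beam codebook size is an assumption, since the article does not state it; it is used here because 1 - 15/64 works out to 76.56%, which is consistent with the reported reduction.

```python
def top_k_accuracy(ranked_predictions, true_beams, k):
    """Fraction of samples whose true best beam is among the first k predictions."""
    hits = sum(1 for ranked, truth in zip(ranked_predictions, true_beams)
               if truth in ranked[:k])
    return hits / len(true_beams)

def search_space_reduction(k, codebook_size):
    """Share of the codebook that no longer needs to be swept."""
    return 1.0 - k / codebook_size

# Made-up example: four samples, each with three ranked beam predictions.
ranked = [[3, 7, 1], [5, 2, 9], [0, 4, 8], [6, 1, 2]]
truth = [7, 9, 3, 6]

acc_top1 = top_k_accuracy(ranked, truth, k=1)
acc_top3 = top_k_accuracy(ranked, truth, k=3)
reduction = search_space_reduction(k=15, codebook_size=64)  # 64 is an assumption

print(f"top-1 accuracy: {acc_top1:.2f}")
print(f"top-3 accuracy: {acc_top3:.2f}")
print(f"search-space reduction: {reduction:.2%}")
```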

Conclusion

This research presents a compelling solution for 60 GHz mmWave-enabled connected vehicles. By integrating multi-modal sensing information (visual and GPS) with a Transformer-based fusion framework, the system can proactively predict optimal beams, addressing the high beam training overhead and latency of dynamic V2V environments. The findings suggest the framework can help meet the high-throughput and low-latency demands of the next generation of connected vehicles, paving the way for more efficient and reliable communication in future transportation systems.

Ananya Rao
Ananya Rao is a tech journalist with a passion for dissecting the fast-moving world of Generative AI. With a background in computer science and a sharp editorial eye, she connects the dots between policy, innovation, and business. Ananya excels in real-time reporting and specializes in uncovering how startups and enterprises in India are navigating the GenAI boom. She brings urgency and clarity to every breaking news piece she writes. You can reach her at: [email protected]
