TLDR: This research proposes an approach to predicting where bird species are present and how their distributions shift under climate change. It pairs Convolutional Neural Networks (CNNs), which analyze satellite imagery for landscape features such as forestation and water bodies, with tabular models built on environmental and geographic factors like temperature and elevation. Both approaches reach high accuracy (around 85%), offering a scalable way to model bird distributions in support of conservation efforts and environmental policy.
Climate change is causing significant shifts in the natural world, forcing many animal species to relocate from their traditional homes. Birds, in particular, are highly sensitive indicators of environmental change, often altering their migratory routes to find suitable nesting areas and food sources. Understanding these shifts is crucial for conservation efforts and for developing effective environmental policies.
Traditional methods for tracking bird distributions often rely on manual observations, which can be limited in terms of geographic coverage, consistency over time, and the resources required. This highlights a growing need for more scalable and predictive approaches, such as those offered by artificial intelligence.
A Dual Approach to Prediction
Researchers Min-Hong Shih and Emir Durakovic from Northeastern University have proposed an innovative solution to accurately model whether bird species are present in a specific habitat. Their method combines two powerful AI techniques: Convolutional Neural Networks (CNNs) and tabular data analysis. This dual approach leverages both visual information from satellite imagery and structured environmental data to predict bird presence across various climates.
The study’s main contributions include a CNN-based method that uses satellite imagery to predict species presence by analyzing local landscape features, and a feature-driven random forest classifier that utilizes climate and topography data. They also provide a comparative analysis of both models using real-world bird occurrence data from the eBird database.
How the Models Work
The research utilizes several key datasets. The eBird dataset provides bird observation data, including latitude, longitude, observation date, and presence information for various bird species in North America. The WorldClim dataset offers environmental features like elevation, precipitation, and temperature at a high resolution. For the visual aspect, the Sentinel-2 dataset provides satellite imagery, allowing the models to analyze landscape characteristics such as forestation, water bodies, and urbanization.
The **tabular model** combines data from eBird and WorldClim. Its features include latitude, longitude, elevation, precipitation, and temperature, along with the bird species type; the target is whether the species is present. Since eBird primarily records bird presence, the researchers generated “pseudo-absence” observations at locations where the bird was not observed and that were not too close to any known observation. These give the model negative examples, so it can learn to distinguish suitable from unsuitable habitats. The combined data is then fed into machine learning algorithms such as Random Forest and Gradient Boosting Decision Trees to predict bird presence or absence.
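The pseudo-absence step can be illustrated with a minimal sketch. The article does not give the researchers' exact procedure, so everything here is an assumption for illustration: the function names, the bounding box, the 50 km exclusion distance, and the simple rejection-sampling strategy are all hypothetical. Sampling latitude and longitude uniformly is also a simplification, since it does not give equal-area coverage.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def sample_pseudo_absences(presences, bbox, n, min_km, seed=0, max_tries=100_000):
    """Draw up to n random points in bbox at least min_km from every presence.

    bbox is (lat_min, lat_max, lon_min, lon_max). Candidates that fall too
    close to a known observation are rejected (hypothetical procedure).
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    rng = random.Random(seed)
    absences = []
    for _ in range(max_tries):
        if len(absences) == n:
            break
        lat = rng.uniform(lat_min, lat_max)
        lon = rng.uniform(lon_min, lon_max)
        if all(haversine_km(lat, lon, plat, plon) >= min_km for plat, plon in presences):
            absences.append((lat, lon))
    return absences


# Illustrative inputs only: two made-up presence records and a made-up region.
presences = [(42.36, -71.06), (40.71, -74.01)]
bbox = (35.0, 45.0, -80.0, -70.0)
absences = sample_pseudo_absences(presences, bbox, n=50, min_km=50.0)
```

Each accepted point would then be labeled as an absence and joined with the same environmental features as the presence records before training.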
The **Convolutional Neural Network (CNN) model** focuses on extracting spatial features from satellite imagery. Specifically, the researchers used a modified ResNet-34, a residual network architecture widely used for image classification. This model was trained to identify landscape features that influence bird presence, such as water bodies, vegetation, and urban structures. The researchers also built a custom CNN to compare its performance against the pre-trained ResNet model.
Key Findings and Performance
Both the tabular and CNN models demonstrated strong performance. The Random Forest model, part of the tabular approach, achieved approximately 86% validation accuracy and around 80% testing accuracy. It also separated bird presence from absence well, with an AUC (Area Under the Curve) of 92% on validation data and 84% on test data. Interestingly, longitude appeared to be the most significant feature in determining bird locations, suggesting a link to migration towards warmer or coastal areas.
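Accuracy and AUC measure different things, which is why the two figures differ. As a small sketch, assuming AUC here means the area under the ROC curve, it can be computed as the fraction of (presence, absence) pairs in which the presence record receives the higher score, with ties counted as half. The labels and scores below are made up for illustration and are not the study's data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # a tie counts as half a correct ranking
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Share of records classified correctly at a fixed score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(1 for y, p in zip(labels, preds) if y == p) / len(labels)


# Made-up example: 1 = presence, 0 = pseudo-absence.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
```

Here 8 of the 9 presence/absence pairs are ranked correctly, while only 4 of the 6 records are classified correctly at the 0.5 threshold, so AUC is higher than accuracy on the same predictions.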
In the CNN approach, the ResNet model significantly outperformed the custom CNN across all metrics, including precision, recall, F1 score, and overall accuracy (averaging 91%). This highlights the advantage of using pre-trained models like ResNet, which have already learned general image features, making them more adaptable to new satellite imagery compared to a custom model trained from scratch on a smaller ecological dataset.
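For readers unfamiliar with the metrics named above, the following sketch shows how precision, recall, and F1 are derived from true positives, false positives, and false negatives. The labels are invented for illustration and do not reproduce the study's results.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0  # how many predicted presences were real
    recall = tp / (tp + fn) if tp + fn else 0.0     # how many real presences were found
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean of the two
    return precision, recall, f1


# Made-up example: 1 = species present, 0 = absent.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```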
Integrating Insights for a Fuller Picture
The study revealed interesting complementary insights from both approaches. While tabular models identified broad climate and geographic factors like longitude as crucial for large-scale migration patterns, the CNN models, through their analysis of satellite imagery, emphasized micro-level landscape elements such as water bodies, vegetation patterns, and habitat edges. These visual patterns are difficult to capture with tabular data alone but are vital for understanding how birds select habitats during migration.
The researchers suggest that an integrated approach, combining both methodologies, could offer the most comprehensive solution for predicting bird species distribution under climate change scenarios. Such a hybrid model would capture both macro-level climate factors and micro-level landscape characteristics that impact habitat selection.
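The article does not describe how such a hybrid would be built. One simple possibility, shown here purely as an assumption and not as the researchers' design, is late fusion: each model produces a presence probability for a location, and the two are combined with a weighted average. The function name, the weight, and the probabilities below are all hypothetical.

```python
def fuse_probabilities(p_tabular, p_cnn, w_tabular=0.5):
    """Weighted average of per-location presence probabilities from two models."""
    if not 0.0 <= w_tabular <= 1.0:
        raise ValueError("w_tabular must be between 0 and 1")
    if len(p_tabular) != len(p_cnn):
        raise ValueError("both models must score the same locations")
    return [w_tabular * a + (1.0 - w_tabular) * b for a, b in zip(p_tabular, p_cnn)]


# Made-up probabilities for three locations.
p_tabular = [0.9, 0.2, 0.6]  # from climate and geography features
p_cnn = [0.7, 0.4, 0.3]      # from satellite imagery
fused = fuse_probabilities(p_tabular, p_cnn)
predicted_presence = [p >= 0.5 for p in fused]
```

In the third location the tabular model alone would predict presence, but the imagery-based score pulls the fused probability below 0.5. Other designs, such as feeding CNN embeddings into the tabular model as extra features, are equally plausible.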
This research offers a scalable technique for predicting bird distributions, with the aim of informing conservation practice and policies that address climate change. More detail is available in the full research paper.


