TLDR: This research paper evaluates five hybrid approaches that combine physics-based and data-driven models for probabilistic building energy modeling, focusing on predicting indoor temperatures with uncertainty. The study found that the ‘Residual’ approach, especially when using a Feedforward Neural Network, consistently delivered the best performance in terms of accuracy and reliability. This method proved particularly effective at capturing unexpected events like prolonged window openings. Additionally, the research demonstrated that ‘Quantile Conformal Prediction’ is a valuable technique for calibrating temperature predictions, improving the reliability of uncertainty estimates. The findings highlight the benefits of integrating physical knowledge with data-driven insights for more robust and reliable building energy management.
Optimizing the performance of building energy systems is crucial, especially given that building operations account for a significant portion of global energy use and CO2 emissions. Building Energy Models (BEMs) are vital tools for simulating how buildings behave thermodynamically and predicting their energy performance, which helps in making optimal control decisions.
Understanding Building Energy Modeling
BEMs span a spectrum of methods. On one end are conventional physics-based models, which are built from fundamental physical laws and detailed descriptions of building characteristics. They are consistent with physical principles and highly interpretable, and tools like EnergyPlus are commonly used for high-fidelity simulations in this domain. However, they can be time- and resource-intensive to develop and calibrate.
On the other end, purely data-driven techniques rely on sensor data to learn statistical or machine learning models. These models map inputs like operational schedules and environmental conditions to outputs such as energy consumption or indoor temperature. While often achieving high predictive accuracy, they don’t inherently adhere to physical laws.
Bridging the Gap: Hybrid Approaches
Recently, hybrid approaches have emerged, combining the strengths of both physics-based and data-driven paradigms. These methods aim to leverage the interpretability and physical consistency of physics-based models with the predictive power of data-driven techniques. This study delves into five key hybrid strategies:
- Assistant: Uses the output of a physics-based model as an additional input for a data-driven model.
- Residual: Trains a data-driven model to predict the differences (residuals) between observed data and the physics-based model’s output.
- Surrogate: A data-driven model is trained to act as a low-computation replacement for a physics-based model.
- Augmentation: Real data is enhanced with simulated output from a physics-based model, and a data-driven model is trained on this augmented dataset.
- Constrained: Integrates the discrepancy between physics-based simulation and the final prediction as an additional loss term, regularizing the data-driven model with a physics prior.
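To make the first two strategies concrete, here is a minimal sketch that contrasts Residual and Assistant on toy numbers. Everything in it is an assumption for illustration: the data are invented, `fit_line` is a hypothetical helper, and a one-variable least-squares line stands in for the data-driven models used in the study.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y ~ a + b*x (stand-in for a data-driven model)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    a = my - b * mx
    return a, b

# Hypothetical data: outdoor temperature, physics-based simulation, measurement.
outdoor = [0.0, 5.0, 10.0, 15.0, 20.0]
physics = [20.0, 20.5, 21.0, 21.5, 22.0]      # simulated indoor temperature
observed = [19.0, 19.75, 20.5, 21.25, 22.0]   # measured indoor temperature

# Residual: learn (observed - physics), then add the learned correction back.
residuals = [o - p for o, p in zip(observed, physics)]
a_res, b_res = fit_line(outdoor, residuals)

def predict_residual(outdoor_temp, physics_temp):
    return physics_temp + a_res + b_res * outdoor_temp

# Assistant: the physics output is an input feature of the learned model.
# For brevity it is the only feature here; in general it is appended to the others.
a_ast, b_ast = fit_line(physics, observed)

def predict_assistant(physics_temp):
    return a_ast + b_ast * physics_temp
```

In the Residual variant the physics-based output stays in the final prediction and the learned model only corrects it, whereas in the Assistant variant the learned model is free to reweight or ignore the physics signal.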
Despite this progress, two research gaps remained. First, most hybrid methods focused on deterministic modeling and ignored the inherent uncertainty introduced by factors such as weather and occupant behavior. Second, the approaches had not been compared systematically within a probabilistic modeling framework. This research addresses both gaps by evaluating the five hybrid approaches for probabilistic building energy modeling, focusing on quantile predictions of building thermodynamics in a real-world case study.
The Study’s Approach
The researchers adopted a quantile regression methodology, assuming that uncertainty is primarily aleatoric (from uncontrollable factors like weather and occupant behavior). They used EnergyPlus for the physics-based model and evaluated Quantile Regression (QR), Quantile Feedforward Neural Networks (QNN), and Quantile Random Forest (QRF) for the data-driven components. To improve the calibration of quantile predictions, they employed Quantile Conformal Prediction, also called Conformalized Quantile Regression, a statistical framework for uncertainty quantification.
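Quantile models are commonly trained by minimizing the pinball (quantile) loss, which penalizes under- and over-prediction asymmetrically. The sketch below is a plain-Python illustration of that loss. The function name `pinball_loss` and the toy temperatures are assumptions, not taken from the paper.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball loss for a quantile level tau in (0, 1)."""
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (y above q) costs tau per degree,
        # over-prediction (y below q) costs 1 - tau per degree.
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)

# Hypothetical indoor temperatures and two constant candidate predictions.
temps = [20.0, 21.0, 22.0, 23.0]
low_guess = [19.0] * 4
high_guess = [24.0] * 4

loss_low = pinball_loss(temps, low_guess, tau=0.9)
loss_high = pinball_loss(temps, high_guess, tau=0.9)
```

For a high quantile level such as 0.9, the prediction that sits above the data receives the smaller loss, which is what pushes a trained model toward the upper end of the distribution.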
The study was conducted using data from the Urban Mining and Recycling (UMAR) experimental unit at Empa in Switzerland, analyzing five specific rooms: a living room, two bedrooms, and two bathrooms. Sensor data, including weather, building-level measurements, and room-specific conditions, was collected at a 1-minute resolution and aggregated to 15-minute resolution. Data from 2020 was used for training and data from 2021 for testing.
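The preprocessing described above can be sketched roughly as follows. This is an illustration only: the summary does not say how the 1-minute samples were aggregated, so averaging within each 15-minute bin is an assumption, and the helper names and readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean

def aggregate_15min(samples):
    """Average (timestamp, value) pairs into 15-minute bins keyed by bin start."""
    bins = defaultdict(list)
    for ts, value in samples:
        # Floor the timestamp to the start of its 15-minute bin.
        start = ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)
        bins[start].append(value)
    return {start: mean(values) for start, values in sorted(bins.items())}

def split_by_year(series, train_year=2020, test_year=2021):
    """Split a {timestamp: value} series into training and test years."""
    train = {ts: v for ts, v in series.items() if ts.year == train_year}
    test = {ts: v for ts, v in series.items() if ts.year == test_year}
    return train, test

# Hypothetical 1-minute readings spanning the 2020/2021 boundary.
first = datetime(2020, 12, 31, 23, 30)
readings = [(first + timedelta(minutes=i), 20.0 + 0.1 * i) for i in range(60)]

quarter_hourly = aggregate_15min(readings)
train, test = split_by_year(quarter_hourly)
```

Splitting by calendar year keeps the test period strictly after the training period, which avoids leaking future observations into training.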
Key Findings and Insights
The study yielded two main findings. First, the performance of the hybrid approaches varied across room types. However, the Residual approach, particularly when combined with a Feedforward Neural Network (Residual-QNN), performed best on average. This approach showed a clear improvement in accuracy compared to purely data-driven models. Notably, the Residual approach was the only one that produced physically intuitive predictions when applied to out-of-distribution test data, such as prolonged window openings, effectively capturing sudden temperature drops.
Second, Quantile Conformal Prediction proved to be an effective procedure for calibrating quantile predictions in indoor temperature modeling. It generally improved the reliability of the prediction intervals by widening them, so that they covered the observed temperatures more often.
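The widening effect can be illustrated with the standard split-conformal recipe for quantile intervals. The sketch below follows that textbook recipe and is not necessarily the exact procedure used in the paper. The function names, the calibration data, and the miscoverage level `alpha` are all assumptions.

```python
import math

def conformal_margin(y_cal, lo_cal, hi_cal, alpha):
    """Margin Q computed on a held-out calibration set.

    Each score measures how far an observation falls outside its predicted
    interval (negative if it lies inside).
    """
    scores = sorted(max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lo_cal, hi_cal))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]

def calibrate(lo, hi, margin):
    """Shift both interval bounds outward by the margin (inward if negative)."""
    return lo - margin, hi + margin

# Hypothetical calibration set: the raw model always predicts [20, 22] degrees C.
y_cal = [20.0, 21.0, 22.5, 19.0, 23.0, 21.5, 20.5, 22.0, 24.0]
lo_cal = [20.0] * len(y_cal)
hi_cal = [22.0] * len(y_cal)

margin = conformal_margin(y_cal, lo_cal, hi_cal, alpha=0.25)
new_lo, new_hi = calibrate(20.0, 22.0, margin)
```

In this toy case the raw interval misses several calibration points, so the margin is positive and the interval widens. If the raw intervals were already too wide, the same recipe would return a negative margin and shrink them.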
The Constrained approach, which incorporates a physics-based loss term, showed particular benefits for rooms that are harder to model, such as bathrooms. This is likely due to their complex dynamics and often limited sensor coverage, where integrating a physics-based reference helps compensate for missing information.
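One way to picture the Constrained approach is a loss with two terms: a data term that fits the observations and a penalty that keeps predictions near the physics-based simulation. The sketch below is an assumed form for illustration only. The squared-discrepancy penalty, the weight `lam`, and the function name are not taken from the paper.

```python
def constrained_loss(y_true, y_pred, y_physics, tau, lam):
    """Pinball data term plus a weighted physics-discrepancy penalty."""
    n = len(y_true)
    # Data term: pinball loss of the predicted tau-quantile against observations.
    data_term = sum(
        max(tau * (y - q), (tau - 1.0) * (y - q)) for y, q in zip(y_true, y_pred)
    ) / n
    # Physics term: mean squared gap between prediction and simulation output.
    physics_term = sum((q - p) ** 2 for q, p in zip(y_pred, y_physics)) / n
    return data_term + lam * physics_term

# Hypothetical values for two time steps.
y_true = [21.0, 22.0]
y_pred = [20.0, 23.0]
y_physics = [20.5, 22.5]

loss_with_prior = constrained_loss(y_true, y_pred, y_physics, tau=0.5, lam=2.0)
loss_data_only = constrained_loss(y_true, y_pred, y_physics, tau=0.5, lam=0.0)
```

Setting `lam` to zero recovers a purely data-driven objective, while larger values pull the model toward the simulation, which may help where sensor information is sparse.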
Looking Ahead
This research highlights the significant potential of integrating physics-based and data-driven methods for more robust and reliable building energy modeling, especially when accounting for uncertainties. Future work could explore extending the framework to address epistemic uncertainty (model structure or parameter uncertainty), conducting sensitivity analysis of the physics-based submodel, and performing large-scale experiments across more diverse buildings. Improving the robustness of conformal prediction to handle distribution shifts in time series data is also a promising direction.


