TLDR: A new study finds that TimesFM, a time series foundation model, improves the accuracy of U.S. demographic forecasting compared to traditional methods like LSTM, ARIMA, and Linear Regression. Evaluated across six diverse states and multiple racial groups using U.S. Census and FRED data, TimesFM achieved the lowest prediction error in 86.67% of cases and performed particularly well on sparse data from minority populations. This highlights the potential of pre-trained foundation models to enhance demographic analysis and policy-making without extensive fine-tuning.
Understanding and predicting demographic shifts is crucial for effective policy-making in areas like urban planning, healthcare, and economic strategy. Factors such as globalization, economic conditions, geopolitical events, and environmental changes constantly influence population dynamics, making accurate forecasting a significant challenge.
Traditional statistical methods for demographic forecasting, such as Autoregressive Integrated Moving Average (ARIMA) and Linear Regression, often struggle to capture the complex, non-linear patterns inherent in population changes. While deep learning approaches like Long Short-Term Memory (LSTM) networks have offered improvements by learning temporal patterns, the application of cutting-edge time series foundation models in this domain has remained largely unexplored.
Introducing TimesFM for Demographic Prediction
A recent study, titled “Comparative Analysis of Time Series Foundation Models for Demographic Forecasting: Enhancing Predictive Accuracy in US Population Dynamics” by Aditya Akella and Jonathan Farah, addresses this gap. The research systematically evaluates TimesFM (Time Series Foundation Model) for multi-racial demographic forecasting across diverse U.S. states, comparing it against established baselines: LSTM, ARIMA, and Linear Regression.
TimesFM is a decoder-only transformer architecture with 200 million parameters, pre-trained on approximately 100 billion time points from various real and synthetic sources. This extensive pre-training allows it to learn universal temporal representations, enabling strong performance even on unseen data without requiring extensive task-specific fine-tuning. Its patch-based approach efficiently predicts multiple future values simultaneously, making it suitable for long-horizon forecasting.
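To make the patch idea concrete, here is a minimal sketch of how a series can be cut into fixed-length patches and why a longer output patch means fewer decoding steps for a given horizon. It does not use the TimesFM library; the function names, the zero left-padding, and the patch lengths in the usage lines are illustrative assumptions, not details taken from the study.

```python
import math


def split_into_patches(series, patch_len):
    """Split a series into consecutive non-overlapping patches.

    Assumption for this sketch: the series is left-padded with zeros so
    that its length becomes a multiple of patch_len.
    """
    pad = (-len(series)) % patch_len
    padded = [0.0] * pad + list(series)
    return [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]


def decoding_steps(horizon, output_patch_len):
    """Number of autoregressive steps needed to cover `horizon` future points
    when each step emits `output_patch_len` values at once."""
    return math.ceil(horizon / output_patch_len)


# 27 annual observations (1990-2016) cut into hypothetical patches of length 8
context = [float(v) for v in range(27)]
patches = split_into_patches(context, patch_len=8)
print(len(patches), "patches of length", len(patches[0]))

# A 6-year horizon: one value per step vs. a whole patch per step
print(decoding_steps(6, output_patch_len=1), "steps when emitting one value at a time")
print(decoding_steps(6, output_patch_len=8), "step when emitting a patch of 8")
```

Emitting a whole patch per step is what lets a patch-based decoder cover a multi-year horizon in one or a few forward passes instead of one pass per predicted year.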
Methodology and Key Findings
For their experiments, the researchers compiled demographic data for five racial groups (White, Black or African American, American Indian and Alaska Native, Asian, and Native Hawaiian and Other Pacific Islander) across six U.S. states: Alabama, California, Hawaii, New York, Texas, and Wyoming. Population data from 1990–2019 was sourced from the Federal Reserve Economic Data (FRED) system, and 2020–2022 data came from U.S. Census Bureau estimates. The models were evaluated using Mean Squared Error (MSE) as the primary metric, with training data from 1990–2016 and test data from 2017–2022.
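The evaluation protocol described above (train on 1990–2016, test on 2017–2022, score with MSE) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the population series is synthetic placeholder data, the helper names are hypothetical, and a simple least-squares trend line stands in for the Linear Regression baseline.

```python
def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def fit_linear_trend(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def split_by_year(years, values, last_train_year):
    """Chronological split: years up to last_train_year train, later years test."""
    train = [(y, v) for y, v in zip(years, values) if y <= last_train_year]
    test = [(y, v) for y, v in zip(years, values) if y > last_train_year]
    return train, test


# Synthetic placeholder series, NOT real FRED or Census data
years = list(range(1990, 2023))
population = [1000.0 + 12.5 * (y - 1990) for y in years]

train, test = split_by_year(years, population, last_train_year=2016)
slope, intercept = fit_linear_trend([y for y, _ in train], [v for _, v in train])
forecast = [slope * y + intercept for y, _ in test]
test_mse = mse([v for _, v in test], forecast)

print(f"train years: {len(train)}, test years: {len(test)}, test MSE: {test_mse:.6f}")
```

The split yields 27 training years and 6 test years per series. Because MSE squares each error, it depends on the scale of the series, so comparisons are most meaningful between models on the same state and group.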
TimesFM achieved the lowest MSE in 86.67% of the test cases. Its performance was particularly strong for minority populations, which often have sparser historical data. For instance, in forecasting the Native Hawaiian population in New York, TimesFM reduced MSE by three orders of magnitude compared to LSTM. The model also adapted better to abrupt trend changes, as seen in the California American Indian population predictions, where traditional models failed to capture a sharp decline.
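The 86.67% figure is a win rate: the share of state and group combinations in which a model has the lowest MSE. Five groups across six states give 30 combinations, and 86.67% of 30 corresponds to 26 wins; that count is an inference from the stated design, not a number quoted in this article. The sketch below shows the calculation; the per-case MSE values and state and group labels are made up for illustration.

```python
def best_model(case_errors):
    """Return the model name with the lowest MSE for one test case."""
    return min(case_errors, key=case_errors.get)


def win_rate(results, model):
    """Percentage of test cases in which `model` has the lowest MSE."""
    wins = sum(1 for errors in results.values() if best_model(errors) == model)
    return 100.0 * wins / len(results)


# Case count implied by the study design: 5 racial groups x 6 states
n_cases = 5 * 6
implied_wins = round(0.8667 * n_cases)
print(n_cases, implied_wins, round(100 * implied_wins / n_cases, 2))

# Hypothetical per-case MSE values (placeholders, not results from the paper)
toy_results = {
    ("StateA", "Group1"): {"TimesFM": 0.2, "LSTM": 0.9, "ARIMA": 0.5, "LinearRegression": 0.7},
    ("StateA", "Group2"): {"TimesFM": 0.4, "LSTM": 0.3, "ARIMA": 0.6, "LinearRegression": 0.8},
    ("StateB", "Group1"): {"TimesFM": 0.1, "LSTM": 0.5, "ARIMA": 0.2, "LinearRegression": 0.3},
}
print(f"TimesFM toy win rate: {win_rate(toy_results, 'TimesFM'):.2f}%")
```

A win rate says how often a model is best but not by how much, which is why the article also reports the size of individual gaps such as the New York example.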
Implications and Future Directions
These findings highlight the significant potential of pre-trained time series foundation models to enhance demographic analysis and inform proactive policy interventions. TimesFM’s ability to achieve high accuracy without extensive task-specific fine-tuning suggests that these models can democratize access to accurate population predictions, especially for populations with limited historical records.
While the study provides strong evidence for TimesFM’s capabilities, it acknowledges limitations, such as a relatively short test period and the focus on univariate predictions without incorporating socioeconomic covariates. Future research could explore multivariate extensions, longer forecast horizons, and additional demographic categories. The code and data for reproducing these experiments are available online, fostering further research and application of these advanced models. You can find more details in the full research paper: Comparative Analysis of Time Series Foundation Models for Demographic Forecasting.


