Federated Learning Enhances Worker Action Recognition in Smart Manufacturing

TLDR: This research paper introduces a federated learning (FL) framework for pose-based human activity recognition (HAR) in smart manufacturing. Using a custom dataset of upper-body gestures from five participants, the study demonstrates that FL, particularly with a Transformer model, significantly outperforms centralized training in terms of generalization accuracy on both global and unseen external test sets. The findings show that FL not only preserves data privacy by avoiding raw data transfer but also substantially improves cross-user generalization, making it a practical and scalable solution for industrial worker assistance systems.

In the evolving landscape of smart manufacturing, accurately recognizing worker actions in real-time is crucial for boosting productivity, ensuring safety, and fostering seamless human-machine collaboration. Traditional methods for human activity recognition (HAR) often rely on large, centralized datasets. However, in industrial environments, this approach presents significant challenges, particularly concerning data privacy and the logistical complexities of centralizing sensitive information from various sites or workers.

A recent research paper, titled “Federated Action Recognition for Smart Worker Assistance Using FastPose,” addresses these challenges by proposing a federated learning (FL) framework for pose-based human activity recognition. The paper, authored by Vinit Hegiste, Vidit Goyal, Tatjana Legler, and Martin Ruskowski, explores how FL can enable decentralized model training without the need to transfer raw, private data, making it well suited to privacy-sensitive industrial scenarios.

Overcoming Data Privacy and Generalization Hurdles

The core of this research lies in its innovative approach to training HAR models. Instead of pooling all data into one central location, federated learning allows individual clients (in this case, different participants or industrial sites) to train models on their local, private datasets. Only the model updates, not the raw data, are shared with a central server, which then aggregates these updates to create a global model. This method inherently preserves data privacy.

The researchers developed a custom skeletal dataset specifically for smart worker assistance, comprising eight industrially relevant upper-body gestures. This data was collected from five volunteer participants, with each participant’s data treated as a distinct client dataset. To process the video data, a modified FastPose model was used to extract 2D skeletal keypoints, simplifying the original 17 keypoints to a more compact 13-joint representation, which helps reduce noise and improve processing efficiency.
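The keypoint reduction described above can be sketched as a simple index filter. This is a minimal illustration, not the authors' code: it assumes the pose estimator emits keypoints in the common COCO 17-joint order, and it assumes the four dropped points are the eyes and ears, which the article does not specify. The names `COCO_17`, `DROPPED`, and `reduce_keypoints` are hypothetical.

```python
# COCO-style 17-keypoint order (ASSUMPTION: the pose model uses this layout).
COCO_17 = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

# ASSUMPTION: eyes and ears are the four joints removed; the article only
# says that 17 keypoints are reduced to 13.
DROPPED = {"left_eye", "right_eye", "left_ear", "right_ear"}
KEEP_IDX = [i for i, name in enumerate(COCO_17) if name not in DROPPED]


def reduce_keypoints(frame):
    """Map one frame of 17 (x, y) keypoints to the 13 retained joints."""
    if len(frame) != len(COCO_17):
        raise ValueError(f"expected {len(COCO_17)} keypoints, got {len(frame)}")
    return [frame[i] for i in KEEP_IDX]


def reduce_sequence(frames):
    """Apply the reduction to every frame of a gesture clip."""
    return [reduce_keypoints(frame) for frame in frames]
```

A clip reduced this way yields 13 (x, y) pairs per frame, which a temporal model could consume as a sequence of 26-dimensional vectors once flattened.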

Model Architectures and Training Paradigms

Two types of temporal models were evaluated: a Long Short-Term Memory (LSTM) network and a Transformer encoder. These models were trained and assessed under four distinct paradigms:

Centralized Training: All data from all participants was pooled together and used to train a single model. This represents the traditional approach without privacy considerations.
Local (Per-Client) Training: Each client trained its own model independently, without any collaboration or data sharing.
Federated Learning (FedAvg): Clients trained models locally and shared updates with a central server, which aggregated them using weighted federated averaging.
Federated Ensemble Learning (FedEnsemble): Similar to FL, but the centralized dataset was uniformly partitioned among clients, allowing the researchers to assess the benefits of ensemble learning in a federated setup, even when privacy isn’t the primary concern.
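The weighted federated averaging step used in the FedAvg paradigm can be illustrated with a short sketch. It assumes each client's model is flattened into a plain list of floats and that the weight given to each client is its share of the total training samples, which is the usual FedAvg formulation. The function name `fedavg` and its arguments are hypothetical, and the paper's actual implementation may differ.

```python
def fedavg(client_weights, client_sizes):
    """Aggregate client parameter vectors by sample-count-weighted averaging.

    client_weights: list of flat parameter lists, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    if any(len(w) != dim for w in client_weights):
        raise ValueError("all clients must share the same parameter shape")

    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        share = n_samples / total  # this client's contribution to the average
        for j, value in enumerate(weights):
            global_weights[j] += share * value
    return global_weights


# Two toy clients: the second holds three times as much data,
# so the global model sits closer to its parameters.
global_model = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training loop the server would send `global_model` back to the clients, each client would train locally for a few epochs, and the aggregation would repeat for several rounds. Only parameter vectors cross the network, never the skeletal data itself.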

Remarkable Performance Gains

The results were compelling. On a unified global test set, the federated Transformer model achieved 69.5% accuracy, which was a significant 12.4 percentage point improvement over the centralized training approach. The federated LSTM also showed a notable gain of 9.9 percentage points, reaching 59.9% accuracy. These improvements suggest that aggregating diverse local updates in FL acts as a regularization mechanism, preventing overfitting to specific client biases and leading to better generalization.

Even more striking were the results on an unseen external client, a participant whose data was not included in any training phase. Here, the FL Transformer achieved 64.29% accuracy, a 52.58 percentage point increase over the centralized model. The FedEnsemble Transformer performed even better, reaching 69.98% accuracy, a 58.27 percentage point gain. This demonstrates that FL not only preserves privacy but also substantially enhances the model’s ability to generalize to new, unseen users, which is critical for real-world deployment in diverse industrial settings.
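The article quotes federated accuracies and gains but not the centralized baselines themselves. Those baselines can be recovered by subtraction, as the sketch below shows. The figures are derived from the numbers reported above rather than quoted from the paper, and the names `REPORTED` and `implied_baseline` are hypothetical.

```python
# (accuracy in %, gain over centralized training in percentage points),
# copied from the figures reported in the article.
REPORTED = {
    "global / FL Transformer": (69.5, 12.4),
    "global / FL LSTM": (59.9, 9.9),
    "external / FL Transformer": (64.29, 52.58),
    "external / FedEnsemble Transformer": (69.98, 58.27),
}


def implied_baseline(accuracy_pct, gain_pp):
    """Centralized accuracy implied by a reported accuracy and its gain."""
    return round(accuracy_pct - gain_pp, 2)


BASELINES = {
    name: implied_baseline(acc, gain) for name, (acc, gain) in REPORTED.items()
}

for name, baseline in BASELINES.items():
    print(f"{name}: implied centralized accuracy = {baseline}%")
```

Both external-client rows imply the same centralized Transformer baseline of 11.71%, so the two reported gains are mutually consistent. With eight gesture classes, uniform guessing would give 12.5%, which suggests the centralized model transferred very poorly to the unseen participant.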

Implications for Smart Manufacturing

The study highlights that federated learning is a highly effective solution for pose-based human activity recognition in industrial environments characterized by distributed and heterogeneous data. It consistently outperformed both centralized training and isolated local models. The observed “ensemble effect” in FedEnsemble learning further suggests that even without strict privacy constraints, FL can be a robust training strategy, especially when dealing with small or distributed datasets common in manufacturing.

This research paves the way for scalable, privacy-aware HAR solutions in smart factories, enabling intelligent assistance systems, enhancing worker safety, and improving productivity without compromising sensitive data. Future work aims to scale this framework to larger client populations, incorporate advanced aggregation methods, and integrate multi-sensor fusion for even greater robustness in challenging industrial environments.
