MJKAN: A Hybrid Neural Network Bridging KAN and MLP for Enhanced Efficiency and Expressiveness

TLDR: MJKAN (Modulation Joint KAN) is a neural network layer that combines the non-linear expressive power of Kolmogorov-Arnold Networks (KANs) with the computational efficiency of Multilayer Perceptrons (MLPs). It does this by integrating a FiLM-like modulation mechanism with Radial Basis Function (RBF) activations. In the reported experiments, MJKAN outperforms MLPs on function regression and is competitive and stable on image and text classification. Its generalization on classification tasks is sensitive to the number of basis functions, which must be tuned carefully to prevent overfitting.

Neural networks are at the heart of modern artificial intelligence, powering everything from image recognition to natural language processing. Two prominent architectures are Multilayer Perceptrons (MLPs) and the more recently introduced Kolmogorov-Arnold Networks (KANs). While MLPs are known for their efficiency and widespread use, KANs offer a unique approach by replacing fixed activation functions with learnable, univariate functions on each connection, inspired by the Kolmogorov-Arnold superposition theorem.
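For reference, the Kolmogorov-Arnold superposition theorem is usually stated in the form below. The symbols n, Φ_q, and φ_{q,p} are standard textbook notation and are not taken from the article.

```latex
f(x_1, \dots, x_n) = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)
```

Here f is any continuous function on [0,1]^n, and each Φ_q and φ_{q,p} is a continuous function of a single variable. KANs take this as inspiration by making the univariate functions learnable and stacking them in layers.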

However, despite their theoretical elegance and promise in specific tasks like symbolic regression, KANs have faced practical hurdles. They often come with high computational costs and haven’t consistently outperformed traditional MLPs in general classification tasks. This has led researchers to explore hybrid models that can combine the best of both worlds.

Introducing MJKAN: A Hybrid Approach

A new research paper, “Bridging KAN and MLP: MJKAN, a Hybrid Architecture with Both Efficiency and Expressiveness”, introduces the Modulation Joint KAN (MJKAN). This novel neural network layer is designed to overcome the limitations of conventional KANs by integrating a FiLM (Feature-wise Linear Modulation)-like mechanism with Radial Basis Function (RBF) activations. The core idea is to create a hybrid architecture that leverages the non-linear expressive power of KANs while maintaining the computational efficiency typically associated with MLPs.

In an MJKAN layer, each input dimension is first processed by radial basis functions, and then a learned affine transformation (scaling and offset) is applied. This FiLM-like operation reintroduces learnable linear weights into the KAN framework without sacrificing the non-linear capabilities of kernel activations. As a result, MJKAN can adjust the influence of different input regions, much as an MLP's weights adjust the influence of each input, which makes the layer highly flexible.
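The description above can be sketched in a few lines of NumPy. This is only one plausible reading of the article's wording, not the paper's reference implementation: the class name `MJKANLayerSketch`, the Gaussian form of the RBF, the evenly spaced centers, the width rule, and the per-(output, input, basis) shapes of `gamma` and `beta` are all assumptions made for illustration.

```python
import numpy as np

class MJKANLayerSketch:
    """Illustrative RBF + FiLM-like layer (hypothetical, forward pass only)."""

    def __init__(self, in_dim, out_dim, num_basis=5, grid_range=(-1.0, 1.0), seed=0):
        rng = np.random.default_rng(seed)
        lo, hi = grid_range
        # Assumption: fixed, evenly spaced RBF centers shared by all inputs.
        self.centers = np.linspace(lo, hi, num_basis)
        # Assumption: RBF width tied to the spacing between centers.
        self.width = (hi - lo) / max(num_basis - 1, 1)
        # Learnable scale (gamma) and offset (beta), one per (output, input, basis).
        self.gamma = rng.normal(0.0, 0.1, size=(out_dim, in_dim, num_basis))
        self.beta = np.zeros((out_dim, in_dim, num_basis))

    def rbf(self, x):
        # x: (batch, in_dim) -> Gaussian activations of shape (batch, in_dim, num_basis)
        diff = x[:, :, None] - self.centers[None, None, :]
        return np.exp(-(diff / self.width) ** 2)

    def forward(self, x):
        phi = self.rbf(x)
        # FiLM-like affine modulation of each basis response: gamma * phi + beta.
        modulated = self.gamma[None] * phi[:, None] + self.beta[None]
        # Sum over inputs and basis functions -> (batch, out_dim).
        return modulated.sum(axis=(2, 3))


layer = MJKANLayerSketch(in_dim=3, out_dim=4, num_basis=5)
x = np.zeros((2, 3))
y = layer.forward(x)
print(y.shape)
```

In this sketch, `num_basis` plays the role of the basis size discussed in the article: raising it adds more localized bumps per input, and with them more scale and offset parameters per connection.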

Performance Across Diverse Tasks

The researchers put MJKAN through a rigorous empirical validation across various benchmarks, including function regression, image classification (MNIST, CIFAR-10/100), and natural language processing (AG News, SMS Spam).

For function regression tasks, MJKAN demonstrated superior approximation capabilities, consistently outperforming MLPs. Its performance improved as the number of basis functions increased, highlighting its strength in modeling complex, non-linear functions with localized or compositional structures.
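A toy illustration of why more basis functions can help in regression is sketched below. It is not a reproduction of the paper's experiments and does not train an MJKAN: it swaps in a plain least-squares fit on Gaussian RBF features, and the target function, the center placement, and the width rule are assumptions chosen only for the example.

```python
import numpy as np

def rbf_features(x, num_basis, lo=-1.0, hi=1.0):
    """Gaussian RBF features of 1-D inputs; centers and width are assumed choices."""
    centers = np.linspace(lo, hi, num_basis)
    width = (hi - lo) / max(num_basis - 1, 1)
    return np.exp(-((x[:, None] - centers[None, :]) / width) ** 2)

def fit_error(num_basis, n_points=200):
    """Mean squared training error of a least-squares fit on RBF features."""
    x = np.linspace(-1.0, 1.0, n_points)
    y = np.sin(np.pi * x)  # hypothetical target function
    features = rbf_features(x, num_basis)
    weights, *_ = np.linalg.lstsq(features, y, rcond=None)
    return float(np.mean((features @ weights - y) ** 2))

# Compare the fit quality for a few basis sizes.
errors = {k: fit_error(k) for k in (2, 4, 8, 16)}
for k, err in errors.items():
    print(f"num_basis={k:2d}  train MSE={err:.3e}")
```

The printed errors are training errors only. They say nothing about generalization, which is where the article reports that larger basis sizes can hurt on classification tasks.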

However, in general classification tasks, the results were more nuanced. In image classification, MJKAN’s performance was competitive with MLPs, but it revealed a critical dependency on the number of basis functions. A smaller basis size was found to be crucial for better generalization, especially on more complex datasets like CIFAR-100. This suggests that while more basis functions increase theoretical expressiveness, they also raise the risk of overfitting if not carefully tuned to the data’s complexity.

In natural language processing tasks, MJKAN proved to be a robust and stable alternative to MLPs, delivering consistent performance across different basis sizes, although it didn’t consistently surpass the MLP baseline. This indicates its viability in text classification settings, particularly when combined with transformer-derived embeddings.

Key Takeaways

MJKAN represents a significant step towards creating more practical and versatile KAN-inspired models. It successfully combines the function approximation strengths of KANs with the efficiency of MLPs. The research underscores the importance of the basis size as a key hyperparameter, directly controlling the model’s geometric complexity and its susceptibility to overfitting. By offering a flexible, general-purpose building block, MJKAN paves the way for future hybrid architectures that are both powerfully expressive and computationally tractable.
