Boosting Efficiency in UI Code Generation with Smart Token Compression

TLDR: EfficientUICoder is a new framework that significantly improves the efficiency of Multimodal Large Language Models (MLLMs) in converting UI designs to code (UI2Code). It achieves this by compressing redundant input image tokens and suppressing repetitive output code tokens. The framework uses Element and Layout-aware Token Compression (ELTC), Region-aware Token Refinement (RTR), and Adaptive Duplicate Token Suppression (ADTS). Experiments show it achieves 55-60% compression, reduces computational cost by 44.9%, generated tokens by 41.4%, and inference time by 48.8% without sacrificing webpage quality.

Developing websites efficiently is a constant goal for engineers, and Multimodal Large Language Models (MLLMs) have shown great promise in converting user interface (UI) designs into functional code. This process, known as UI2Code, significantly speeds up website development. However, these advanced models often face a major hurdle: high computational costs. This is primarily due to the large number of input image tokens (representing the visual design) and the extensive output code tokens required to describe a complete webpage.

A recent research paper, titled “EfficientUICoder: Efficient MLLM-based UI Code Generation via Input and Output Token Compression,” takes on this challenge. The authors, Jingyu Xiao, Zhongyi Zhang, Yuxuan Wan, Yintong Huo, Yang Liu, and Michael R. Lyu, conducted a comprehensive study and identified significant redundancies in both the image tokens and the code tokens. These redundancies not only inflate computational cost but also distract the models from the most crucial UI elements, often leading to unnecessarily long and sometimes invalid HTML files.

To tackle these issues, the researchers propose a novel compression framework called EfficientUICoder. This framework is designed to make UI code generation more efficient through three key components:

Element and Layout-aware Token Compression (ELTC)

The first component, ELTC, focuses on preserving only the essential UI information from the input image. It achieves this by intelligently detecting distinct UI element regions and then constructing a UI element tree. This tree acts as a streamlined representation of the UI’s layout, ensuring that critical visual data is retained while redundant image tokens are discarded.
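The idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes element bounding boxes have already been detected, builds the tree by simple box containment, and keeps only the image patches covered by leaf elements. The names Element, build_tree, and select_tokens, the 16-pixel patch size, and the leaf-only selection rule are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field

Box = tuple[int, int, int, int]  # (x0, y0, x1, y1) in pixels; x1 and y1 are exclusive


@dataclass
class Element:
    name: str
    box: Box
    children: list["Element"] = field(default_factory=list)


def area(box: Box) -> int:
    return (box[2] - box[0]) * (box[3] - box[1])


def contains(outer: Box, inner: Box) -> bool:
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and inner[2] <= outer[2] and inner[3] <= outer[3])


def build_tree(elements: list[Element]) -> list[Element]:
    """Attach each element to the smallest larger element that contains it."""
    by_area = sorted(elements, key=lambda e: area(e.box))
    roots = []
    for i, el in enumerate(by_area):
        parent = next((c for c in by_area[i + 1:] if contains(c.box, el.box)), None)
        (parent.children if parent else roots).append(el)
    return roots


def leaves(nodes: list[Element]) -> list[Element]:
    out = []
    for node in nodes:
        out.extend(leaves(node.children) if node.children else [node])
    return out


def select_tokens(roots: list[Element], patch: int, grid_w: int) -> list[int]:
    """Return indices of image patches (row-major) covered by leaf elements."""
    keep = set()
    for el in leaves(roots):
        x0, y0, x1, y1 = el.box
        for row in range(y0 // patch, (y1 - 1) // patch + 1):
            for col in range(x0 // patch, (x1 - 1) // patch + 1):
                keep.add(row * grid_w + col)
    return sorted(keep)


# A 64x64 screenshot split into a 4x4 grid of 16-pixel patches (assumed sizes).
tree = build_tree([
    Element("page", (0, 0, 64, 64)),
    Element("button", (0, 0, 16, 16)),
    Element("text", (32, 32, 64, 48)),
])
print(select_tokens(tree, patch=16, grid_w=4))  # [0, 10, 11]: 3 of 16 patches kept
```

In this toy example the page container itself contributes no tokens; only the patches under the button and the text block survive, which is the kind of reduction the paragraph above describes.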

Region-aware Token Refinement (RTR)

Following ELTC, the RTR module further refines the selected tokens. It uses attention scores (a measure of how much a model “focuses” on certain parts of the input) to identify and discard low-attention tokens from the already selected regions. Crucially, it also adds back high-attention tokens from unselected background areas, recognizing that even seemingly empty spaces can carry important information such as background colors. This balanced approach preserves the most semantically important visual information across both foreground and background.

Adaptive Duplicate Token Suppression (ADTS)

The third component, ADTS, addresses redundancy in the generated code itself. It dynamically tracks the frequencies of HTML and CSS structures, as well as textual content, during the code generation process. When repetitive patterns are detected, ADTS applies an exponential penalty to reduce the likelihood of generating duplicate tokens. This prevents the model from getting stuck in repetitive loops and helps produce more concise and valid HTML/CSS code.
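One way to picture an exponential penalty is the sketch below, which scales each candidate token's probability by decay raised to the number of times it has been repeated beyond an allowance, then renormalizes. This is an illustrative assumption, not the paper's formula: the function name suppress_duplicates, the decay and free_repeats parameters, and the use of whole tag strings as tokens are all hypothetical.

```python
from collections import Counter


def suppress_duplicates(probs, counts, decay=0.5, free_repeats=2):
    """Exponentially down-weight candidates that were already generated often.

    probs:        candidate token -> probability from the model
    counts:       Counter of how often each token has been generated so far
    decay:        multiplicative penalty per excess repetition (0 < decay <= 1)
    free_repeats: repetitions allowed before any penalty applies
    """
    scaled = {}
    for token, p in probs.items():
        excess = max(0, counts[token] - free_repeats)
        scaled[token] = p * decay ** excess
    total = sum(scaled.values())
    return {token: value / total for token, value in scaled.items()}


history = Counter({"<li>": 5})          # "<li>" has already been emitted five times
candidates = {"<li>": 0.8, "</ul>": 0.2}
adjusted = suppress_duplicates(candidates, history)
print(max(adjusted, key=adjusted.get))  # </ul>
```

With three repetitions over the allowance, the probability of another "<li>" is cut by a factor of eight before renormalization, so the closing tag becomes the more likely continuation and the loop ends.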

The team's experiments demonstrate the effectiveness of EfficientUICoder. The framework achieved a 55%-60% compression ratio without compromising the quality of the generated webpages. On 34B-level MLLMs, it reduced computational cost by 44.9%, generated tokens by 41.4%, prefill time by 46.6%, and inference time by 48.8%. In practice, this means faster code generation and lower resource consumption.

The findings of this research highlight that by intelligently compressing both the input visual information and the output code, it’s possible to significantly improve the efficiency of MLLM-based UI2Code tasks. The code for EfficientUICoder is publicly available, which should support further research and application in the field. More details can be found in the full paper.
