TLDR: The Khana dataset is a new, comprehensive benchmark for Indian cuisine, featuring 131,000 images across 80 categories. It addresses the significant gap in food AI research for Indian dishes, which are complex and diverse. The dataset supports classification, segmentation, and retrieval tasks, providing a structured taxonomy and baseline evaluations using state-of-the-art models like ConvNeXT, which achieved the highest accuracy. Khana aims to advance food recognition technology for Indian culinary traditions.
As the world’s palate expands and interest in diverse culinary experiences grows, the demand for advanced food image models is soaring. These models are crucial for applications ranging from accurate food recognition and recipe suggestions to dietary tracking and automated meal planning. However, despite a wealth of food datasets available, a significant void has existed in comprehensively capturing the rich and varied nuances of Indian cuisine.
This gap is now being addressed with the introduction of Khana, a groundbreaking benchmark dataset specifically designed for food image classification, segmentation, and retrieval of Indian dishes. Khana stands out by establishing a detailed taxonomy of Indian cuisine and offering an impressive collection of approximately 131,000 images, spread across 80 distinct labels, each with a resolution of 500×500 pixels.
The Challenge of Indian Cuisine in AI
Indian cuisine is a vibrant tapestry of flavors and textures, characterized by vast regional diversity, intricate preparation methods, and subtle visual distinctions. Close visual resemblances between distinct dishes pose a unique challenge for image classification algorithms. While food classification has seen considerable effort for Western and other Asian cuisines (predominantly Japanese and Chinese), Indian cuisine has remained largely underrepresented in research.
Khana directly tackles this issue by providing a comprehensive and challenging benchmark. It aims to bridge the gap between academic research and practical development, serving as a valuable resource for researchers and developers alike who are keen on leveraging the rich culinary heritage of India in real-world applications.
Building the Khana Dataset
The creation of Khana involved a meticulous process of data collection and cataloguing. Images were gathered from popular search engines and online food delivery platforms such as Swiggy and Zomato using web crawlers. Duplicate images were carefully removed, and low-quality images were filtered out to ensure the dataset’s integrity.
A key feature of Khana is its innovative taxonomy, which organizes food items hierarchically based on their preparation methods, regional origins, and cultural significance. This structure provides well-defined categories and subcategories, such as ‘breakfast’, ‘main course’, ‘snacks’, and ‘beverages’, with specific dishes like dosa, biryani, gulab jamun, and chaas. The dataset also accounts for multilingual conventions, grouping varied Hinglish keywords for the same dish (e.g., ‘pani puri’, ‘pani poori’, ‘golgappa’). Manual verification by annotators further ensured label accuracy.
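The paper does not publish its full alias table, but the keyword-grouping step described above can be sketched as a simple lookup from raw search terms to a canonical dish label. The aliases below are illustrative examples (drawn from the dish names mentioned in this article), not the dataset's actual mapping:

```python
# Illustrative alias table: canonical label -> Hinglish/spelling variants.
# These entries are examples only, not Khana's actual mapping.
ALIASES = {
    "pani puri": ["pani puri", "pani poori", "golgappa"],
    "chaas": ["chaas", "chhaas", "buttermilk"],
    "biryani": ["biryani", "biriyani"],
}

# Invert the table into a direct lookup: raw keyword -> canonical label.
KEYWORD_TO_LABEL = {
    kw: label for label, variants in ALIASES.items() for kw in variants
}

def canonical_label(keyword: str) -> str:
    """Normalize a raw crawler keyword to its canonical dish label.
    Unknown keywords pass through unchanged."""
    return KEYWORD_TO_LABEL.get(keyword.strip().lower(), keyword.strip().lower())
```

In practice such a table would be built and checked by the human annotators the article mentions; the lookup itself is the trivial part.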
Dataset Statistics and Characteristics
The Khana dataset comprises around 131,000 images across 80 different classes, with each image standardized to 500×500 pixels. It is split into training, validation, and test sets with a 70%, 15%, and 15% distribution, respectively. While comprehensive, the dataset does exhibit an imbalanced class distribution, with popular dishes like masala dosa and biryani having more samples than niche items, a common challenge that may require data augmentation techniques for optimal model performance.
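The paper does not spell out the exact splitting procedure, but a per-class (stratified) 70/15/15 split is the natural choice for an imbalanced dataset like this one, since it preserves each class's share in every partition. A minimal sketch, under that assumption:

```python
import random

def stratified_split(samples_by_class, fracs=(0.70, 0.15, 0.15), seed=0):
    """Split each class independently into train/val/test so that all
    three partitions keep the (imbalanced) class distribution.
    `samples_by_class` maps a label to its list of image paths."""
    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, val, test = [], [], []
    for label, items in samples_by_class.items():
        items = items[:]          # copy before shuffling
        rng.shuffle(items)
        n_train = int(len(items) * fracs[0])
        n_val = int(len(items) * fracs[1])
        train += [(label, x) for x in items[:n_train]]
        val += [(label, x) for x in items[n_train:n_train + n_val]]
        test += [(label, x) for x in items[n_train + n_val:]]
    return train, val, test
```

With roughly 131,000 images, a 70/15/15 split works out to about 91,700 training, 19,650 validation, and 19,650 test images overall.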
Experimental Baselines and Promising Results
To establish initial benchmarks, the creators of Khana evaluated several state-of-the-art models, including Residual Networks (ResNet), EfficientNet, Vision Transformer (ViT), and ConvNeXT. These models, pre-trained on the extensive ImageNet dataset, were fine-tuned on Khana for image classification tasks.
The experimental analysis revealed that the ConvNeXT-S model achieved the highest performance, boasting a top-1 accuracy of 86.72% and a top-5 accuracy of 97.58%. This performance surpassed other leading models, demonstrating the dataset’s utility in pushing the boundaries of food recognition technology.
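Top-1 and top-5 accuracy are the standard classification metrics here: a prediction counts as a top-k hit if the true label appears among the model's k highest-scoring classes. A minimal pure-Python illustration of the computation:

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k
    highest-scoring predicted classes.
    `scores` is a list of per-class score rows; `labels` the true indices."""
    hits = 0
    for row, label in zip(scores, labels):
        # indices of the k largest scores in this row
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)
```

Top-5 accuracy is especially informative for Khana, since visually similar dishes often land in each other's top predictions even when the top-1 guess is wrong.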
Looking Ahead
While Khana represents a significant leap forward, the researchers acknowledge limitations such as class imbalance and the need for more fine-grained distinctions in evaluation metrics. Future work includes expanding the dataset with more images for underrepresented categories, incorporating new cuisines, improving annotations, and exploring the potential of multi-modal Large Language Models (LLMs) for querying images and comparing embeddings.
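The embedding-comparison idea mentioned above is essentially nearest-neighbor retrieval over image embeddings. A minimal sketch using cosine similarity (the gallery names and vectors below are purely illustrative, not actual Khana embeddings):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_emb, gallery, k=3):
    """Return the names of the k gallery items most similar to the query.
    `gallery` maps an item name to its embedding vector."""
    ranked = sorted(
        gallery.items(),
        key=lambda kv: cosine_similarity(query_emb, kv[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]
```

Real embeddings would come from a fine-tuned vision backbone (or a multi-modal model, as the authors suggest), and a large gallery would use an approximate nearest-neighbor index rather than a full sort.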
Khana is poised to empower research, fuel innovation, and celebrate the diversity and richness of Indian food, one pixel at a time. For more in-depth information, see the full research paper.