TLDR: Code-Switching In-Context Learning (CSICL) is a novel prompting strategy that addresses the ‘translation barrier’ in large language models (LLMs). By gradually transitioning from a target non-English language to English within few-shot demonstrations and instructions, CSICL acts as a linguistic bridge, improving cross-lingual alignment. Experiments across 4 LLMs, 6 datasets, and 10 languages show CSICL consistently outperforms existing cross-lingual in-context learning baselines, with significant gains in both target and unseen languages, especially in low-resource settings and for translation and reasoning tasks, without requiring additional training.
Large Language Models (LLMs) have shown impressive capabilities across many languages, but a significant challenge remains: their tendency to rely on English as a foundational internal representation. This creates what researchers call a ‘translation barrier.’ When an LLM struggles to implicitly translate non-English input into English for reasoning, its performance in that non-English language can drop sharply. This limitation restricts how inclusive and effective LLM-based applications can be for diverse linguistic communities.
Existing methods for cross-lingual in-context learning (X-ICL) often use demonstrations in a single language, which can inadvertently reinforce this English-centric reliance rather than overcoming it. This is where a new approach, Code-Switching In-Context Learning (CSICL), comes into play.
Introducing Code-Switching In-Context Learning (CSICL)
CSICL is a straightforward yet powerful prompting strategy designed to help LLMs navigate this translation barrier. It works by progressively transitioning from a target language (the non-English language) to English within both the demonstrations provided to the model and the instructions given for the task. Essentially, CSICL explicitly guides the LLM’s reasoning process, acting as an ‘implicit linguistic bridge’ that improves how well different languages align within the model’s internal workings.
The core idea is to scaffold the reasoning process. Imagine an LLM being asked a question in Korean. Instead of just giving it Korean examples, CSICL provides examples that start in Korean, gradually introduce English words and phrases, and eventually transition to a fully English equivalent. This gradual shift encourages the LLM to align its cross-lingual representations directly, rather than solely depending on a hidden, often unreliable, internal translation step.
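To make this concrete, here is a minimal sketch of what one gradually code-switched demonstration could look like. The Korean wording and the number of mixing steps below are illustrative assumptions, not the paper’s actual template (the paper’s Figure 1 uses a similar pituitary-gland query).

```python
# Illustrative only: a single query rendered at increasing levels of English
# mixing, from fully Korean (0% English) to fully English (100% English).
# The wording and the number of steps are assumptions, not the paper's prompt.
gradual_demo = [
    "뇌하수체는 어떤 호르몬을 분비하나요?",              # 0% English
    "뇌하수체는 어떤 hormone을 분비하나요?",             # one English word embedded
    "The pituitary gland는 어떤 hormone을 분비하나요?",  # an English phrase embedded
    "Which hormones does the pituitary gland secrete?",  # 100% English
]

for step, text in enumerate(gradual_demo):
    print(f"step {step}: {text}")
```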
How CSICL Works in Practice
CSICL employs two main components:
- Gradual Code-Switching Few-Shot Demonstrations: These are examples in which a query in a target language (e.g., Korean) progressively incorporates English words and phrases, moving from 0% English to 100% English (see the sketch after this list). The mixing uses intra-sentential code-switching, where the matrix language (the dominant language) is the target language and the embedded language is English.
- Gradual Translation Instruction: The LLM is explicitly instructed to follow a similar progressive translation process: to ‘gradually translate this non-English query into English, then think in English, and finally answer the question.’ The model shows its step-by-step translation before giving the final answer in the target language.
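Putting the two components together, a CSICL prompt might be assembled roughly as sketched below. This is a hypothetical sketch: the helper name build_csicl_prompt, the demonstration text, and the overall template are assumptions, with the instruction wording paraphrased from the description above; the paper’s actual prompt format may differ.

```python
# Minimal sketch of CSICL prompt assembly (assumed template, not the paper's).

def build_csicl_prompt(demos: list[list[str]], query: str) -> str:
    """Join the gradual translation instruction, the code-switching
    demonstrations, and the target-language query into one prompt."""
    instruction = (
        "Gradually translate the query into English, think in English, "
        "and then answer in the original language, showing each "
        "translation step before the final answer."
    )
    # Each demonstration is a list of lines moving from the target language
    # to English, followed by a worked answer.
    demo_blocks = ["\n".join(demo) for demo in demos]
    return "\n\n".join([instruction, *demo_blocks, f"Query: {query}"])


if __name__ == "__main__":
    demo = [
        "Q: 뇌하수체는 어떤 호르몬을 분비하나요?",
        "Q: 뇌하수체는 which hormone을 분비하나요?",
        "Q: Which hormones does the pituitary gland secrete?",
        "A: 성장호르몬과 프로락틴 등을 분비합니다. (growth hormone, prolactin, ...)",
    ]
    print(build_csicl_prompt([demo], "갑상선은 어떤 호르몬을 분비하나요?"))
```

In a real run, the demonstrations would come from the task’s few-shot pool and the user’s query would be supplied in the target language, with the model itself producing the gradual translation as the instruction asks.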
For a visual representation of this process, you can refer to Figure 1 in the original research paper, which illustrates how a Korean query about the pituitary gland is gradually translated into English within the prompt. To learn more about the technical details and see the full paper, you can visit the research paper here: Code-Switching In-Context Learning for Cross-Lingual Transfer of Large Language Models.
Extensive Experiments and Promising Results
The researchers conducted a wide range of experiments to test CSICL’s effectiveness. They used four state-of-the-art multilingual LLMs (Qwen3-32B, deepseek-chat-v3.1, grok-4-fast, and Gemini 2.5 Flash), six diverse datasets, and ten languages. These experiments covered both knowledge-intensive tasks (like general knowledge and cultural knowledge) and reasoning-oriented tasks (like mathematical reasoning and medical question answering).
The results were consistently positive: CSICL significantly outperformed traditional X-ICL baselines. It achieved gains of 3.1 percentage points (p.p.) in target languages and 1.9 p.p. in unseen languages. The improvements were even more dramatic in low-resource language settings, with gains of 14.7% in target languages and 5.3% in unseen languages. This highlights CSICL’s practical value in scenarios where data is scarce.
An ablation study confirmed that both the gradual code-switching demonstrations and the gradual translation instruction are crucial for these improvements. Interestingly, transitioning from the target language to English was more effective than the reverse, supporting the idea that CSICL helps LLMs align with their English-centric latent space. The gains also cannot be explained simply by having more sentences in the demonstration: CSICL consistently outperformed paraphrased monolingual demonstrations.
CSICL showed particularly strong gains in machine translation tasks (6.8 p.p. in target languages) and reasoning-oriented tasks (average gains of 5.4 p.p.). This suggests that by encouraging LLMs to ‘think in English’ through gradual transitions, CSICL helps them leverage their strongest reasoning capabilities.
Moving Towards More Inclusive LLMs
The findings from this research establish code-switching as a principled and robust method for overcoming the translation barrier during LLM inference. By integrating code-switching into in-context learning, CSICL helps LLMs achieve more equitable and effective multilingual performance without requiring additional training or resources. This work opens up new avenues for research, framing language alternation not as a challenge, but as a valuable resource for bridging linguistic gaps and fostering truly inclusive multilingual AI systems.


