Cross-Lingual Activation Steering for Multilingual Language Models
Summary
This paper introduces Cross-Lingual Activation Steering (CLAS), a training-free inference-time method to improve multilingual performance in large language models. CLAS addresses the performance gap between dominant (high-resource) and non-dominant (low-resource) languages by modulating neuron activations during inference. It selectively boosts shared neurons and suppresses language-specific ones, rebalancing representations to enhance cross-lingual transfer. The method uses parallel inputs to analyze neuron behavior, categorizes neurons into shared and language-specific types, and applies a lightweight steering rule in selected "bridge layers" near the model's output. Experiments on XNLI and XQuAD benchmarks show CLAS improves average accuracy by 1.93% and F1 score by 0.94% for Llama, and 0.45% and 1.10% for Qwen, respectively, while maintaining anchor language (English) performance. The results indicate that effective transfer arises from functional divergence rather than strict alignment to the anchor language. CLAS outperforms a baseline intervention method (Intμ) and demonstrates that targeted activation steering can unlock latent multilingual capacity without modifying model weights.
Cross-Lingual Activation Steering for Multilingual Language Models
Rhitabrat Pokharel1, Ameeta Agrawal1, Tanay Nagar2
1Department of Computer Science, Portland State University, USA
{pokharel,ameeta}@pdx.edu
2Independent Researcher
tanaynagar7@gmail.com
Abstract
Large language models exhibit strong multilingual capabilities, yet significant performance gaps persist between dominant and non-dominant languages. Prior work attributes this gap to imbalances between shared and language-specific neurons in multilingual representations. We propose Cross-Lingual Activation Steering (CLAS), a training-free inference-time intervention that selectively modulates neuron activations. We evaluate CLAS on classification and generation benchmarks, achieving average improvements of 2.3% (Acc.) and 3.4% (F1), respectively, while maintaining high-resource language performance. We discover that effective transfer operates through functional divergence rather than strict alignment; performance gains correlate with increased language cluster separation. Our results demonstrate that targeted activation steering can unlock latent multilingual capacity in existing models without modification to model weights.¹

¹We will release the code to support future work.
1 Introduction
Large Language Models (LLMs) perform well on many multilingual tasks, but a substantial performance gap remains between dominant (high-resource) and non-dominant (low-resource) languages. This gap is largely attributed to the heavy skew of pre-training corpora toward dominant languages, which enables models to develop richer representations in those languages. While expanding multilingual training data is an obvious solution, it is often infeasible due to cost and limited data availability.

Prior work has explored data- and training-based solutions such as multilingual instruction tuning (Chen et al., 2024b; Shaham et al., 2024), supervised fine-tuning (Chen et al., 2024a), and model alignment using smaller multilingual datasets (She et al., 2024; Gao et al., 2024; Pokharel et al., 2025). Although effective, these methods still depend on annotated datasets or additional training.

A complementary line of work studies multilingual behavior at the neuron level as a lightweight alternative. Researchers have identified shared, partially shared, and language-specific neurons (Wang et al., 2024; Tang et al., 2024), and shown that shared neurons and overlapping subspaces in middle and upper layers play a central role in cross-lingual transfer (Tezuka and Inoue, 2025; Xu et al., 2025). At the same time, direct manipulation of language-specific neurons has produced mixed results (Mondal et al., 2025). Other work suggests that multilingual models often reason through an English-like latent space before generating outputs in the target language (Etxaniz et al., 2024; Zhao et al., 2024a).

Building on these insights, we introduce Cross-Lingual Activation Steering (CLAS), a test-time neuron-level method that rebalances shared and language-specific representations during inference. CLAS gently boosts neurons that encode cross-lingual structure, suppresses those that over-specialize to a single language, and blends the result with the model's original activations. This steers the model toward representations that better support cross-lingual transfer, improving performance on non-dominant languages without requiring additional data or parameter updates.
While Mondal et al. (2025) also explore test-time neuron interventions, their approach differs substantially from ours. Their method overwrites selected neuron activations with corpus-level statistical constants (e.g., mean or percentile values), producing the same activation regardless of the input and effectively erasing and re-imprinting those neurons. In contrast, our method preserves proportionality to the model's actual activations: the final representation remains a blend with the original signal rather than a hard replacement.
Our main contributions are as follows:
• We propose CLAS, a training-free activation steering mechanism that selectively modulates neurons to enhance cross-lingual transfer during inference.
• We conduct a comprehensive analysis of cross-lingual representations and find that effective transfer is driven by functional divergence rather than proximity to the anchor language.
• We demonstrate the effectiveness of CLAS across diverse tasks and models, achieving improvements on both classification and generation benchmarks while maintaining anchor-language stability.
2 Cross-Lingual Activation Steering (CLAS)
We introduce Cross-Lingual Activation Steering (CLAS), a training-free, test-time intervention for multilingual models. CLAS has three stages: (i) construct parallel inputs across languages, (ii) summarize neuron behavior using simple activation statistics and group neurons into coarse categories, and (iii) apply a lightweight steering rule that modulates activations at inference time. This section describes each component.
2.1 Preliminaries
We define a configuration C = {L, ℓ_anchor, B, T_act, β, γ, α}, where L is the set of languages used for analysis, ℓ_anchor is an anchor language (typically the model's strongest language), B is the set of "bridge layers" selected for intervention, and T_act is the activation threshold for determining neuron activity. Finally, (β, γ, α) control the strength of boosting selected activations, suppressing selected activations, and blending between streams during steering.
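For concreteness, a minimal sketch of how this configuration could be represented in code. This is an illustration under our own naming; the field names and default values (the bridge layers shown are Qwen's, and β, γ match the values selected later in §4.4) are assumptions, not the authors' released implementation.

```python
from dataclasses import dataclass, field

@dataclass
class CLASConfig:
    """Illustrative container for the CLAS configuration C (names are assumptions)."""
    languages: list = field(default_factory=lambda: ["en", "de", "hi", "zh"])  # L
    anchor: str = "en"                        # l_anchor: language left untouched
    bridge_layers: tuple = (24, 25)           # B: layers where steering is applied
    act_threshold: float = 0.0                # T_act: activation above this counts as "active"
    beta: float = 0.4                         # boost applied to partial-shared neurons
    gamma: float = 0.2                        # suppression applied to language-specific neurons
    alpha: float = 1.0                        # blend coefficient (sign sets steering direction)
```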
2.2 Parallel Input Construction
To analyze neuron behavior across languages, we require aligned multilingual inputs that express the same underlying content. For each sample index i, we construct a set of parallel inputs

x^(i) = { x^(i)_ℓ : ℓ ∈ L },

where each x^(i)_ℓ is the same text expressed in language ℓ. These parallel inputs are used only for measuring neuron activations and do not update the model in any way.
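A small sketch of this data structure, using hypothetical in-memory translations rather than the actual XQuAD loader; the helper name and example sentences are placeholders.

```python
# Hypothetical parallel rows: one dict per sample index i, keyed by language code.
parallel_rows = [
    {"en": "The committee approved the budget.",
     "de": "Der Ausschuss genehmigte das Budget.",
     "es": "El comité aprobó el presupuesto."},
    # ... more samples (the paper uses 100 parallel XQuAD samples)
]

def parallel_inputs(rows, languages):
    """x^(i) = {x^(i)_l : l in L}: keep only samples available in every analysis language."""
    return [{lang: row[lang] for lang in languages}
            for row in rows if all(lang in row for lang in languages)]

x_sets = parallel_inputs(parallel_rows, ["en", "de", "es"])
```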
Figure 1: Distribution of neuron types per layer across models. Llama has a total of 32 layers and Qwen has 28 layers. The bridge layers (purple shading) are selected near the final layers, where the proportion of partial-shared activations is highest.
2.3 Neuron Statistics and Categorization
To understand the model's multilingual structure and identify where CLAS should intervene, we first analyze neuron activations across languages and layers. Following Wang et al. (2024), we group neurons into four mutually exclusive categories: dead (never active), language-specific (active for one language), partial-shared (active for some languages), and all-shared (active for all languages).

However, we differ from prior work in how these categories are computed. Wang et al. (2024) assign categories at the instance level: a neuron is considered all-shared if it activates for all languages on a single parallel example. As a result, neuron categories can vary across inputs and tasks. In contrast, we adopt a dataset-level view. We compute the mean activation of each neuron using 100 parallel samples from the XQuAD dataset across 12 languages and two models (Llama 3.1 8B and Qwen 2.5 7B), and use this aggregate statistic to assign each neuron to a single category. This makes each neuron's category a stable property of the model and language set rather than something that changes across examples.
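A minimal sketch of this dataset-level categorization, assuming per-language mean activations have already been collected for one MLP layer (e.g., via forward hooks); the array names and integer labels are illustrative, not the authors' code.

```python
import numpy as np

def categorize_neurons(mean_acts, t_act=0.0):
    """mean_acts maps language -> (num_neurons,) mean activation for one layer.
    Returns one integer label per neuron: 0=dead, 1=language-specific,
    2=partial-shared, 3=all-shared (dataset-level, not per-instance)."""
    langs = list(mean_acts)
    active = np.stack([mean_acts[l] > t_act for l in langs])  # (num_langs, num_neurons)
    n_active = active.sum(axis=0)

    labels = np.full(active.shape[1], 2, dtype=int)           # default: partial-shared
    labels[n_active == 0] = 0                                  # dead: never active
    labels[n_active == 1] = 1                                  # active for exactly one language
    labels[n_active == len(langs)] = 3                         # all-shared: active for every language
    return labels
```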
Layer-wise Activation Patterns. For each layer, we compute the percentage of neurons in each category. Figure 1 summarizes the resulting distributions. We observe several consistent patterns:

Early layers. Early layers contain a high proportion of partial-shared neurons, suggesting that these layers are involved in mapping language-specific or language-related features into shared representations (Tang et al., 2024; Zhang et al., 2025).

Middle layers. In the middle layers, we observe an increase in all-shared neurons alongside a rise in dead neurons and a reduction in partial-shared neurons. This is consistent with prior findings that mid-level representations become more language-agnostic and semantically focused (Zhang et al., 2025; Xu et al., 2025).

Final layers. In the upper layers, partial-shared neurons become more prevalent again, and dead neurons decrease. This aligns with evidence that language-specific processing re-emerges near the output as the model prepares to generate text in the target language (Tang et al., 2024; Zhang et al., 2025).

These observations are descriptive and do not imply that these categories are sharply separated or functionally pure; rather, they provide a coarse summary of how multilingual structure is distributed across layers.

Selecting Bridge Layers. Based on these statistics, we select a small set of bridge layers for intervention. In both Llama and Qwen, layers 24-29 and 24-25, respectively, exhibit relatively high proportions of partial-shared neurons together with relatively low proportions of dead and language-specific neurons. We therefore hypothesize that intervening in these layers provides a useful balance: representations are shared enough to support cross-lingual steering, while not yet so specialized that small perturbations directly disrupt surface generation. We exclude the final two layers from intervention, as these layers are known to be strongly specialized for producing language-specific outputs and are particularly sensitive to perturbation (Zhao et al., 2024b). All layer selections are fixed prior to evaluation and are not tuned on test data.
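A sketch of one way to derive this selection from the per-layer statistics, assuming categorize_neurons (above) has been run for every layer. The ranking heuristic (highest partial-shared share, excluding the final two layers) is our paraphrase of the criterion described in the text, not the authors' exact procedure.

```python
def layer_category_percentages(layer_labels):
    """layer_labels: list over layers of per-neuron label arrays from categorize_neurons."""
    return [{
        "dead": 100 * (labels == 0).mean(),
        "lang_specific": 100 * (labels == 1).mean(),
        "partial_shared": 100 * (labels == 2).mean(),
        "all_shared": 100 * (labels == 3).mean(),
    } for labels in layer_labels]

def pick_bridge_layers(stats, num_layers=2, exclude_last=2):
    """Pick the layers with the highest partial-shared share, excluding the
    final `exclude_last` layers, which are sensitive to perturbation."""
    candidates = range(len(stats) - exclude_last)
    ranked = sorted(candidates, key=lambda i: stats[i]["partial_shared"], reverse=True)
    return sorted(ranked[:num_layers])
```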
2.4 Activation Steering via CLAS
Given an input x_ℓ in language ℓ, the model computes an intermediate MLP activation using the standard SwiGLU transformation

h = σ(W_g x) ⊙ (W_u x),

where W_g and W_u are the gating and up-projection matrices, and σ is the nonlinear activation function. For non-anchor languages ℓ ≠ ℓ_anchor, CLAS applies a lightweight, deterministic modification to this activation using masks derived from neuron categories, with the goal of adjusting the relative contribution of shared and language-specific neurons.

Partial-shared neuron adjustment. Let M_shared denote the mask over partial-shared neurons. Empirically, these neurons often account for a large fraction of the active dimensions, which can reduce the relative influence of language-specific features. We therefore apply a controlled rescaling

h_1 = h ⊙ (1 + β·M_shared),

where β controls the magnitude of the adjustment applied to partial-shared neurons. This operation does not introduce new information but changes the relative weighting of existing components.

Language-specific neuron adjustment. Similarly, let M_spec denote the mask over language-specific neurons. These neurons are typically under-represented. We apply a complementary adjustment

h_2 = h_1 ⊙ (1 − γ·M_spec),

where γ controls the strength of the adjustment applied to language-specific neurons.

Blend adjustment. We then blend the modified activation with the original:

h_final = (1 − α)·h + α·h_2,

where α controls the overall strength and direction of the intervention. Positive α increases the influence of the adjusted representation, emphasizing the relative contribution of shared neurons. Negative α reduces the influence of the adjusted representation and increases the relative contribution of language-specific components. In practice, we treat α as a steering coefficient whose effect is model- and task-dependent; both positive and negative values can be beneficial in different regimes. The adjusted activation is then passed through the down-projection: y = W_d·h_final.
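The three adjustments compose into a single elementwise rule. Below is a minimal PyTorch sketch of this steering step as we reconstruct it from the equations (not the authors' released implementation); the masks are assumed to be 0/1 vectors over the MLP hidden dimension, derived from the neuron categories of the current non-anchor language.

```python
import torch

def clas_steer(h, shared_mask, spec_mask, beta, gamma, alpha):
    """Apply CLAS to an intermediate SwiGLU activation h = sigma(W_g x) * (W_u x).

    shared_mask: 1.0 for partial-shared neurons, else 0.0
    spec_mask:   1.0 for language-specific neurons, else 0.0
    """
    h1 = h * (1.0 + beta * shared_mask)      # boost partial-shared neurons
    h2 = h1 * (1.0 - gamma * spec_mask)      # suppress language-specific neurons
    return (1.0 - alpha) * h + alpha * h2    # blend with the original activation
```

In a SwiGLU MLP this would be applied to the gated activation just before the down-projection (y = W_d·h_final), and only in the bridge layers for non-anchor inputs.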
2.5 Anchor Language Handling
For the anchor language ℓ_anchor (i.e., English), no modification is applied: h_final = h. We keep the anchor activations untouched because the anchor language serves as a stable reference point for cross-lingual alignment. Modifying it would risk introducing unnecessary distortion into a representation that is already well supported by the model. All adjustments are therefore applied only to non-anchor languages, allowing them to shift relative to a fixed reference.

3 Experimental Setting
We describe the models, datasets, and evaluation setup used to assess CLAS in a controlled multilingual setting, focusing on cross-lingual transfer to non-English languages.

3.1 Models
We use Llama 3.1 8B Instruct (Grattafiori et al., 2024) and Qwen 2.5 7B Instruct (Team, 2024). All models are kept frozen; CLAS is applied only at inference time and does not modify model parameters. We intervene only in the selected bridge layers defined by the configuration.

3.2 Datasets and Languages
We evaluate on two benchmarks.

XNLI (Conneau et al., 2018) is a natural language inference dataset with parallel data in 15 languages. For evaluation, we adopt a constrained generation approach. We first prompt the model with the premise and hypothesis and instruct it to classify the relationship by predicting a single integer token: "0" (Entailment), "1" (Neutral), or "2" (Contradiction). We then extract the logits for these specific target tokens at the final position and select the class with the highest probability. Results are reported in terms of Accuracy and F1 scores.
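A sketch of this constrained scoring step, assuming a Hugging Face causal LM and tokenizer; the prompt wording and function name are placeholders, not the exact prompt used in the paper.

```python
import torch

def classify_nli(model, tokenizer, premise, hypothesis):
    """Score the single-token labels "0"/"1"/"2" at the final position and
    return the argmax class (0=entailment, 1=neutral, 2=contradiction)."""
    prompt = (f"Premise: {premise}\nHypothesis: {hypothesis}\n"
              "Answer with 0 (entailment), 1 (neutral), or 2 (contradiction): ")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        logits = model(**inputs).logits[0, -1]           # next-token logits at the final position
    label_ids = [tokenizer.encode(t, add_special_tokens=False)[0] for t in ["0", "1", "2"]]
    return int(torch.argmax(logits[label_ids]).item())
```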
XQuAD (Artetxe et al., 2019) is a multilingual question answering dataset in 12 languages. For evaluation, we consider a generative reading comprehension task in which the model is provided with the context paragraph and question, then prompted to generate the answer span directly. We employ greedy decoding with a strict maximum new-token limit of 32. Results are reported using the token-level F1 score.
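Token-level F1 can be computed with the standard SQuAD-style overlap; a simple whitespace-token sketch follows (the official evaluation additionally normalizes punctuation and articles, which is omitted here).

```python
from collections import Counter

def token_f1(prediction, reference):
    """Token-level F1 between a generated answer and the gold span."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```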
3.3 Evaluation and Implementation
Neuron statistics are computed using the parallel subset (100 samples) where the same example is available in all analysis languages; downstream evaluation uses the full dataset for each language separately. We evaluate a mix of high- and low-resource languages and use English as the anchor language, since prior work shows that multilingual models often rely on English-like internal representations. All other languages are treated as non-anchor and used to measure cross-lingual effects. All experiments are run on a single NVIDIA A40 GPU with a maximum sequence length of 512. Samples are processed individually (no batching) to ensure accurate activation capture. Steering parameters (β, γ, α) are selected via grid search (more details in §4.4).

4 Results and Discussion
In this section, we discuss the findings of the experiments with CLAS and how it improves multilingual performance.

4.1 Downstream Performance
Tables 1 and 2 summarize the main results, comparing CLAS against the base models and the Intμ baseline from Mondal et al. (2025), in which the mean value is used during intervention.

XNLI classification. On XNLI, CLAS improves average accuracy over the base models, although the magnitude and consistency of the gains differ. For Llama, CLAS yields an average improvement of +1.93 accuracy points over the baseline, which is statistically significant (p < 0.05). These gains are driven by large improvements in several languages, including Urdu (+7.00), Chinese (+5.89), Hindi (+5.80), and Greek (+5.39), although some languages show regressions (e.g., Bulgarian, Swahili, and Vietnamese). This indicates that CLAS can substantially help underperforming languages but may also introduce instability in some cases. For Qwen, the average improvement is smaller (+0.45) but more consistent across languages, and also statistically significant (p < 0.001). Most gains fall between +0.2 and +0.8, and none are strongly negative. This suggests that CLAS is more conservative on Qwen: it yields smaller but more stable improvements, reflecting Qwen's stronger baseline multilingual representations. On both models, performance on the anchor language (English) remains unchanged, indicating that CLAS does not degrade anchor-language behavior while modifying non-anchor representations. Compared to Intμ, CLAS consistently performs better: Intμ is slightly harmful on Llama and neutral on Qwen, whereas CLAS yields positive mean gains on both.

l     | Llama: base   Intμ    CLAS   | Qwen: base   Intμ    CLAS
ar    | 38.88        -0.76   +4.71   | 60.45       +0.09   +0.45
bg    | 42.28        -1.60   -2.14   | 60.75       -0.01   +0.71
de    | 43.65        +2.12   +1.62   | 66.22       +0.11   +0.77
el    | 40.58        +0.72   +5.39   | 58.02       +0.04   +0.48
en    | 52.63         0.00    0.00   | 71.73       +0.05   +0.05
es    | 46.95        -0.36   -0.72   | 65.56       -0.05   +0.73
fr    | 44.53        -1.42   +2.34   | 64.47       -0.14   +0.66
hi    | 41.78        -0.68   +5.80   | 55.56       -0.05   +0.51
ru    | 43.75        +0.04   +1.14   | 61.87       -0.17   +0.25
sw    | 41.10        +0.32   -3.06   | 40.31       -0.35   +0.21
th    | 40.68        +3.71   +4.09   | 57.80       -0.02   +0.38
tr    | 44.17        -0.36   +0.66   | 59.18       -0.08   +0.72
ur    | 36.09        +0.28   +7.00   | 51.07       +0.09   +0.43
vi    | 44.81        -1.96   -3.73   | 62.11       +0.07   +0.31
zh    | 44.63        -3.25   +5.89   | 63.81        0.00   +0.08
Avg.  | 43.10        -0.21   +1.93∗  | 59.93       -0.03   +0.45⋆
σ     |  2.82         2.77    3.19   |  6.78        6.84    6.87

Table 1: Accuracy and improvements on XNLI using Llama and Qwen. en is removed during statistical analysis. Asterisks denote statistical significance of the improvement over the baseline (paired t-test): ∗p < 0.05, ⋆p < 0.001.
l     | Llama: base   Intμ    CLAS   | Qwen: base   Intμ    CLAS
ar    | 23.20        -1.95   +1.01   | 28.73       +0.26   +5.06
de    | 38.54        -5.46   +6.25   | 34.77       -2.76   -0.96
el    | 25.18        +1.50   +0.23   | 37.06       -0.06   -1.51
en    | 23.47         0.00    0.00   | 48.88       +0.11    0.00
es    | 34.23        +1.08   +1.84   | 38.53       -2.53   +2.63
hi    | 29.47        -0.09   -0.10   | 33.12       -2.12   -5.28
ro    | 30.01        +6.97   -0.03   | 34.48       -0.48   +0.48
ru    | 27.32        -0.08   -0.06   | 27.58       -0.58   +8.94
th    | 19.23        -3.40   -0.46   | 28.51       +7.48   +0.98
tr    | 23.74        -2.11   +1.14   | 29.19       -2.19   -2.31
vi    | 34.91        +5.08   +1.67   | 31.67       -5.67   +1.36
zh    | 12.84        +0.05   -0.25   | 20.24       -1.24   +3.76
Avg.  | 26.85        +0.13   +0.94   | 32.73       -0.82   +1.10
σ     |  7.43         8.75    8.83   |  5.16        5.45    4.95

Table 2: F1 scores and improvements on XQuAD for Llama and Qwen models. en is removed during statistical analysis. Statistical analysis (paired t-test) indicates that the improvements for both Llama (p = 0.10) and Qwen (p = 0.33) are not statistically significant (p > 0.05).

XQuAD generative question answering. On XQuAD, CLAS also improves average performance, but the effects are more variable and not statistically significant. On Llama, CLAS improves average token-level F1 by +0.94, with notable gains for German (+6.25) and Spanish (+1.84), but small regressions for others. Intμ has a negligible effect. On Qwen, CLAS improves average F1 by +1.10, with large gains for Russian (+8.94) and Arabic (+5.06), but also substantial regressions for Hindi (-5.28) and Turkish (-2.31). This indicates that CLAS can unlock large improvements for some languages but can also strongly harm others in the generative setting. Compared to XNLI, XQuAD exhibits much higher variance, with both large positive and negative swings. This is expected given that span extraction is sensitive to token boundaries, local lexical cues, and translation artifacts, making it inherently less stable than classification.
Figure 2: Cosine similarity with English across languages on each task using the Llama model. Similar results (Appendix A) were obtained with the Qwen model.

Statistical perspective. Our statistical analysis shows that CLAS significantly improves cross-lingual performance on discriminative tasks without increasing variance across languages. On XNLI, both Llama and Qwen show significant gains after excluding English (p < 0.05 and p < 0.001), while cross-language variance remains stable (p > 0.05), indicating a uniform improvement rather than trade-offs across languages. On XQuAD, average F1 also increases, but the gains are not statistically significant, likely due to higher variability in generative evaluation.
Figure 3: Relationship between alignment change to English (y-axis) and CLAS performance improvement (x-axis) for each language across different tasks and models.

4.2 Cross-Lingual Alignment Analysis
In this subsection, we analyze the impact of our activation steering mechanism on representation alignment and examine how these geometric shifts relate to cross-lingual performance gains on XNLI and XQuAD.

Decoupling geometric alignment from functional performance. Figure 2 shows cosine similarity between each target language and English in the bridge layers before and after CLAS. Across tasks and models, CLAS generally reduces similarity to English rather than increasing it. This effect is strongest for Llama, especially on XQuAD, where several languages move noticeably farther from English, while Qwen shows smaller and more uniform changes.
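A sketch of this alignment measurement, assuming mean-pooled bridge-layer hidden states per language; the pooling choice and function name are assumptions on our part.

```python
import numpy as np

def cosine_to_anchor(lang_reps, anchor="en"):
    """lang_reps maps language -> mean bridge-layer representation (hidden_dim,).
    Returns the cosine similarity of each non-anchor language to the anchor."""
    a = lang_reps[anchor]
    a = a / np.linalg.norm(a)
    return {lang: float(np.dot(v / np.linalg.norm(v), a))
            for lang, v in lang_reps.items() if lang != anchor}
```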
This pattern matches our neuron analysis (Section 2.3). CLAS rebalances partial-shared and language-specific neurons, reducing over-reliance on English-centric features and allowing more target-language structure to influence predictions. As a result, CLAS improves performance not by pulling languages closer to English, but by relaxing excessive alignment. The effect is larger on XQuAD than on XNLI, likely because span extraction is more sensitive to surface form. Reducing English alignment can help match target-language expressions, though it can also increase variance and occasional regressions.

Performance gains are driven by functional divergence. We fit a linear regression to quantify the correlation between these metrics. Figure 3 shows the scatter plots with fitted trend lines, analyzing the trade-off between task performance improvement and the shift in alignment relative to English. Contrary to the design of the CLAS mechanism, which explicitly aims to amplify shared neurons (h ⊙ (1 + β·M_shared)) and attenuate language-specific ones (h ⊙ (1 − γ·M_spec)), we observe an inverse relationship on the XNLI task. This anomaly is likely attributable to a negative value of α, which effectively reverses the polarity of the modulation. For both Llama and Qwen, larger performance gains are often accompanied by a reduction in cosine similarity to the English anchor (negative slope), suggesting that functional optimization for reasoning tasks may require diverging from the strict geometric space of the anchor language. In contrast, the XQuAD benchmarks display a mixed relationship, indicating that for extraction-based tasks, the correlation between shared-neuron activation and geometric alignment is less predictable.
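The trend lines in Figure 3 correspond to a simple per-language regression of performance change on alignment change; a sketch using scipy, with purely hypothetical input values for illustration.

```python
from scipy import stats

# One point per language: change in cosine similarity to English, and CLAS gain (hypothetical values).
alignment_change = [-0.12, -0.08, -0.03, 0.01, -0.15]
clas_improvement = [5.8, 4.1, 0.7, -2.1, 7.0]

slope, intercept, r, p, stderr = stats.linregress(clas_improvement, alignment_change)
print(f"slope={slope:.3f}, r={r:.2f}, p={p:.3f}")  # negative slope: gains coincide with divergence
```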
4.3 Representation Space Visualization
We apply t-SNE dimensionality reduction to visualize the aggregated hidden states of the English anchor and all target languages in a shared 2D space, which provides useful intuition about representational structure. Representations are extracted from the model's penultimate layer for all languages. Figure 4 shows separate visualizations for the before-CLAS and after-CLAS conditions. Language centroids are computed as the mean position of each language's embeddings. Visual analysis reveals three distinct geometric phenomena.

Figure 4: t-SNE visualization of cross-lingual representations before (left) and after (right) CLAS intervention. Each point represents a sentence embedding, with colors denoting languages. English, marked with stars, is the anchor language.

CLAS reshapes representations without collapsing them into English. The "After" visualizations confirm that the mechanism drives functional divergence rather than assimilation. Language clusters remain distinct and in some cases become more clearly separated, as seen in Figure 4. This suggests that CLAS improves performance by reorganizing language-specific representation spaces, rather than by enforcing assimilation into a single shared geometry.

Moderate reorganization is associated with larger gains. On XNLI, this reorganization appears broadly beneficial, as many languages improve when their representations become more structured and distinct.
On XQuAD, the relationship is more selective: languages that start in a mixed or ambiguous region and become more clearly separated after CLAS tend to show larger gains (e.g., de, ar, es on Llama; ru, ar, es on Qwen), while languages that are already highly separated show smaller changes. This suggests that CLAS is most helpful when it resolves representational overlap, rather than when representations are already well-formed.

Distance to English does not explain improvements. Neither initial nor final proximity to the English cluster reliably predicts performance changes. Some languages far from English (e.g., ur, hi on XNLI-Llama) improve substantially, while some closer languages improve little or regress. After CLAS, high-performing languages are distributed across the space rather than concentrated near English. This indicates that CLAS effectiveness depends on how representations are reorganized internally, not simply on how close they are to the anchor language.
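A sketch of the visualization setup, assuming sentence-level hidden states have been mean-pooled from the penultimate layer; the scikit-learn TSNE settings shown are defaults, not tuned values from the paper.

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_with_centroids(reps, seed=0):
    """reps maps language -> (num_sentences, hidden_dim) penultimate-layer embeddings.
    Returns 2D points per language and each language's centroid."""
    langs = list(reps)
    counts = [reps[l].shape[0] for l in langs]
    stacked = np.concatenate([reps[l] for l in langs])
    points = TSNE(n_components=2, random_state=seed).fit_transform(stacked)

    out, start = {}, 0
    for lang, n in zip(langs, counts):
        out[lang] = points[start:start + n]
        start += n
    centroids = {lang: pts.mean(axis=0) for lang, pts in out.items()}
    return out, centroids
```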
4.4 Optimal Steering Intensity and Direction
We tune three parameters: β (boosting shared neurons), γ (suppressing language-specific neurons), and α (overall steering strength and direction). A grid search over β ∈ {0.2, 0.4, 0.6} and γ ∈ {0.1, 0.2, 0.4} shows that moderate values work best. We select β = 0.4 and γ = 0.2, which emphasize shared structure without introducing excessive noise. We tune α separately.
Figure 5: The optimal value of α depends on the model and the task.
Figure 6: Impact of β and γ on XNLI accuracy. The blue line shows performance when varying β (with γ=0 fixed), and the red line shows performance when varying γ (with β=0 fixed). In all experiments, α=1.
A grid search (Figure 5) on 200 samples per language per task shows that the optimal magnitude and direction of steering vary: some settings benefit from reinforcing shared representations (positive α), while others benefit from amplifying language-specific signals (negative α). This confirms that the balance between shared semantics and language-specific precision is task- and model-dependent, consistent with prior work (Tang et al., 2024; Wang et al., 2024). Figure 6 further shows that relying exclusively on either shared or language-specific neurons is suboptimal; the best performance comes from a calibrated balance between the two.
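A sketch of this two-stage sweep, assuming an evaluate(beta, gamma, alpha) function that scores a 200-sample-per-language development subset. The β and γ grids match the values reported above; the α grid is illustrative (cf. the ranges in Figure 5), not the exact one used by the authors.

```python
from itertools import product

def tune_clas(evaluate):
    """evaluate(beta, gamma, alpha) -> mean dev score over non-anchor languages.
    Stage 1 fixes alpha=1 and sweeps (beta, gamma); stage 2 sweeps alpha separately."""
    best_bg = max(product([0.2, 0.4, 0.6], [0.1, 0.2, 0.4]),
                  key=lambda bg: evaluate(beta=bg[0], gamma=bg[1], alpha=1.0))
    beta, gamma = best_bg
    alphas = [-8, -6, -4, -2, -1, 1, 2]      # illustrative range; sign sets steering direction
    alpha = max(alphas, key=lambda a: evaluate(beta=beta, gamma=gamma, alpha=a))
    return beta, gamma, alpha
```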
Table 3: Comparison of outputs for German XQuAD samples. English translations provided via Google Translate.

Question: Was benannte die Generalkonferenz für Maß und Gewicht 1960 nach Tesla? (What did the General Conference on Weights and Measures name after Tesla in 1960?)
Llama: SI-Maßeinheit der magnetischen Flussdichte. stop. Die SI-Maßeinheit der magnetischen Flussdichte (SI unit of magnetic flux density. stop. The SI unit of magnetic flux density)
CLAS: SI-Maßeinheit der magnetischen Flussdichte. (SI unit of magnetic flux density.)

Question: Die Lutherbibel beeinflusste wessen englische Übersetzung der Bibel? (The Luther Bible influenced whose English translation of the Bible?)
Llama: William Tyndale Die Lutherbibel beeinflusste die englische Bibel von William Tyndale. Tyndale war ein eng (William Tyndale's English Bible was influenced by the Luther Bible. Tyndale was a close)
CLAS: William Tyndale. (William Tyndale.)
4.5 Qualitative Analysis
Table 3 shows the qualitative impact of CLAS on German generation. In these examples, the baseline model suffers from severe repetition loops and verbosity leakage. In contrast, CLAS successfully suppresses these behaviors, demonstrating that CLAS generates concise responses while maintaining accuracy.

5 Related Work
Cross-lingual Transfer. Cross-lingual transfer in multilingual LLMs has been widely studied, with prior work showing that transfer quality varies across tasks, languages, and models. For example, Hu et al. (2025) analyze factors that influence cross-lingual performance on reasoning tasks.

Many approaches improve transfer through additional training. These include adding multilingual data during instruction tuning (Shaham et al., 2024), combining supervised fine-tuning with preference alignment (Lai et al., 2024), and constructing new multilingual pre-training datasets (He et al., 2025).
Other work uses translation-based fine-tuning (Lee et al., 2025), layer-wise fine-tuning (Bandarkar et al., 2025), or language-specific adapters (Zhao et al., 2025). Continued pre-training has also been shown to improve transfer for some language pairs (Wu et al., 2025). An alternative line of work uses prompts rather than parameter updates. For example, Tanwar et al. (2023) use multilingual in-context examples, and Yoo et al. (2025) study in-context learning in code-switching settings.

Neuron Behavior and Cross-lingual Transfer. Several studies examine cross-lingual transfer at the neuron level. Huang et al. (2025) show that activation similarity across languages is associated with better transfer. Other work identifies language-specific and shared neurons (Tang et al., 2024; Zhang et al., 2025; Tezuka and Inoue, 2025), and shows that shared neurons often concentrate in middle and upper layers (Xu et al., 2025).

Results on directly intervening on neurons are mixed. Mondal et al. (2025) find that manipulating language-specific neurons yields limited gains, while Wang et al. (2024) show that neuron roles vary by task and model. Our work differs in that we apply test-time steering that blends rather than overwrites activations, preserving representational structure. We show that the direction and magnitude of steering matter, and that carefully controlled neuron-level interventions can improve cross-lingual transfer.
6 Conclusion
We presented CLAS, a training-free activation steering method that improves cross-lingual transfer by rebalancing shared and language-specific neurons at inference time. CLAS improves performance across both classification and generation tasks, and our analysis shows that gains come from functional divergence rather than forcing representations to align closely with the anchor language. Languages benefit most by shifting from ambiguous regions into distinct clusters, whereas initial anchor proximity does not reliably predict success. These patterns are task-dependent, highlighting the need for task-aware cross-lingual methods. Future work could explore adaptive steering strategies that adjust intervention strength based on representational structure, better understand saturation effects in already well-separated clusters, and extend CLAS to other modalities and training settings.

Limitations
We evaluate CLAS on two multilingual instruction-tuned LLMs (Qwen and Llama) and across multiple multilingual benchmarks spanning NLI (XNLI) and QA (XQuAD). However, our analysis and suggested intervention are anchored to English, i.e., we quantify alignment shifts via cosine similarity to an English anchor. This may not reflect behavior under alternative references/anchors or truly language-pair-specific settings. Additionally, it is important to note that CLAS is not uniformly beneficial: while average gains across languages are positive, we also observe language-specific regressions (e.g., Hindi, Greek). This indicates that test-time steering can be brittle for certain model/language combinations. Finally, our mechanistic analysis does not isolate the role of attention heads or other circuit components, limiting the granularity of causal attribution.
Ethical Considerations
CLAS is a test-time activation intervention and can change model behavior in ways that are not always predictable across languages, including occasional performance degradations. As with other steering approaches, such interventions can be re-purposed in undesirable ways (e.g., modulating output without transparency). Thus, we recommend caution and task-specific testing and validation before real-world deployment. Our experiments use publicly available benchmarks and do not involve human subjects or personal data.

References
Mikel Artetxe, Sebastian Ruder, and Dani Yogatama. 2019. On the cross-lingual transferability of monolingual representations. CoRR, abs/1910.11856.
Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj, Rui Hou, Nayan Singhal, Hongjiang Lv, and Bing Liu. 2025. Layer swapping for zero-shot cross-lingual transfer in large language models. In The Thirteenth International Conference on Learning Representations.
Nuo Chen, Zinan Zheng, Ning Wu, Ming Gong, Dongmei Zhang, and Jia Li. 2024a. Breaking language barriers in multilingual mathematical reasoning: Insights and observations. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 7001–7016, Miami, Florida, USA. Association for Computational Linguistics.
Pinzhen Chen, Shaoxiong Ji, Nikolay Bogoychev, Andrey Kutuzov, Barry Haddow, and Kenneth Heafield. 2024b. Monolingual or multilingual instruction tuning: Which makes a better alpaca. In Findings of the Association for Computational Linguistics: EACL 2024, pages 1347–1356, St. Julian's, Malta. Association for Computational Linguistics.
Alexis Conneau, Ruty Rinott, Guillaume Lample, Adina Williams, Samuel R. Bowman, Holger Schwenk, and Veselin Stoyanov. 2018. XNLI: Evaluating cross-lingual sentence representations. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics.
Julen Etxaniz, Gorka Azkune, Aitor Soroa, Oier Lopez de Lacalle, and Mikel Artetxe. 2024. Do multilingual language models think better in English? In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 2: Short Papers), pages 550–564, Mexico City, Mexico. Association for Computational Linguistics.
Changjiang Gao, Hongda Hu, Peng Hu, Jiajun Chen, Jixing Li, and Shujian Huang. 2024. Multilingual pretraining and instruction tuning improve cross-lingual knowledge alignment, but only shallowly. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 6101–6117, Mexico City, Mexico. Association for Computational Linguistics.
Aaron Grattafiori, Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Alex Vaughan, and 1 others. 2024. The Llama 3 herd of models. arXiv preprint arXiv:2407.21783.
Kaiyu He, Tong Zhou, Yubo Chen, Delai Qiu, Shengping Liu, Kang Liu, and Jun Zhao. 2025. Semantic pivots enable cross-lingual transfer in large language models. arXiv preprint arXiv:2505.16385.
Peng Hu, Sizhe Liu, Changjiang Gao, Xin Huang, Xue Han, Junlan Feng, Chao Deng, and Shujian Huang. 2025. Large language models are cross-lingual knowledge-free reasoners. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 1525–1542, Albuquerque, New Mexico. Association for Computational Linguistics.
Chongxuan Huang, Yongshi Ye, Biao Fu, Qifeng Su, and Xiaodong Shi. 2025. From neurons to semantics: Evaluating cross-linguistic alignment capabilities of large language models via neurons alignment. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 28956–28974, Vienna, Austria. Association for Computational Linguistics.
Wen Lai, Mohsen Mesgar, and Alexander Fraser. 2024. LLMs beyond English: Scaling the multilingual capability of LLMs with cross-lingual feedback. In Findings of the Association for Computational Linguistics: ACL 2024, pages 8186–8213, Bangkok, Thailand. Association for Computational Linguistics.
Jungseob Lee, Seongtae Hong, Hyeonseok Moon, and Heuiseok Lim. 2025. Cross-lingual optimization for language transfer in large language models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 15100–15119, Vienna, Austria. Association for Computational Linguistics.
Soumen Kumar Mondal, Sayambhu Sen, Abhishek Singhania, and Preethi Jyothi. 2025. Language-specific neurons do not facilitate cross-lingual transfer. In The Sixth Workshop on Insights from Negative Results in NLP, pages 46–62, Albuquerque, New Mexico. Association for Computational Linguistics.
Rhitabrat Pokharel, Yufei Tao, and Ameeta Agrawal. 2025. Capo: Confidence aware preference optimization learning for multilingual preferences. arXiv preprint arXiv:2511.07691.
Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, and Matan Eyal. 2024. Multilingual instruction tuning with just a pinch of multilinguality. In Findings of the Association for Computational Linguistics: ACL 2024, pages 2304–2317, Bangkok, Thailand. Association for Computational Linguistics.
Shuaijie She, Wei Zou, Shujian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, and Jiajun Chen. 2024. MAPO: Advancing multilingual reasoning through multilingual-alignment-as-preference optimization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 10015–10027, Bangkok, Thailand. Association for Computational Linguistics.
Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, and Ji-Rong Wen. 2024. Language-specific neurons: The key to multilingual capabilities in large language models. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5701–5715, Bangkok, Thailand. Association for Computational Linguistics.
Eshaan Tanwar, Subhabrata Dutta, Manish Borthakur, and Tanmoy Chakraborty. 2023. Multilingual LLMs are better cross-lingual in-context learners with alignment. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 6292–6307, Toronto, Canada. Association for Computational Linguistics.
Qwen Team. 2024. Qwen2.5: A party of foundation models.
Hinata Tezuka and Naoya Inoue. 2025. The transfer neurons hypothesis: An underlying mechanism for language latent space transitions in multilingual LLMs. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 31730–31780, Suzhou, China. Association for Computational Linguistics.
Weixuan Wang, Barry Haddow, Minghao Wu, Wei Peng, and Alexandra Birch. 2024. Sharing matters: Analysing neurons across languages and tasks in LLMs. arXiv preprint arXiv:2406.09265.
Linjuan Wu, Hao-Ran Wei, Huan Lin, Tianhao Li, Baosong Yang, Fei Huang, and Weiming Lu. 2025. Enhancing LLM language adaption through cross-lingual in-context pre-training. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 27140–27154, Suzhou, China. Association for Computational Linguistics.
Yuemei Xu, Kexin Xu, Jian Zhou, Ling Hu, and Lin Gui. 2025. Linguistic neuron overlap patterns to facilitate cross-lingual transfer on low-resource languages. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 27646–27661, Suzhou, China. Association for Computational Linguistics.
Haneul Yoo, Jiho Jin, Kyunghyun Cho, and Alice Oh. 2025. Code-switching in-context learning for cross-lingual transfer of large language models. arXiv preprint arXiv:2510.05678.
Shimao Zhang, Zhejian Lai, Xiang Liu, Shuaijie She, Xiao Liu, Yeyun Gong, Shujian Huang, and Jiajun Chen. 2025. How does alignment enhance LLMs' multilingual capabilities? A language neurons perspective. arXiv preprint arXiv:2505.21505.
Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, and Lidong Bing. 2024a. How do large language models handle multilingualism? In The Thirty-eighth Annual Conference on Neural Information Processing Systems.
Yiran Zhao, Wenxuan Zhang, Guizhen Chen, Kenji Kawaguchi, and Lidong Bing. 2024b. How do large language models handle multilingualism? Advances in Neural Information Processing Systems, 37:15296–15319.
Yiran Zhao, Wenxuan Zhang, Huiming Wang, Kenji Kawaguchi, and Lidong Bing. 2025. AdaMergeX: Cross-lingual transfer with large language models via adaptive adapter merging. In Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), pages 9785–9800, Albuquerque, New Mexico. Association for Computational Linguistics.

A Alignment vs. CLAS Performance
To further understand the relationship between alignment and CLAS performance, we generate heatmaps in Figure 7. Across all four settings, most languages exhibit a negative alignment change, indicating that CLAS generally reduces the similarity of non-English representations to English. At the same time, many of these same languages show positive performance improvements, as indicated by darker shading.

Figure 7: Heatmap showing alignment change vs. CLAS improvement for each language across tasks and models.
Figure 8 presents the plots for cosine similarity with English across languages on each task using the Qwen model.

Figure 8: Cosine similarity with English across languages on each model and task.