ATLAS Cross-Lingual Transfer Matrix

License: Apache-2.0

Steward: MIT

Task: NLP

Release Date: 2/19/2026

Format: CSV

Size: 2.36 KB


Description

This matrix helps determine which languages to train a language model on. For each Target Language whose performance we want to optimize (shown as rows), it estimates how beneficial it is to train with each Source Language (shown as columns). The scores are derived empirically from 750+ training experiments in the ATLAS multilingual scaling laws paper: https://arxiv.org/pdf/2510.22037. Higher scores indicate greater synergy, whereas lower scores indicate more interference.
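
As a rough illustration of how the matrix might be used, the sketch below ranks Source Languages for one Target Language by score. It assumes a layout this card does not confirm: a header row of source-language codes, with the first cell of each later row naming the target. The function names `load_matrix` and `rank_sources` are hypothetical, and the values in `TOY_CSV` are placeholders, not scores from the dataset.

```python
import csv
import io


def load_matrix(csv_text):
    """Parse CSV text into {target: {source: score}}.

    Assumed layout (not confirmed by the dataset card): the header row lists
    source languages, and the first cell of each later row names the target.
    """
    rows = list(csv.reader(io.StringIO(csv_text.strip())))
    sources = rows[0][1:]
    matrix = {}
    for row in rows[1:]:
        target, cells = row[0], row[1:]
        # Skip empty cells so missing pairs do not silently become zeros.
        matrix[target] = {s: float(c) for s, c in zip(sources, cells) if c.strip()}
    return matrix


def rank_sources(matrix, target, top_k=3):
    """Return up to top_k (source, score) pairs for a target, highest score first."""
    ranked = sorted(matrix[target].items(), key=lambda kv: kv[1], reverse=True)
    # Leave out the target itself so only transfer from other languages is ranked.
    return [(s, v) for s, v in ranked if s != target][:top_k]


# Toy placeholder values for illustration only; NOT taken from the dataset.
TOY_CSV = """
target,en,es,fr
en,,0.1,0.2
es,0.3,,0.5
fr,0.4,0.6,
"""

if __name__ == "__main__":
    matrix = load_matrix(TOY_CSV)
    print(rank_sources(matrix, "es", top_k=2))
```

To try this on the real file, read its contents into a string and pass that to `load_matrix`, adjusting the parsing if the actual header or row labels differ from the layout assumed here.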

Specifics

Licensing

Apache License 2.0 (Apache-2.0)

https://spdx.org/licenses/Apache-2.0.html

Considerations

Restrictions/Special Constraints

No restrictions.

Forbidden Usage

No forbidden usage.

Processes

Intended Use

To help language model developers decide which mix of training languages to use for a given target language.

Metadata