New Arabic language model outperforms larger competitors


The Technology Innovation Institute released Falcon-Arabic, a 7B-parameter language model built on the Falcon 3 architecture. The model handles Arabic, English, and several other languages with a 32,000-token context window. In testing, Falcon-Arabic outperforms other Arabic language models of similar size, and some larger models, on benchmarks including Arabic MMLU, Exams, MadinahQA, and Aratrust. The developers extended the base model's tokenizer with 32,000 Arabic-specific tokens and trained on native Arabic datasets rather than translated content. The model supports both Modern Standard Arabic and regional dialects, addressing the relative scarcity of Arabic-language AI tools. Users can test Falcon-Arabic through an online playground. (Hugging Face)
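To make the tokenizer-extension idea concrete, here is a minimal pure-Python sketch of what growing a vocabulary with language-specific tokens looks like in principle. This is a toy illustration, not TII's actual tooling or the Hugging Face tokenizer API: existing tokens keep their ids, and genuinely new tokens are appended with fresh ids.

```python
# Toy illustration of tokenizer vocabulary extension (not TII's actual
# pipeline): a base vocabulary is grown with language-specific tokens,
# and new tokens receive consecutive ids after the existing maximum.

def extend_vocab(base_vocab, new_tokens):
    """Return a copy of base_vocab with new_tokens appended.

    Tokens already present keep their original ids; genuinely new
    tokens get consecutive ids starting after the current maximum.
    """
    vocab = dict(base_vocab)
    next_id = max(vocab.values()) + 1 if vocab else 0
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = next_id
            next_id += 1
    return vocab

base = {"<unk>": 0, "hello": 1, "world": 2}
# "hello" is already known, so only the two Arabic tokens are added.
extended = extend_vocab(base, ["مرحبا", "بالعالم", "hello"])
```

In a real model, each appended token also requires a new row in the embedding matrix (and the new rows must then be trained), which is why Falcon-Arabic's developers paired the tokenizer extension with training on native Arabic data.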

