MiMo
Version of 23 May 2025 at 14:30
under construction
Definition
XXXXXXXXX
French
MiMo-7B
English
MiMo-7B
MiMo-7B is a large language model specifically designed for reasoning tasks. The model is optimized across both the pre-training and post-training stages to unlock its reasoning potential. Despite having only 7 billion parameters, MiMo-7B achieves superior performance on mathematics and code reasoning tasks, outperforming even much larger models, including OpenAI's o1-mini.
Sources
[https://arxiv.org/html/2505.07608v1 Source: arxiv]
