"MiniMax-M1": difference between revisions

Revision as of 8 July 2025, 13:17

under construction

Definition

French

MiniMax-M1

English

MiniMax-M1

MiniMax-M1 is presented by its developers as the world's first open-weight, large-scale hybrid-attention reasoning model. It is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism. The model is based on the earlier MiniMax-Text-01 model, which contains a total of 456 billion parameters, with 45.9 billion parameters activated per token. Consistent with MiniMax-Text-01, the M1 model natively supports a context length of 1 million tokens, 8x the context size of DeepSeek R1. Furthermore, the lightning attention mechanism in MiniMax-M1 enables efficient scaling of test-time compute –
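The figures quoted above can be put in proportion with a little arithmetic. The following is a minimal sketch in Python, not anything taken from the MiniMax sources: the constant and function names are hypothetical, and the comparison context length is only what the stated "8x" ratio implies, so it should be read as a rounded estimate rather than a specification of DeepSeek R1.

```python
# Figures quoted in the definition above; all names here are illustrative.
TOTAL_PARAMS_B = 456.0      # total parameters, in billions
ACTIVE_PARAMS_B = 45.9      # parameters activated per token, in billions
CONTEXT_TOKENS = 1_000_000  # native context length, in tokens
CONTEXT_RATIO = 8           # stated multiple of DeepSeek R1's context size


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the parameters that are activated for a single token."""
    return active_b / total_b


def implied_reference_context(context_tokens: int, ratio: int) -> int:
    """Context length implied for the comparison model by the stated ratio."""
    return context_tokens // ratio


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    ref = implied_reference_context(CONTEXT_TOKENS, CONTEXT_RATIO)
    print(f"Active share per token: {frac:.1%}")
    print(f"Implied comparison context: {ref:,} tokens")
```

Under these assumptions, roughly one tenth of the parameters are active for any given token, which is the usual motivation for a Mixture-of-Experts design: total capacity is large while the per-token compute stays closer to that of a much smaller dense model.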

Sources

Source: arXiv

Source: Hugging Face

Source: MiniMax-M1

@@ Line 13: / Line 13: @@
   MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. MiniMax-M1 is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism. The model is developed based on our previous MiniMax-Text-01 model, which contains a total of 456 billion parameters with 45.9 billion parameters activated per token. Consistent with MiniMax-Text-01, the M1 model natively supports a context length of 1 million tokens, 8x the context size of DeepSeek R1. Furthermore, the lightning attention mechanism in MiniMax-M1 enables efficient scaling of test-time compute –
+== Sources ==
+[https://arxiv.org/abs/2506.13585   Source : arxiv]
-== Source ==
 [https://huggingface.co/MiniMaxAI/MiniMax-M1-80k   Source : huggingface]
+[https://minimax-m1.com/   Source : MiniMax-M1]
 [[Catégorie:vocabulary]]
