Contributions of Pitpitt



7 July 2025

5 July 2025

2 July 2025

  • 15:41, 2 July 2025 diff hist +399 N Ovis-U1 Page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' Ovis-U1''' == Anglais == '''Ovis-U1''' Ovis-U1, a 3-billion-parameter model, combines multimodal understanding, text-to-image generation, and image editing, achieving state-of-the-art performance in various benchmarks. == Source == [https://huggingface.co/papers/2506.23044 Source : huggingface] Catégorie:vocabulary »
  • 15:40, 2 July 2025 diff hist +1,151 N Modèle multimodal en poupées russes Page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''Matryoshka Multimodal Models''' Matryoshka Multimodal Models learn to represent visual content as nested sets of visual tokens that capture information across multiple coarse-to-fine granularities. Our approach offers several unique benefits for LMMs: (1) One can explicitly control the visual granularity per test instance during inference, e.g. , adjusting the... »

26 June 2025

22 June 2025

19 June 2025

  • 13:36, 19 June 2025 diff hist +2 Cache clé-valeur No edit summary
  • 13:35, 19 June 2025 diff hist +1,043 N Cache clé-valeur Page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''KV Cache''' a KV cache stores intermediate key (K) and value (V) computations for reuse during inference (after training), which results in a substantial speed-up when generating text. The downside of a KV cache is that it adds more complexity to the code, increases memory requirements (the main reason I initially didn't include it in the book), and can't be us... » (see the sketch after this list)
  • 13:34, 19 June 2025 diff hist +854 N MiniMax-M1 Page created with « ==en construction== == Définition == == Français == ''' MiniMax-M1''' == Anglais == '''MiniMax-M1''' MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. MiniMax-M1 is powered by a hybrid Mixture-of-Experts (MoE) architecture combined with a lightning attention mechanism. The model is developed based on our previous MiniMax-Text-01 model, which contains a total of 456 billion parameters with 45.9 billion parameters ac... »
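
The Cache clé-valeur entry above describes how a KV cache reuses key (K) and value (V) computations during autoregressive decoding. Below is a minimal illustrative sketch in Python; it is not taken from any of the pages cited here, and the class name SimpleKVCache and the single-head, NumPy-only setup are assumptions made for clarity.

```python
# Minimal sketch of a key-value (KV) cache for autoregressive decoding.
# Assumptions: single attention head, toy dimensions, illustrative names.
import numpy as np

class SimpleKVCache:
    def __init__(self):
        self.keys = []    # cached K vectors, one per generated token
        self.values = []  # cached V vectors, one per generated token

    def append(self, k, v):
        # Store this step's key/value so later steps can reuse them
        # instead of recomputing K and V for the whole prefix.
        self.keys.append(k)
        self.values.append(v)

    def attend(self, q):
        # Attention of the current query over all cached positions.
        K = np.stack(self.keys)                  # (seq_len, d_model)
        V = np.stack(self.values)                # (seq_len, d_model)
        scores = K @ q / np.sqrt(q.shape[-1])    # scaled dot-product
        weights = np.exp(scores - scores.max())  # stable softmax
        weights /= weights.sum()
        return weights @ V                       # (d_model,)

# Usage: at each decoding step, compute k, v, q for the new token only,
# append k and v, then attend with q.
cache = SimpleKVCache()
rng = np.random.default_rng(0)
for step in range(4):
    k, v, q = rng.normal(size=(3, 8))  # stand-ins for projected activations
    cache.append(k, v)
    context = cache.attend(q)
```

At each step only the new token's K and V are computed and appended; attention for the current query then runs over all cached positions, which is the reuse that yields the speed-up (and the extra memory cost) described in the entry above.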

11 June 2025

7 June 2025