"Sparse attention": difference between revisions
m (Patrickdrouin moved the page Native Sparse Attention to Sparse attention)
No edit summary
== Definition ==
Attention mechanism in which each query attends to only a subset of token pairs rather than to all of them, reducing the computational cost of standard (dense) attention while largely preserving model capability, notably for long-context modeling.
== French ==
'''attention clairsemée'''

'''attention parcimonieuse'''

'''attention creuse'''

'''attention clairsemée native'''
== English ==
'''sparse attention'''

'''native sparse attention'''
<!-- Long-context modeling is crucial for next-generation language models, yet the high computational cost of standard attention mechanisms poses significant computational challenges. Sparse attention offers a promising direction for improving efficiency while maintaining model capabilities. We present NSA, a Natively trainable Sparse Attention mechanism that integrates algorithmic innovations with hardware-aligned optimizations to achieve efficient long-context modeling. NSA employs a dynamic hierarchical sparse strategy, combining coarse-grained token compression with fine-grained token selection to preserve both global context awareness and local precision. Our approach advances sparse attention design with two key innovations: (1) We achieve substantial speedups through arithmetic intensity-balanced algorithm design, with implementation optimizations for modern hardware. (2) We enable end-to-end training, reducing pretraining computation without sacrificing model performance.-->
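The commented abstract above describes NSA, which combines coarse-grained token compression with fine-grained token selection. As a minimal illustration of the general idea of sparse attention (not the NSA algorithm itself), the sketch below keeps only the top-k scoring keys per query; the function name, the top-k selection scheme, and all parameter values are illustrative assumptions, not from the source.

```python
# Minimal sketch of sparse attention (illustrative only; NOT the NSA
# mechanism described in the abstract above). Each query attends to just
# its top_k highest-scoring keys instead of all keys, which is the core
# idea behind "attention clairsemée" / sparse attention.
import numpy as np

def sparse_attention(q, k, v, top_k=2):
    """Single-head attention where each query keeps only its top_k keys."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)              # (n_q, n_k) dense scores
    # Threshold = k-th largest score per row; mask everything below it.
    kth = np.sort(scores, axis=-1)[:, -top_k][:, None]
    masked = np.where(scores >= kth, scores, -np.inf)
    # Softmax over the surviving (sparse) scores only.
    weights = np.exp(masked - masked.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
n, d = 6, 4
q, k, v = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d))
out = sparse_attention(q, k, v, top_k=2)
print(out.shape)  # (6, 4)
```

Real systems such as NSA avoid computing the dense score matrix in the first place; this sketch masks it after the fact purely to make the sparsity pattern easy to see.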
== Sources ==

[https://espace.etsmtl.ca/id/eprint/3299/ Aroosa Hameed (2023) - attention clairsemée]

[https://fr.wikipedia.org/wiki/Attention_(apprentissage_automatique) Wikipedia - attention clairsemée]

[https://aarnphm.xyz/thoughts/papers/DeepSeek_V3_2.pdf DeepSeek - sparse attention]
[[Catégorie:Publication]]
Revision as of 31 March 2026, 15:23
Contributors: Patrick Drouin, wiki





