"Multi-Token Projection": difference between revisions

@@ Line 18: / Line 18: @@
 ''' MTP'''
+<!--Technique that enables a model to predict multiple tokens in a single forward pass and to pre-plan and generate representations that facilitate more accurate and potentially faster prediction of future tokens. It is used in DeepSeek models and works by adding specialized modules that predict not only the next token but also several tokens ahead in the sequence.-->
 == Sources ==

Revision as of 15:00, 25 June 2025