"SAIL-VL2": difference between revisions

Latest revision as of 14:16, 23 February 2026

SAIL-VL2

@@ Line 9: / Line 9: @@
 '''SAIL-VL2'''
- An open-source vision-language foundation model designed for comprehensive multimodal understanding and reasoning.
+<!--Vision-language foundation model for comprehensive multimodal understanding and reasoning. It achieves state-of-the-art performance across diverse benchmarks through data curation, progressive training, and a sparse MoE architecture.-->
-SAIL-VL2 advances efficient vision-language modeling through innovations in architecture, training strategies, and data curation. The model demonstrates that smaller, well-designed models can be competitive with much larger counterparts across diverse multimodal tasks.
-== Source ==
+== Sources ==
+[https://arxiv.org/abs/2509.14033 Source: arXiv]
+[https://github.com/BytedanceDouyinContent/SAIL-VL2 Source: GitHub]
 [https://huggingface.co/papers/2509.14033 Source: Hugging Face]