MetaCLIP
under construction
Definition
MetaCLIP is a recipe for training CLIP (contrastive language-image pre-training) models from scratch on worldwide web-scale image-text pairs, using metadata construction and data curation to balance the training distribution across concepts and languages.
French
MetaCLIP
English
MetaCLIP
MetaCLIP is the first recipe for training CLIP models from scratch on worldwide web-scale image-text pairs spanning 300+ languages. The work addresses the challenge of scaling CLIP beyond English-only data while avoiding the "curse of multilinguality", where multilingual models perform worse on English tasks than their English-only counterparts. The paper demonstrates that, with proper data curation, metadata construction, and training framework design, English and non-English data can benefit each other. MetaCLIP 2, trained on worldwide web-scale image-text pairs, improves zero-shot classification and multilingual benchmarks without system-level confounding factors.
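At the core of the MetaCLIP recipe is matching raw web image-text pairs against a list of metadata entries (e.g., concept and term lists) and then balancing the matched counts so that a few head concepts do not dominate the training set. The Python sketch below illustrates this match-then-balance idea under stated assumptions: the metadata list, the threshold value `t`, and the function name `curate_pairs` are illustrative choices, not the paper's exact implementation.

<pre>
import random

def curate_pairs(pairs, metadata, t=20_000, seed=0):
    """Illustrative sketch of MetaCLIP-style curation (not the official code).

    pairs:    list of (image_url, caption) tuples harvested from the web
    metadata: list of query strings (e.g., concept names, encyclopedia titles)
    t:        per-entry cap; entries with more than t matches are sub-sampled
    """
    rng = random.Random(seed)

    # 1) Substring-match each caption against the metadata entries.
    matches = {entry: [] for entry in metadata}
    for pair in pairs:
        _, caption = pair
        text = caption.lower()
        for entry in metadata:
            if entry.lower() in text:
                matches[entry].append(pair)

    # 2) Balance: keep all pairs for tail entries (<= t matches) and
    #    independently sub-sample head entries down to roughly t pairs each.
    curated = set()  # a set deduplicates pairs matched by several entries
    for entry, matched in matches.items():
        if len(matched) <= t:
            kept = matched
        else:
            keep_prob = t / len(matched)
            kept = [p for p in matched if rng.random() < keep_prob]
        curated.update(kept)
    return list(curated)

# Toy usage with hypothetical data: a head concept ("cat") and a tail one.
pairs = [("img%d.jpg" % i, "a photo of a cat") for i in range(30)]
pairs += [("rare%d.jpg" % i, "an axolotl in a tank") for i in range(3)]
curated = curate_pairs(pairs, metadata=["cat", "axolotl"], t=10)
print(len(curated))  # "cat" pairs are sub-sampled; "axolotl" pairs kept whole
</pre>

The balancing step is what flattens the head-heavy distribution of raw web data; this sketch only conveys that structure, while the paper's actual metadata construction and thresholds are more elaborate.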
Source
Contributors: wiki
