MetaCLIP



under construction

Definition

A recipe for training CLIP models from scratch on worldwide web-scale image-text pairs spanning 300+ languages, combining data curation, metadata construction, and training framework design so that English and non-English data benefit each other.

French

MetaCLIP

English

MetaCLIP

The first recipe for training CLIP models from scratch on worldwide web-scale image-text pairs spanning 300+ languages. The work addresses the challenge of scaling CLIP beyond English-only data while avoiding the "curse of multilinguality", where multilingual models perform worse on English tasks than their English-only counterparts. The paper demonstrates that, with proper data curation, metadata construction, and training framework design, English and non-English data can benefit each other.
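As a rough illustration of the data-curation step mentioned above, the sketch below shows the metadata-matching-and-balancing idea behind MetaCLIP-style curation: alt-texts are matched against a metadata list by substring search, and over-represented ("head") metadata entries are capped so the distribution is balanced. The in-memory implementation, the metadata list, and the threshold t are illustrative assumptions; the pipeline described in the paper operates at web scale with per-language metadata.

import random
from collections import defaultdict

def curate(pairs, metadata, t=20_000, seed=0):
    """Simplified MetaCLIP-style curation sketch (illustrative, not the paper's exact pipeline).

    pairs    : list of (image, alt_text) tuples
    metadata : iterable of metadata entry strings
    t        : cap on the number of pairs kept per metadata entry
    """
    rng = random.Random(seed)
    matches = defaultdict(list)  # metadata entry -> indices of matching pairs
    for i, (_, text) in enumerate(pairs):
        for entry in metadata:
            if entry in text:  # naive substring matching against metadata
                matches[entry].append(i)
    kept = set()
    for entry, indices in matches.items():
        if len(indices) > t:  # head entry: subsample down to t pairs
            indices = rng.sample(indices, t)
        kept.update(indices)
    return [pairs[i] for i in sorted(kept)]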

MetaCLIP 2, trained on worldwide web-scale image-text pairs, improves zero-shot classification and performance on multilingual benchmarks without system-level confounding factors.
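For context on the zero-shot classification use case, here is a minimal sketch of zero-shot image classification with a CLIP-style checkpoint through the Hugging Face transformers CLIP API. The checkpoint identifier, the example image URL, and the candidate labels are assumptions for illustration; the source page does not name a specific model.

import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

checkpoint = "facebook/metaclip-b32-400m"  # assumed checkpoint name, for illustration only
model = CLIPModel.from_pretrained(checkpoint)
processor = CLIPProcessor.from_pretrained(checkpoint)

# Example image and candidate labels; prompts could also be written in other languages.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)
labels = ["a photo of a cat", "a photo of a dog"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores, normalized into probabilities over the candidate labels.
probs = outputs.logits_per_image.softmax(dim=1)
print(dict(zip(labels, probs[0].tolist())))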

Source

Source: huggingface (https://huggingface.co/papers/2507.22062)

Source: substack (https://vladbogo.substack.com/p/metaclip-2-a-worldwide-scaling-recipe)

Contributors: wiki