"Ovis": difference between revisions


(Page created with "==under construction== == Definition == XXXXXXXXX == French == '''Ovis 2,5''' == English == '''Ovis 2,5''' An advanced multimodal large language model designed to process images at their native resolutions while incorporating reasoning capabilities. The model addresses two key limitations in current vision-language systems: the degradation caused by fixed-resolution image processing and the lack of reflective reasoning beyond simple chain-of-thought approac...")
 
No edit summary
Line 2: Line 2:


== Definition ==
XXXXXXXXX
Ovis (Open VISion) is a novel large-scale '''[[grand modèle de langues multimodal|multimodal large language model]]''' architecture designed to structurally align visual and textual '''[[Représentation sémantique distributionnelle compacte|embeddings]]'''.


== French ==
'''Ovis 2,5'''
'''Ovis'''


== English ==
'''Ovis 2,5'''
'''Ovis'''


An advanced multimodal large language model designed to process images at their native resolutions while incorporating reasoning capabilities. The model addresses two key limitations in current vision-language systems: the degradation caused by fixed-resolution image processing and the lack of reflective reasoning beyond simple chain-of-thought approaches.
''Ovis (Open VISion) is a novel Multimodal Large Language Model (MLLM) architecture designed to structurally align visual and textual embeddings.''
By eliminating the limitations of fixed-resolution image processing and incorporating self-corrective reasoning, Ovis2.5 achieves substantial improvements over previous models while maintaining efficiency through optimized training infrastructure.


== Source ==
== Sources ==
[https://github.com/AIDC-AI/Ovis Source: GitHub]


[https://huggingface.co/papers/2508.11737 Source: Hugging Face]

Revision as of 6 October 2025, 12:47

under construction

Definition

Ovis (Open VISion) is a novel large-scale multimodal large language model architecture designed to structurally align visual and textual embeddings.

French

Ovis

English

Ovis

Ovis (Open VISion) is a novel Multimodal Large Language Model (MLLM) architecture designed to structurally align visual and textual embeddings.

Sources

Source: GitHub

Source: Hugging Face

Contributors: Arianne Arel, wiki