Contributions by Pitpitt



20 August 2025

18 August 2025

  • 21:52, 18 August 2025 (diff | hist) −28 Architecture à vecteurs sémantiques joints (no edit summary)
  • 21:51, 18 August 2025 (diff | hist) −28 Attention éclair (no edit summary)
  • 21:50, 18 August 2025 (diff | hist) +12 Normalisation du gradient (no edit summary) (current)
  • 09:37, 18 August 2025 (diff | hist) +1,349 N DINO (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' DINO v3 ''' == Anglais == '''DINO v3''' A self-supervised model trained without the need for manual data annotations. The method leverages simple yet effective strategies to scale both dataset and model size, achieving state-of-the-art performance across a broad range of vision tasks without requiring fine-tuning. The paper presents a versatile vision foundation model that significantly outp... ») (current)
  • 09:36, 18 August 2025 (diff | hist) +1,101 N Mol-R1 (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' Mol-R1''' == Anglais == '''Mol-R1''' Mol-R1 framework enhances molecule discovery by improving reasoning performance and explainability through PRID and MoIA strategies. A framework that enhances the reasoning capabilities of large language models for molecule discovery. The work addresses the challenge of generating molecular structures from text descriptions while providing clear, step-by... »)

15 August 2025

  • 09:35, 15 August 2025 (diff | hist) +10 Catégorie:Publication (no edit summary)
  • 09:30, 15 August 2025 (diff | hist) +18 Catégorie:Publication (no edit summary)
  • 09:24, 15 August 2025 (diff | hist) +1,013 N R-Zero (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' R-Zero''' == Anglais == '''R-Zero''' A framework that enables large language models to improve their reasoning abilities without requiring any human-labeled training data. The method creates a self-evolving system where two AI models work together - one generates challenging questions while the other learns to solve them, creating an autonomous learning loop that starts from scratch. R-Z... »)
  • 09:23, 15 August 2025 (diff | hist) +617 N GLM (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''GLM-4.5''' '''GLM''' GLM-4.5 represents a significant advancement in creating unified AI models that excel across multiple domains. By combining efficient MoE architecture, multi-stage training, and expert model iteration, the paper demonstrates that a single model can achieve strong performance in agentic tasks, reasoning, and coding without requiring the mas... »)
  • 09:20, 15 August 2025 (diff | hist) +531 N Omni-Effects (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''Omni-Effects'''à A unified framework for generating customized visual effects (VFX) in videos. Unlike existing methods that require separate models for each effect, this approach can generate multiple visual effects simultaneously while providing precise spatial control over where each effect appears in the video. == Source == [https://huggingface.co/papers/... »)
  • 09:19, 15 August 2025 (diff | hist) +528 N WebWatcher (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''WebWatcher''' A multimodal AI agent designed for deep research tasks that handles both visual and textual understanding. While existing web agents excel at text-based research, they struggle with real-world scenarios that involve visual information like scientific diagrams, charts, or visually rich web interfaces. == Source == [https://huggingface.co/papers/2... »)

12 August 2025

  • 09:55, 12 August 2025 (diff | hist) +93 MetaCLIP (no edit summary) (current)
  • 09:52, 12 August 2025 (diff | hist) +1 MetaCLIP (no edit summary)
  • 09:52, 12 August 2025 (diff | hist) +905 N MetaCLIP (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' MetaCLIP''' == Anglais == '''MetaCLIP''' The first recipe for training CLIP models from scratch on worldwide web-scale image-text pairs spanning 300+ languages. The work addresses the challenge of scaling CLIP beyond English-only data while avoiding the "curse of multilinguality" - where multilingual models perform worse on English tasks than their English-only counterparts. The paper demonst... »)
  • 09:50, 12 August 2025 (diff | hist) +672 N LongVie (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' LongVie''' == Anglais == '''LongVie''' A framework for generating controllable long videos lasting up to one minute. The method addresses key challenges in extending video generation beyond short clips, specifically temporal inconsistency and visual degradation that occur when generating longer sequences. The paper proposes a multi-modal control approach that combines dense and sparse guidan... »)
  • 09:49, 12 August 2025 (diff | hist) +689 N PixNerd (page created with « ==en construction== == Définition == XXXXXXXXX == Français == ''' PixNerd''' == Anglais == '''PixNerd''' A novel approach to image generation that operates directly in pixel space rather than compressed latent representations. The method addresses limitations of current diffusion models that rely on variational autoencoders (VAEs), which can introduce artifacts and require complex two-stage training. By combining diffusion transformers with neural field re... »)

9 August 2025

6 August 2025

5 August 2025