Contributions by Pitpitt



15 August 2025

  • 08:35, 15 August 2025 (+10) Catégorie:Publication: no edit summary
  • 08:30, 15 August 2025 (+18) Catégorie:Publication: no edit summary
  • 08:24, 15 August 2025 (+1,013) N R-Zero: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' R-Zero''' == Anglais == '''R-Zero''' A framework that enables large language models to improve their reasoning abilities without requiring any human-labeled training data. The method creates a self-evolving system where two AI models work together - one generates challenging questions while the other learns to solve them, creating an autonomous learning loop that starts from scratch. R-Z...”
  • 08:23, 15 August 2025 (+617) N GLM: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''GLM-4.5''' '''GLM''' GLM-4.5 represents a significant advancement in creating unified AI models that excel across multiple domains. By combining efficient MoE architecture, multi-stage training, and expert model iteration, the paper demonstrates that a single model can achieve strong performance in agentic tasks, reasoning, and coding without requiring the mas...”
  • 08:20, 15 August 2025 (+531) N Omni-Effects: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''Omni-Effects''' A unified framework for generating customized visual effects (VFX) in videos. Unlike existing methods that require separate models for each effect, this approach can generate multiple visual effects simultaneously while providing precise spatial control over where each effect appears in the video. == Source == [https://huggingface.co/papers/...”
  • 08:19, 15 August 2025 (+528) N WebWatcher: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' XXXXXXXXX ''' == Anglais == '''WebWatcher''' A multimodal AI agent designed for deep research tasks that handles both visual and textual understanding. While existing web agents excel at text-based research, they struggle with real-world scenarios that involve visual information like scientific diagrams, charts, or visually rich web interfaces. == Source == [https://huggingface.co/papers/2...”

12 August 2025

  • 08:55, 12 August 2025 (+93) MetaCLIP: no edit summary (current)
  • 08:52, 12 August 2025 (+1) MetaCLIP: no edit summary
  • 08:52, 12 August 2025 (+905) N MetaCLIP: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' MetaCLIP''' == Anglais == '''MetaCLIP''' The first recipe for training CLIP models from scratch on worldwide web-scale image-text pairs spanning 300+ languages. The work addresses the challenge of scaling CLIP beyond English-only data while avoiding the "curse of multilinguality" - where multilingual models perform worse on English tasks than their English-only counterparts. The paper demonst...”
  • 08:50, 12 August 2025 (+672) N LongVie: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' LongVie''' == Anglais == '''LongVie''' A framework for generating controllable long videos lasting up to one minute. The method addresses key challenges in extending video generation beyond short clips, specifically temporal inconsistency and visual degradation that occur when generating longer sequences. The paper proposes a multi-modal control approach that combines dense and sparse guidan...”
  • 08:49, 12 August 2025 (+689) N PixNerd: page created with “==en construction== == Définition == XXXXXXXXX == Français == ''' PixNerd''' == Anglais == '''PixNerd''' A novel approach to image generation that operates directly in pixel space rather than compressed latent representations. The method addresses limitations of current diffusion models that rely on variational autoencoders (VAEs), which can introduce artifacts and require complex two-stage training. By combining diffusion transformers with neural field re...”

9 August 2025

6 August 2025

5 August 2025