"WEB-SHEPHERD": difference between revisions


== Sources ==
[https://arxiv.org/abs/2505.15277 Source: arxiv]

[https://chapinindustries.com/2025/05/31/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency/ Source: Chapin Industries]

[https://huggingface.co/papers/2505.15277 Source: huggingface]

[[Catégorie:vocabulary]]

Revision as of 3 June 2025, 19:04

Under construction

Definition

XXXXXXXXX

French

XXXXXXXXX

English

WEB-SHEPHERD

WEB-SHEPHERD is the first process reward model (PRM) designed specifically for web navigation. It evaluates a web agent's trajectory at each step, which is crucial for improving agent performance on long-horizon web tasks.

The method works in two main stages: checklist generation, followed by reward modeling with the checklist.
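
The two stages named above can be pictured with a toy sketch. Everything in it is an assumption made for illustration and does not come from the WEB-SHEPHERD sources: the `Step` structure, the rule of splitting an instruction on " and " to get a checklist, the keyword-based `keyword_judge`, and the choice of scoring a step as the fraction of checklist items completed so far. A real system would presumably use learned models for both stages.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a web navigation trajectory (hypothetical structure)."""
    action: str
    observation: str


def generate_checklist(instruction: str) -> List[str]:
    """Stage 1 (toy stand-in): split an instruction into subgoals.

    Splitting on ' and ' is only an assumption to keep the sketch runnable.
    """
    return [part.strip() for part in instruction.split(" and ") if part.strip()]


def keyword_judge(item: str, step: Step) -> bool:
    """Hypothetical judge: an item counts as met when its last word
    appears in the step's observation."""
    return item.split()[-1].lower() in step.observation.lower()


def step_rewards(
    checklist: List[str],
    trajectory: List[Step],
    judge: Callable[[str, Step], bool] = keyword_judge,
) -> List[float]:
    """Stage 2 (toy stand-in): score each step as the fraction of
    checklist items judged complete up to and including that step."""
    done = set()
    rewards = []
    for step in trajectory:
        for item in checklist:
            if judge(item, step):
                done.add(item)
        rewards.append(len(done) / len(checklist) if checklist else 0.0)
    return rewards


checklist = generate_checklist("open the cart and apply the coupon")
trajectory = [
    Step("click cart icon", "Your cart has 2 items"),
    Step("type code", "Coupon SAVE10 applied"),
]
rewards = step_rewards(checklist, trajectory)
print(checklist)
print(rewards)
```

The point of the sketch is only the shape of the pipeline: a checklist is produced once per task, and every step then receives its own score against that checklist instead of a single score at the end of the trajectory.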

Sources

Source: arxiv

Source: Chapin Industries

Contributors: Arianne, wiki