"WEB-SHEPHERD": difference between revisions
== Sources ==
[https://arxiv.org/abs/2505.15277 Source: arXiv]
[https://chapinindustries.com/2025/05/31/this-ai-paper-introduces-web-shepherd-a-process-reward-model-for-web-agents-with-40k-dataset-and-10x-cost-efficiency/ Source: Chapin Industries]
[https://
[[Catégorie:vocabulary]]
Revision as of 3 June 2025, 19:04
Under construction
Definition
XXXXXXXXX
French
XXXXXXXXX
English
WEB-SHEPHERD
WEB-SHEPHERD is the first process reward model (PRM) designed specifically for web navigation tasks. It evaluates a web agent's trajectory step by step, which is crucial for improving agent performance on long-horizon web tasks. The method works in two main stages: checklist generation, followed by reward modeling that uses the checklist.
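The two stages can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the names `Step`, `generate_checklist`, and `step_rewards` are hypothetical, and the string heuristics merely stand in for the learned models that a process reward model would use. It is not the method as implemented by the WEB-SHEPHERD authors.

```python
from dataclasses import dataclass


@dataclass
class Step:
    action: str       # what the agent did, e.g. "click 'Add to cart'"
    observation: str  # text summary of the page after the action


def generate_checklist(instruction: str) -> list[str]:
    """Stage 1 (toy stand-in): turn an instruction into a list of subgoals.

    Assumption: splitting on " then " is only a placeholder; a real system
    would generate the checklist with a learned model.
    """
    parts = instruction.split(" then ")
    return [part.strip().lower() for part in parts if part.strip()]


def step_rewards(checklist: list[str], trajectory: list[Step]) -> list[float]:
    """Stage 2 (toy stand-in): assign a reward to every step.

    The reward after each step is the fraction of checklist items whose text
    has appeared in any observation so far. Substring matching is an
    assumption standing in for a learned reward model.
    """
    done: set[str] = set()
    rewards: list[float] = []
    for step in trajectory:
        obs = step.observation.lower()
        # Mark every checklist item that this observation satisfies.
        done.update(item for item in checklist if item in obs)
        rewards.append(len(done) / len(checklist) if checklist else 0.0)
    return rewards
```

Calling `generate_checklist` on an instruction and passing the result to `step_rewards` along with a list of `Step` objects yields one score per step, so a step that makes no progress can be identified individually rather than only through the final outcome.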
