WEB-SHEPHERD


Revision dated 23 May 2025 at 11:04 by Pitpitt.

Under construction

Definition

XXXXXXXXX

French

XXXXXXXXX

English

WEB-SHEPHERD

WEB-SHEPHERD is the first process reward model (PRM) designed specifically for web navigation tasks. It addresses the challenge of evaluating web agent trajectories step by step, which is crucial for improving agent performance in long-horizon web tasks.
The method works in two main stages: checklist generation, followed by reward modeling with the checklist.
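
The two stages described above can be sketched as follows. This is a minimal illustration, not the WEB-SHEPHERD implementation: the names Step, generate_checklist, step_reward, score_trajectory and keyword_judge are hypothetical, the checklist is produced by naively splitting the instruction on " then ", and the judge is plain keyword matching standing in for a learned model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One step of a web navigation trajectory (hypothetical structure)."""
    observation: str  # text summary of the page after the action
    action: str       # action taken by the agent, e.g. "click('search')"


def generate_checklist(instruction: str) -> List[str]:
    """Stage 1 stand-in: turn the instruction into a list of subgoals.

    Assumption: a real system would use a language model here; splitting
    on ' then ' is only a placeholder so the sketch runs.
    """
    return [part.strip() for part in instruction.split(" then ") if part.strip()]


def step_reward(checklist: List[str], step: Step,
                is_satisfied: Callable[[str, Step], bool]) -> float:
    """Stage 2 stand-in: fraction of checklist items satisfied at this step."""
    if not checklist:
        return 0.0
    done = sum(1 for item in checklist if is_satisfied(item, step))
    return done / len(checklist)


def score_trajectory(instruction: str, trajectory: List[Step],
                     is_satisfied: Callable[[str, Step], bool]) -> List[float]:
    """Return one reward per step, i.e. a process-level signal."""
    checklist = generate_checklist(instruction)
    return [step_reward(checklist, step, is_satisfied) for step in trajectory]


def keyword_judge(item: str, step: Step) -> bool:
    """Toy judge: a subgoal is met if all its words occur in the observation."""
    text = step.observation.lower()
    return all(word in text for word in item.lower().split())


trajectory = [
    Step("home page", "click('search')"),
    Step("search laptops results", "click('first result')"),
    Step("search laptops results; add to cart confirmed", "stop()"),
]
rewards = score_trajectory("search laptops then add to cart", trajectory, keyword_judge)
print(rewards)  # [0.0, 0.5, 1.0]
```

Passing the judge as a function keeps the scoring loop independent of how each checklist item is assessed, so the keyword matcher could be swapped for a model-based judge without changing the rest of the sketch.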


Source

Source: Hugging Face

Contributors: Arianne, wiki