"WEB-SHEPHERD": difference between revisions

Revision as of 23 May 2025, 11:25

under construction

Definition

XXXXXXXXX

French

XXXXXXXXX

English

WEB-SHEPHERD

WEB-SHEPHERD is the first process reward model (PRM) designed specifically for web navigation tasks. It evaluates web agent trajectories step by step rather than only at the end of a task, which is crucial for improving agent performance on long-horizon web tasks.

The method works in two main stages: checklist generation, followed by reward modeling that scores each step against the checklist.
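The two stages can be illustrated with a toy sketch. This is not the WEB-SHEPHERD implementation: the names `Step`, `generate_checklist`, `score_step`, and `score_trajectory` are hypothetical, and both stages are replaced by simple string heuristics where a real system would presumably use a learned model. It only shows the data flow: an instruction becomes a checklist, and each step of a trajectory receives its own reward based on that checklist.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Step:
    action: str       # what the agent did at this step
    observation: str  # textual summary of the page state after the action


def generate_checklist(instruction: str) -> list[str]:
    # Stage 1 stand-in (assumption): split the instruction on " then ".
    # A real system would generate subgoals with a model instead.
    return [part.strip().lower() for part in instruction.split(" then ") if part.strip()]


def score_step(step: Step, checklist: list[str]) -> float:
    # Stage 2 stand-in (assumption): reward is the fraction of checklist
    # items whose text appears in the observation after this step.
    if not checklist:
        return 0.0
    text = step.observation.lower()
    completed = sum(1 for item in checklist if item in text)
    return completed / len(checklist)


def score_trajectory(instruction: str, steps: list[Step]) -> list[float]:
    # One reward per step, which is what makes this a process-level
    # score rather than a single outcome-level score.
    checklist = generate_checklist(instruction)
    return [score_step(step, checklist) for step in steps]


if __name__ == "__main__":
    trajectory = [
        Step("click cart icon", "open cart: done"),
        Step("submit coupon code", "open cart: done, apply coupon: done"),
    ]
    print(score_trajectory("open cart then apply coupon", trajectory))
```

The per-step list is the point of the sketch: a trajectory that stalls partway through still gets informative intermediate scores, which is what a step-level reward is meant to provide on long-horizon tasks.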

Source

Source: huggingface

@@ Line 17: / Line 17: @@
 == Source ==
-[https://huggingface.co/papers/2505.15277?utm_source=substack&utm_medium=email   Source : huggingface]
+[https://huggingface.co/papers/2505.15277  Source : huggingface]
 [[Catégorie:vocabulary]]
