Update README.md
README.md
CHANGED
@@ -13,11 +13,11 @@ pinned: false
 
 ## CREW-Agents
 
-| Agent | Occupation | Scale | What It Tests | Verifiers |
-|-------|--------|-------|------------|--------|
-| **[Fin Agent](link)** | Credit analyst | 2,610 tasks, 26K+ PDFs | Multiple document reasoning → taxonomy aware transaction categorization → Business P&L construction | Binary pass/fail |
-| **[Enterprise Knowledge Agent](link)** |
-| **[Front-end Agent](link)** | Senior Frontend engineer | 37 tasks, 147 expert preferences | Figma environment navigation → design system creation → build verification | Subjective on output preference |
+| Agent | Occupation | Complexity | Scale | What It Tests | Verifiers |
+|-------|--------|-------|-------|------------|--------|
+| **[Fin Agent](link)** | Credit analyst | 32+ expert hours | 2,610 tasks, 26K+ PDFs | Multiple document reasoning → taxonomy aware transaction categorization → Business P&L construction | Binary pass/fail |
+| **[Enterprise Knowledge Agent](link)** | Senior business analyst | 16+ expert hours | 1,220 pitch-deck tasks, 45 video tasks, 279 preference pairs | Source faithfulness → narrative arc based story-telling → design coherence | Precision, Recall on citation. Subjective on video preference |
+| **[Front-end Agent](link)** | Senior Frontend engineer | 60-100 expert hours | 37 tasks, 147 expert preferences | Figma environment navigation → design system creation → build verification | Subjective on output preference |
 
 ## Leaderboard
 