Dhaval Patel

DhavalPatel

dhaval-patel-2b287033

AI & ML interests

None yet

Recent Activity

New activity in ibm-research/AssetOpsBench 2 days ago

Update data/scenarios/all_utterance.jsonl

#12 opened 2 days ago by shuxinl

upvoted an article 3 days ago

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • 3 days ago • 12

upvoted a paper 3 days ago

Beyond Final Answers: Auditing Trajectory-Level Hallucinations in Multi-Agent Industrial Workflows

Paper • 2605.24219 • Published 5 days ago • 7

submitted a paper to Daily Papers 3 days ago

Beyond Final Answers: Auditing Trajectory-Level Hallucinations in Multi-Agent Industrial Workflows

Paper • 2605.24219 • Published 5 days ago • 7

commented on 2 papers 4 days ago

Results and Retrospective Analysis of the CODS 2025 AssetOpsBench Challenge

Paper • 2605.08518 • Published 23 days ago • 11

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

Paper • 2605.20630 • Published 11 days ago • 12

authored 3 papers 9 days ago

DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic Rules

Paper • 2605.08614 • Published 22 days ago • 7

Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds

Paper • 2605.18827 • Published 19 days ago • 7

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

Paper • 2605.20630 • Published 11 days ago • 12

upvoted a paper 9 days ago

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

Paper • 2605.20630 • Published 11 days ago • 12

submitted a paper to Daily Papers 9 days ago

Evaluating Temporal Semantic Caching and Workflow Optimization in Agentic Plan-Execute Pipelines

Paper • 2605.20630 • Published 11 days ago • 12

submitted a paper to Daily Papers 10 days ago

Code-Guided Reasoning for Small Language Models: Evaluating Executable MCQA Scaffolds

Paper • 2605.18827 • Published 19 days ago • 7

authored 5 papers 12 days ago

SPIRAL: Symbolic LLM Planning via Grounded and Reflective Search

Paper • 2512.23167 • Published Dec 29, 2025 • 1

IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance

Paper • 2604.23446 • Published Apr 25 • 4

MCP-Cosmos: World Model-Augmented Agents for Complex Task Execution in MCP Environments

Paper • 2605.09131 • Published 22 days ago • 57

Results and Retrospective Analysis of the CODS 2025 AssetOpsBench Challenge

Paper • 2605.08518 • Published 23 days ago • 11

SPIN: Structural LLM Planning via Iterative Navigation for Industrial Tasks

Paper • 2605.14051 • Published 18 days ago • 1

upvoted a paper 12 days ago

DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic Rules

Paper • 2605.08614 • Published 22 days ago • 7

submitted a paper to Daily Papers 12 days ago

DiagnosticIQ: A Benchmark for LLM-Based Industrial Maintenance Action Recommendation from Symbolic Rules

Paper • 2605.08614 • Published 22 days ago • 7

liked a dataset 15 days ago

rohithkanathur/assetopsbench-transformer-scenarios-dataset

Viewer • Updated 22 days ago • 50 • 90 • 1
