Agent Runs / Eval
Lumi · IT market · demo workspace

Every agent run is observable, validated against a schema, and scored on a rubric, with a human in the loop where it counts.
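The validate-then-route step described above could look roughly like the sketch below. It is only an illustration: the schema fields (`persona_name`, `pain_points`, `confidence`), the helper names, and the rule that a failed check maps to "Needs Review" are all assumptions, not taken from the product.

```python
from dataclasses import dataclass

# Hypothetical schema: required field name -> expected Python type.
PERSONA_SCHEMA = {"persona_name": str, "pain_points": list, "confidence": float}


@dataclass
class ValidationResult:
    passed: bool
    errors: list[str]


def validate_output(output: dict, schema: dict[str, type]) -> ValidationResult:
    """Check that every required field is present and has the expected type."""
    errors = []
    for field, expected in schema.items():
        if field not in output:
            errors.append(f"missing field: {field}")
        elif not isinstance(output[field], expected):
            errors.append(f"wrong type for {field}: expected {expected.__name__}")
    return ValidationResult(passed=not errors, errors=errors)


def route(result: ValidationResult) -> str:
    """Assumed rule: schema-valid output passes, anything else goes to a human."""
    return "Passed" if result.passed else "Needs Review"


# Made-up sample outputs, for illustration only.
good = {"persona_name": "example", "pain_points": ["example"], "confidence": 0.8}
bad = {"persona_name": "example", "confidence": "high"}

print(route(validate_output(good, PERSONA_SCHEMA)))  # Passed
print(route(validate_output(bad, PERSONA_SCHEMA)))   # Needs Review
```

A real deployment would more likely use a dedicated schema library; the plain dictionary check here just keeps the sketch self-contained.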

Total runs: 8 (this cycle)
Succeeded: 6 (schema-valid output)
Needs review: 1 (awaiting a human)
Avg eval score: 83/100 (6 evaluated)
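The summary figures can be cross-checked against the per-run data on this page. The sketch below assumes the average is a plain arithmetic mean of the six overall scores in the Evaluations cards, rounded to the nearest integer; the page does not state how it is computed, but the numbers are consistent with that reading.

```python
from collections import Counter

# Status of each of the 8 runs, copied from the Agent Runs table.
statuses = [
    "Succeeded", "Succeeded", "Succeeded", "Needs Review",
    "Succeeded", "Succeeded", "Running", "Succeeded",
]

# Overall scores of the 6 evaluated runs, copied from the Evaluations cards.
scores = [74, 79, 84, 89, 94, 77]

counts = Counter(statuses)
total_runs = len(statuses)
avg_score = sum(scores) / len(scores)  # 497 / 6

print(total_runs, counts["Succeeded"], counts["Needs Review"])  # 8 6 1
print(round(avg_score))  # 83
```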

Agent Runs

Tool calls, validation, latency, tokens and cost per run

8 runs
Agent | Task | Status | Validation | Prompt | Duration | Tokens | Cost | When
BuyerPersonaAgent | Cluster Personas | Succeeded | Passed | BuyerPersonaAgent@v1.0 | 1.8s | 4,200 | US$0.04 | today
CreativeStrategyAgent | Generate Angles | Succeeded | Passed | CreativeStrategyAgent@v2.1 | 2.1s | 4,850 | US$0.05 | yesterday
CopywritingAgent | Write Hooks | Succeeded | Needs Review | CopywritingAgent@v3.2 | 2.4s | 5,500 | US$0.06 | 2 days ago
AdsManagerAgent | Draft Campaign | Needs Review | Needs Review | AdsManagerAgent@v1.3 | 2.8s | 6,150 | US$0.08 | 3 days ago
ComplianceAgent | Check Claims | Succeeded | Passed | ComplianceAgent@v2.4 | 3.1s | 6,800 | US$0.09 | 4 days ago
AnalyticsAgent | Extract Learnings | Succeeded | Passed | AnalyticsAgent@v3.5 | 3.4s | 7,450 | US$0.10 | 5 days ago
TrendRadarAgent | Scan Trends | Running | - | TrendRadarAgent@v1.6 | 3.7s | 8,100 | US$0.11 | today
WarRoomAgent | Daily Recommendations | Succeeded | Passed | WarRoomAgent@v2.7 | 4.0s | 8,750 | US$0.12 | yesterday
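Per-run latency, tokens and cost can be rolled up into cycle totals. The sketch below is one way to do that, assuming each row's cells read as duration in seconds, token count, and cost in US$ (for example 1.8 s, 4,200 tokens, US$0.04 for BuyerPersonaAgent). `Decimal` is used so the currency sum stays exact.

```python
from decimal import Decimal

# (agent, duration in seconds, tokens, cost in US$), transcribed from the table.
runs = [
    ("BuyerPersonaAgent", Decimal("1.8"), 4200, Decimal("0.04")),
    ("CreativeStrategyAgent", Decimal("2.1"), 4850, Decimal("0.05")),
    ("CopywritingAgent", Decimal("2.4"), 5500, Decimal("0.06")),
    ("AdsManagerAgent", Decimal("2.8"), 6150, Decimal("0.08")),
    ("ComplianceAgent", Decimal("3.1"), 6800, Decimal("0.09")),
    ("AnalyticsAgent", Decimal("3.4"), 7450, Decimal("0.10")),
    ("TrendRadarAgent", Decimal("3.7"), 8100, Decimal("0.11")),
    ("WarRoomAgent", Decimal("4.0"), 8750, Decimal("0.12")),
]

total_tokens = sum(tokens for _, _, tokens, _ in runs)
total_cost = sum(cost for _, _, _, cost in runs)
mean_duration = sum(duration for _, duration, _, _ in runs) / len(runs)

print(f"tokens: {total_tokens:,}")           # tokens: 51,800
print(f"cost: US${total_cost}")              # cost: US$0.65
print(f"mean duration: {mean_duration}s")    # mean duration: 2.9125s
```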

Evaluations

Rubric-scored output with human-edit tracking

BuyerPersonaAgent
Human-edited
Score: 74/100
Schema Validity: 9/10
Grounding: 7/10
Actionability: 7/10

Human tightened hooks, kept structure.

CreativeStrategyAgent
As-is
Score: 79/100
Schema Validity: 10/10
Grounding: 8/10
Actionability: 8/10

Accepted as-is.

CopywritingAgent
Human-edited
Score: 84/100
Schema Validity: 9/10
Grounding: 9/10
Actionability: 9/10

Human tightened hooks, kept structure.

ComplianceAgent
As-is
Score: 89/100
Schema Validity: 10/10
Grounding: 7/10
Actionability: 7/10

Accepted as-is.

AnalyticsAgent
Human-edited
Score: 94/100
Schema Validity: 9/10
Grounding: 8/10
Actionability: 8/10

Human tightened hooks, kept structure.

WarRoomAgent
As-is
Score: 77/100
Schema Validity: 10/10
Grounding: 9/10
Actionability: 9/10

Accepted as-is.

Agent Roster

25 specialised agents — 8 active this cycle

MarketResearchAgent: Idle
VoiceOfCustomerAgent: Idle
BrandStrategyAgent: Idle
ProductMarketingAgent: Idle
BlueOceanAgent: Idle
BuyerPersonaAgent: Active
OfferAgent: Idle
ExperimentationAgent: Idle
CreativeStrategyAgent: Active
CopywritingAgent: Active
ContentStrategyAgent: Idle
SocialOpsAgent: Idle
AIInfluencerAgent: Idle
AdsManagerAgent: Active
CommunityAgent: Idle
FunnelAgent: Idle
LocalizationAgent: Idle
CompetitiveRadarAgent: Idle
TrendRadarAgent: Active
AnalyticsAgent: Active
AttributionAgent: Idle
CostControlAgent: Idle
ComplianceAgent: Active
RightsAgent: Idle
WarRoomAgent: Active