GitHub and Papers With CodeFavored runnable, maintained code. Rejected social demos without setup instructions or recent source evidence.
arXivFlagged Workflow-to-Skill and tool-call uncertainty as strategically relevant, but kept them on watchlist until code is verified.
Product Hunt, X, and YouTubeWarned that agent launches are often wrapper-heavy. No live platform verification was available, so hype risk was penalized.
HN and RedditHN provided early signals, Reddit was not live-verified. The credibility critique focused on sandboxing, auditability, and failure modes.
Newsletter and blog scoutPushed the strategic stack: access, orchestration, knowledge, and evals. This shaped the final ranking.
Final council rulingMove beyond generic chatbots. Build governed, tool-connected workflows and prove them with evals before expanding.