
cre-verify-me

VerifyMe Oracle is a decentralized fact-verification oracle that uses 3 independent LLMs (OpenAI, Gemini, Claude) with 2-of-3 consensus via Chainlink CRE to verify any claim posted on Moltbook (submolt /m/verify-me-test).

Track: Autonomous agents + CRE and AI

What it is:
 VerifyMe Oracle is a fully autonomous, decentralized fact-verification system built on Chainlink CRE for the Convergence Hackathon 2026 (Agents Track). It applies the same
 multi-source consensus principle that powers Chainlink price feeds — multiple independent sources plus deterministic aggregation — but instead of price data, it verifies
 natural-language factual claims.

 An OpenClaw autonomous agent runs a scheduled cron job every 30 minutes, scanning the verify-me-test submolt on Moltbook (a social network for AI agents) for new posts
 containing factual claims. Each run processes at most one claim to stay within rate limits and resource budgets. When a new claim is detected from another agent (the cron
 explicitly skips self-authored posts to avoid verification loops), the agent extracts the verifiable statement, performs web research to gather supporting evidence, and
 triggers a Chainlink CRE workflow. The workflow queries three independent LLM providers — OpenAI gpt-4o-mini, Google Gemini 2.0 Flash, and Anthropic Claude 3.5 Haiku — with
  identical prompts containing the claim and the gathered evidence. Each model independently returns a verdict (TRUE, FALSE, or UNVERIFIABLE) along with a confidence score.
 The workflow then computes a 2-of-3 majority decision: if at least two models agree, their verdict becomes the final answer and their confidence scores are averaged. If no
 majority exists, the claim is marked UNVERIFIABLE and flagged as rejected. The final result is ABI-encoded, cryptographically signed through CRE's report pipeline, and
 written onchain to a custom VerifyMeConsumer contract on Sepolia via Chainlink's Forwarder. Once settled, the agent reads the onchain result and delivers a detailed
 verification comment back to the original Moltbook post — including the verdict, confidence percentage, model agreement ratio, reasoning, source citations, and the Sepolia
 transaction hash as tamper-proof evidence. The run status is then announced via Telegram to the human operator.

 The entire loop — from claim detection through evidence gathering, multi-LLM verification, onchain settlement, and public response — runs autonomously without human
 intervention, repeating every 30 minutes. The cron job (verifyme-test-scan) runs each execution in isolated session mode: a clean context with no state leakage between
 runs, sending a short no-op status if there are no new claims to process.
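The per-run selection described above can be sketched as follows. The `Post` shape, field names, and `selectClaim` are illustrative assumptions, not Moltbook's actual API schema:

```typescript
interface Post {
  id: string;        // illustrative fields; Moltbook's real schema may differ
  author: string;
  content: string;
  createdAt: number; // epoch millis
}

// Pick at most one unprocessed claim per run, skipping the oracle's own posts
// to avoid self-verification loops. Returns null when there is nothing to do,
// in which case the run ends with a short no-op status.
function selectClaim(
  posts: Post[],
  selfHandle: string,
  processedIds: Set<string>,
): Post | null {
  const candidates = posts
    .filter((p) => p.author !== selfHandle)
    .filter((p) => !processedIds.has(p.id))
    .sort((a, b) => a.createdAt - b.createdAt); // oldest unprocessed first
  return candidates[0] ?? null;
}
```

Processing only `candidates[0]` is what keeps each run within the one-claim budget; the rest are naturally picked up by later runs.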

 What problem it solves:
 Autonomous agents are increasingly making decisions based on factual claims they encounter — whether in social feeds, market signals, or cross-agent communication. But
 trusting a single LLM to verify facts is fragile: LLMs hallucinate, generating confident but factually incorrect responses. A single model might say "yes, NASA confirmed
 liquid water on Mars in 2025" with 90% confidence when the claim is actually false. When these hallucinated signals feed downstream agents or smart contracts with
 financial consequences, the result is propagated misinformation or lost funds.

 VerifyMe Oracle solves this by requiring cross-provider agreement. Three LLMs from three different companies (OpenAI, Google, Anthropic), trained on different data with
 different architectures and different failure modes, must independently evaluate the same claim against the same evidence. A false answer would need to fool at least two
 of three diverse models simultaneously, which is far harder than fooling one. The DON-level execution ensures that the aggregation itself is deterministic
 and tamper-resistant (not controlled by any single node), and the onchain settlement provides an immutable, publicly auditable record that any downstream agent or contract
 can trust without needing to re-verify the claim itself.
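To make the "harder to fool" intuition concrete: under the (strong) simplifying assumption that the three models err independently at the same rate p, the probability that the 2-of-3 majority is wrong is 3p^2(1 - p) + p^3:

```typescript
// P(majority wrong) assuming independent, identically distributed per-model
// error rate p: exactly two wrong (3 ways) plus all three wrong.
function majorityErrorRate(p: number): number {
  return 3 * p ** 2 * (1 - p) + p ** 3;
}
```

For p = 0.1 this gives about 0.028, i.e. a 10% per-model error rate drops to roughly 2.8% for the majority. In practice the three models share training data and failure modes, so their errors are correlated and the real improvement is smaller — which is exactly why the system picks providers with different architectures and training corpora.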

 How it works, step by step:

 1. Periodic claim monitoring via OpenClaw cron — An OpenClaw cron job (verifyme-test-scan) fires every 30 minutes in isolated session mode. Each run fetches recent posts
 from the verify-me-test submolt via the Moltbook API (GET /api/v1/posts?submolt=verify-me-test&sort=new). It filters out posts authored by verifymeoracle (its own account)
 to prevent self-verification loops, and identifies posts it hasn't already processed. At most one claim is processed per run to respect Moltbook's new-agent rate limits. If
  nothing needs verification, the agent sends a short no-op status and the cron reports back via Telegram.
 2. Claim extraction — For the selected post, the agent analyzes the content to identify a verifiable factual statement. It uses heuristics to distinguish facts ("NASA
 confirmed X in 2025") from opinions ("I think AI is overhyped") or questions ("Is Bitcoin a good investment?"). If no verifiable claim is found, the agent posts a comment
 explaining why verification isn't applicable, and skips CRE invocation entirely.
 3. Evidence collection — The agent performs 2-3 targeted web searches per claim to gather supporting or refuting evidence from authoritative sources. This research happens
 outside the CRE workflow to preserve the workflow's limited HTTP budget for the core verification task.
 4. CRE workflow trigger — The agent writes the query, evidence, and category into the workflow's config and executes cre workflow simulate --broadcast, which compiles the
 TypeScript workflow to WASM and runs it in a simulated DON environment with real HTTP calls and real onchain writes.
 5. Three-LLM evaluation — Inside the CRE workflow, three parallel HTTP calls query each LLM provider with an identical fact-verification prompt. All models receive the same
  claim text and evidence, use temperature 0 for maximum determinism, and are instructed to return a strict JSON response with answer (TRUE/FALSE/UNVERIFIABLE), confidence
 (0-100), and keyFacts.
 6. DON-level consensus aggregation — ConsensusAggregationByFields ensures Byzantine fault tolerance across DON nodes: identical() is applied to verdict fields (all nodes
 must agree on what each model said) and median() to confidence fields (robust to outliers).
 7. 2-of-3 majority vote — The workflow counts votes: if two or more models agree on a verdict, that becomes the final answer. The final confidence is the average of
 agreeing models' scores. If fewer than two models responded successfully, or if no majority exists, the claim is marked UNVERIFIABLE with rejected: true.
 8. Onchain settlement — The result is ABI-encoded as (bytes32 queryHash, string answer, uint8 confidence, uint8 modelCount, bool rejected), signed via CRE's report pipeline
  (ECDSA + Keccak256), and written to the VerifyMeConsumer contract on Sepolia through EVMClient.writeReport. The contract validates the verdict domain and confidence bounds
  before storing.
 9. Response publication — The agent parses the CRE simulation output, extracts the transaction hash and consensus result, and posts a structured verification comment back
 to the original Moltbook post. The comment includes the verdict, confidence, model count (e.g., "3/3 agreed"), evidence sources, and a direct link to the Sepolia
 transaction on Etherscan.
 10. Telegram delivery and next cycle — The cron job's delivery mode announces the run result to the operator via Telegram (status: ok/error, duration). The scheduler then
 queues the next run in 30 minutes. The cycle repeats indefinitely.
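The 2-of-3 vote in step 7 can be sketched as follows; this is a simplified stand-in for the workflow's actual code, and `aggregate` and the type names are illustrative:

```typescript
type Verdict = "TRUE" | "FALSE" | "UNVERIFIABLE";

interface ModelResult {
  verdict: Verdict;
  confidence: number; // 0-100
}

interface ConsensusResult {
  answer: Verdict;
  confidence: number;
  modelCount: number;
  rejected: boolean;
}

// 2-of-3 majority: if at least two models agree, their verdict wins and their
// confidence scores are averaged. With fewer than two successful responses,
// or no majority, the claim is marked UNVERIFIABLE and rejected.
function aggregate(results: ModelResult[]): ConsensusResult {
  if (results.length >= 2) {
    const byVerdict = new Map<Verdict, ModelResult[]>();
    for (const r of results) {
      byVerdict.set(r.verdict, [...(byVerdict.get(r.verdict) ?? []), r]);
    }
    for (const [verdict, group] of byVerdict) {
      if (group.length >= 2) {
        const avg = Math.round(
          group.reduce((sum, r) => sum + r.confidence, 0) / group.length,
        );
        return { answer: verdict, confidence: avg, modelCount: results.length, rejected: false };
      }
    }
  }
  return { answer: "UNVERIFIABLE", confidence: 0, modelCount: results.length, rejected: true };
}
```

A three-way split (TRUE / FALSE / UNVERIFIABLE) therefore rejects rather than guessing, matching the fail-safe behavior described in step 7.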

Architecture

The system has four main layers, each built with different technology suited to its role:

 1. CRE Workflow (verify-me-workflow/main.ts, ~400 lines, TypeScript compiled to WASM):
 The core verification engine runs on Chainlink's CRE infrastructure. Written in TypeScript using @chainlink/cre-sdk v1.1.3, it compiles to WASM via QuickJS/Javy for
 execution on DON nodes. The workflow handles secret retrieval (runtime.getSecret for API keys), parallel HTTP calls to three LLM providers, robust JSON parsing with regex
 extraction (LLMs don't always return clean JSON), field-based DON consensus aggregation (ConsensusAggregationByFields with identical for verdicts and median for
 confidence), ABI encoding via viem, signed report generation (runtime.report with ECDSA + Keccak256), and onchain write (EVMClient.writeReport). A flat struct pattern
 (ThreeLLMResults) is used instead of nested objects because CRE's field aggregation requires top-level fields. A closure pattern (makeQueryLLMs) safely injects secrets into
  the HTTP call logic. Dependencies are managed with Bun.

 2. Smart Contracts (contracts/, Solidity ^0.8.24):
 VerifyMeConsumer inherits from Chainlink's ReceiverTemplate (which implements the IReceiver interface and validates the Forwarder address). On receiving a CRE report via
 onReport(metadata, report), it ABI-decodes the payload, validates that the answer is exactly one of the three allowed values (TRUE, FALSE, UNVERIFIABLE) and that confidence
  is within 0-100, stores the result in a mapping keyed by queryHash, and emits a ClaimVerified event. Deployed on Sepolia using a custom viem-based script
 (deploy-verifyme-consumer.mjs) — the project does not use Hardhat.

 3. OpenClaw Agent with Scheduled Cron Job (skills/, OpenClaw runtime):
 The autonomous agent runs on OpenClaw with a cron job (verifyme-test-scan) configured to fire every 30 minutes (everyMs: 1800000). Each execution runs in isolated session
 mode — a clean context with no state carried over between runs — and invokes two skills. The verifyme-oracle-skill implements the verification pipeline: fetch posts from
 Moltbook's API, filter self-authored content, extract verifiable claims, gather web evidence via web_search (2-3 queries per claim), invoke the CRE workflow by writing
 config and running cre workflow simulate --broadcast, parse the simulation output for tx hash and consensus result, and post the verification comment. The moltbook-skill
 handles Moltbook API integration: authentication, posting, commenting, rate limit compliance (1 post/30min, 1 comment/20sec, 50 comments/day), and solving the platform's
 anti-spam verification challenges. Run results are delivered to the operator via Telegram (delivery.mode: "announce", delivery.channel: "telegram"). The cron processes at
 most 1 claim per run to respect new-agent limits and keep execution duration manageable (~14 seconds per run based on observed metrics).

 4. Configuration and Infrastructure:
 project.yaml defines Sepolia RPC endpoints and target settings. secrets.yaml maps CRE secret IDs to environment variables (OPENAI_API_KEY -> OPENAI_API_KEY_ALL, etc.).
 config.staging.json holds the current claim, evidence, category, consumer address, and gas limit — updated dynamically by the agent before each CRE invocation. The workflow
is triggered by the agent's 30-minute scheduled job.
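The "robust JSON parsing with regex extraction" mentioned for layer 1 can be sketched like this; `parseLLMVerdict` is an illustrative stand-in for the workflow's actual parser:

```typescript
interface LLMVerdict {
  answer: "TRUE" | "FALSE" | "UNVERIFIABLE";
  confidence: number; // 0-100
  keyFacts: string[];
}

// LLMs sometimes wrap their JSON in prose or markdown fences even at
// temperature 0. Extract the outermost {...} span, parse it, and validate
// the verdict domain and confidence bounds before trusting the result.
function parseLLMVerdict(raw: string): LLMVerdict | null {
  const match = raw.match(/\{[\s\S]*\}/); // first "{" to last "}" (greedy)
  if (!match) return null;
  try {
    const obj = JSON.parse(match[0]);
    const allowed = ["TRUE", "FALSE", "UNVERIFIABLE"];
    if (!allowed.includes(obj.answer)) return null;
    if (typeof obj.confidence !== "number" || obj.confidence < 0 || obj.confidence > 100) {
      return null;
    }
    return {
      answer: obj.answer,
      confidence: obj.confidence,
      keyFacts: Array.isArray(obj.keyFacts) ? obj.keyFacts : [],
    };
  } catch {
    return null; // malformed JSON counts as a failed model response
  }
}
```

Returning null on any malformed or out-of-domain response feeds cleanly into the majority vote: a model that fails to produce valid JSON simply doesn't count toward the 2-of-3 threshold.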

Created by

  • Diego Valdeolmillos