Pepclaw

Overview

Pepclaw is a research swarm.

Pepclaw is an autonomous research swarm for nonclinical peptide discovery. It runs missions — bounded, single-question research jobs — and emits dossiers — buyer-safe, citation-anchored deliverables.

Every mission runs through a 5-layer DAG of agent pools. Layer 1 ingests evidence; layer 2 annotates and scouts novelty; layer 3 grades evidence from A to X; layer 4 reasons and critiques; layer 5 synthesizes the dossier. The dossier never claims more than the evidence allows.
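
The layer ordering described above can be sketched as a small dependency graph. This is a minimal illustration, not Pepclaw's scheduler: the layer names are hypothetical, and treating the five layers as a strict chain is an assumption, since the source does not document the edges between individual pools.

```python
from graphlib import TopologicalSorter

# Hypothetical layer names. Each key maps to the set of layers it depends on.
# A strict chain is an assumption; the real DAG may have finer-grained edges.
LAYERS = {
    "ingest": set(),                    # layer 1: ingest evidence
    "annotate_scout": {"ingest"},       # layer 2: annotate and scout novelty
    "grade": {"annotate_scout"},        # layer 3: grade evidence A to X
    "reason_critique": {"grade"},       # layer 4: reason and critique
    "synthesize": {"reason_critique"},  # layer 5: synthesize the dossier
}


def layer_order(graph):
    """Return the layers in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for position, layer in enumerate(layer_order(LAYERS), start=1):
        print(position, layer)
```

Expressing the layers as a graph rather than a fixed list keeps the sketch valid if a layer later depends on more than one predecessor.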

Architecture

The 12 agent pools.

Literature Miner (upstream · literature_miner)
PubMed E-utilities. Pulls peer-reviewed papers with full PMID provenance and structured abstracts.
Sequence & Structure (upstream · sequence_structure)
UniProt + PDB + AlphaFold. Annotates peptide targets, structural confidence, lipidation tolerance.
Target & Pathway (upstream · target_pathway)
OpenTargets GraphQL + Reactome. Maps targets to diseases and mechanism-of-action pathways.
Variant Linker (upstream · variant_linker)
ChEMBL REST. Links targets to known small molecules and binding-assay endpoints.
ADMET Developability (upstream · admet_developability)
Heuristic ADMET scorecards: solubility, half-life, peptide developability, off-target risk.
Novelty Scout (reasoning · novelty_scout)
Whitespace detector. Scores each finding for novelty against historical mission corpus.
Patent Competitive (reasoning · patent_competitive)
Patent landscape and freedom-to-operate surveillance via Lens.org-compatible signals.
Thesis Generator (reasoning · thesis_generator)
Composes structured, falsifiable hypotheses with evidence-cited mechanism claims.
Evidence Grader (reasoning · evidence_grader)
Grades each finding A/B/C/D/X using study type, replication and reporting strength.
Red Team (reasoning · red_team)
Three personas: Skeptic, Scientist and Senior Reviewer. Only the Senior Reviewer can issue a hard block.
Synthesizer (output · synthesizer)
Consolidates cross-pool findings and theses into the synthesis document.
Dossier Assembler (output · dossier_assembler)
Buyer-safe markdown dossier with PMID citations, hedged claims and full evidence chain.
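
For reference, the twelve pools listed above can be captured as a small registry. This is only a sketch of one possible data structure: the `Pool` class and the `pools_by_stage` helper are hypothetical names, while the pool names, stage tags and identifiers are copied from the list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Pool:
    name: str
    stage: str    # "upstream", "reasoning" or "output", as tagged in the list
    pool_id: str  # identifier shown next to each pool name


POOLS = (
    Pool("Literature Miner", "upstream", "literature_miner"),
    Pool("Sequence & Structure", "upstream", "sequence_structure"),
    Pool("Target & Pathway", "upstream", "target_pathway"),
    Pool("Variant Linker", "upstream", "variant_linker"),
    Pool("ADMET Developability", "upstream", "admet_developability"),
    Pool("Novelty Scout", "reasoning", "novelty_scout"),
    Pool("Patent Competitive", "reasoning", "patent_competitive"),
    Pool("Thesis Generator", "reasoning", "thesis_generator"),
    Pool("Evidence Grader", "reasoning", "evidence_grader"),
    Pool("Red Team", "reasoning", "red_team"),
    Pool("Synthesizer", "output", "synthesizer"),
    Pool("Dossier Assembler", "output", "dossier_assembler"),
)


def pools_by_stage(stage):
    """Return the identifiers of every pool tagged with the given stage."""
    return [pool.pool_id for pool in POOLS if pool.stage == stage]


if __name__ == "__main__":
    print(Counter(pool.stage for pool in POOLS))
    print(pools_by_stage("output"))
```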

Protocol

Commit / reveal — tamper-evident questions.

Before any agent runs, Pepclaw computes:

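import { createHash, randomBytes } from "node:crypto";

const salt = randomBytes(16).toString("hex"); // 16 random bytes, hex-encoded
const message = JSON.stringify({
  query: "...",
  target_class: "...",
  schema: "pepclaw.commit.v1",
  salt,
});

// public, written to mission row immediately
const commit_hash = createHash("sha256").update(message).digest("hex");
// private until the mission completes
const commit_salt = salt;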

On completion, Pepclaw publishes the salt. Anyone can then re-hash the original question together with the salt and compare the result with the commit hash, which shows that the question was fixed before any agent ran and was not changed afterward. If a mission is aborted before completion, the salt remains sealed and the commit hash stays as a public commitment that no answer was ever delivered for that question.
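
A verifier for this scheme might look like the sketch below. It assumes the committed message is the compact `JSON.stringify` output with keys in the order shown above, encoded as UTF-8, and that the hash is published as lowercase hex; none of these details are confirmed by the source. The function names are hypothetical.

```python
import hashlib
import hmac
import json
import secrets

SCHEMA = "pepclaw.commit.v1"


def build_message(query, target_class, salt_hex):
    """Serialize the commit payload the way JSON.stringify would (assumed)."""
    payload = {
        "query": query,
        "target_class": target_class,
        "schema": SCHEMA,
        "salt": salt_hex,
    }
    # Compact separators and raw non-ASCII characters mimic JSON.stringify
    # for ordinary text; dicts keep insertion order, so key order is stable.
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)


def sha256_hex(message):
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


def commit(query, target_class):
    """Return (commit_hash, salt). The salt stays private until completion."""
    salt_hex = secrets.token_hex(16)  # 16 random bytes, hex-encoded
    return sha256_hex(build_message(query, target_class, salt_hex)), salt_hex


def verify(query, target_class, salt_hex, commit_hash):
    """Recompute the hash from the revealed salt and compare it to the commit."""
    recomputed = sha256_hex(build_message(query, target_class, salt_hex))
    return hmac.compare_digest(recomputed, commit_hash)


if __name__ == "__main__":
    commit_hash, salt = commit("example question", "example class")
    print(verify("example question", "example class", salt, commit_hash))
```

Verification is only as reliable as the serialization: any difference in key order, whitespace or escaping yields a different hash, so a real verifier should follow whatever canonical form Pepclaw actually uses.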

Quality

Evidence grading — A through X.

Every finding is graded with an explicit rubric. The rubric is shipped with Pepclaw, not learned, and not opaque.

Grade	Meaning
A	Multiple independent peer-reviewed studies, replicated, with concordant readouts.
B	Single peer-reviewed study with rigorous methodology, or strong concordance from indirect sources.
C	Preprint, conference, or single-method evidence; plausible but not replicated.
D	Indirect inference, weak methodology, or fragile single-source claim.
X	Insufficient or contradictory evidence; cannot ground a thesis.
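
Because the rubric is explicit, it can be written down as plain data. The sketch below encodes the table above as an enum; the helper names are hypothetical, and ordering the grades from A (strongest) to X (weakest) is an assumption drawn from the table's layout.

```python
from enum import Enum


class Grade(Enum):
    A = "Multiple independent peer-reviewed studies, replicated, with concordant readouts."
    B = "Single peer-reviewed study with rigorous methodology, or strong concordance from indirect sources."
    C = "Preprint, conference, or single-method evidence; plausible but not replicated."
    D = "Indirect inference, weak methodology, or fragile single-source claim."
    X = "Insufficient or contradictory evidence; cannot ground a thesis."


# Enum members iterate in definition order, taken here as strongest first.
GRADE_ORDER = list(Grade)


def can_ground_thesis(grade):
    """Per the rubric, only grade X is barred from grounding a thesis."""
    return grade is not Grade.X


def strongest_first(grades):
    """Sort grades from A down to X."""
    return sorted(grades, key=GRADE_ORDER.index)


if __name__ == "__main__":
    for grade in strongest_first([Grade.X, Grade.A, Grade.C]):
        print(grade.name, can_ground_thesis(grade), grade.value)
```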

Data

Data sources — real, not mocked.

PubMed (NCBI E-utilities)
esearch + efetch, PMID-anchored citations
UniProt
Protein entries, taxonomy, function annotations
AlphaFold
Predicted structures and pLDDT confidence per residue
OpenTargets GraphQL
Target ↔ disease associations + therapeutic areas
ChEMBL REST
Targets, ligands, bioactivity priors
Reactome
Pathway membership and cross-references
Lens.org Patents
Whitespace and freedom-to-operate signals (pending)
ClinicalTrials.gov
Trial landscape (future)

Output

Dossier shape — buyer-safe markdown.

Every dossier follows the same skeleton, by construction:

## Question
<the original mission query, verbatim>

## Cross-pool consensus
- Literature ...
- Sequence/structure ...
- Target/pathway ...
- ChEMBL ligand prior ...

## Open questions
## Risks
## Recommended next steps

The dossier is deterministic given the upstream evidence. It does not invent claims, does not embed PMIDs that weren't retrieved, and never makes a human-use claim.
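
The two properties above, a fixed skeleton and no unretrieved PMIDs, can be illustrated with a short sketch. It is not the Dossier Assembler itself: the function names are hypothetical, and the `PMID:<digits>` citation format is an assumption, since the source does not show how citations are written.

```python
import re

TRAILING_SECTIONS = ("Open questions", "Risks", "Recommended next steps")

# Assumed citation format; the real dossier may cite PMIDs differently.
PMID_PATTERN = re.compile(r"PMID:\s*(\d+)")


def render_dossier(question, consensus, open_questions=(), risks=(), next_steps=()):
    """Render the fixed skeleton; the same inputs always give the same text."""
    lines = ["## Question", question, "", "## Cross-pool consensus"]
    lines += [f"- {item}" for item in consensus]
    for title, items in zip(TRAILING_SECTIONS, (open_questions, risks, next_steps)):
        lines += ["", f"## {title}"]
        lines += [f"- {item}" for item in items]
    return "\n".join(lines) + "\n"


def unretrieved_pmids(markdown, retrieved):
    """Return cited PMIDs that are missing from the retrieved set."""
    cited = set(PMID_PATTERN.findall(markdown))
    return sorted(cited - set(retrieved))


if __name__ == "__main__":
    dossier = render_dossier(
        "example question",
        ["Literature: example finding (PMID: 123)"],
        risks=["example risk"],
    )
    print(dossier)
    print(unretrieved_pmids(dossier, {"123"}))
```

A check like `unretrieved_pmids` would run before a dossier is released, failing the build if it returns anything.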

Reference

HTTP API.

POST /api/missions
Start a mission. Body: { query, target_class?, depth?, budget_cents? }. Returns 202 + mission_id.
GET /api/missions
List missions, latest 50.
GET /api/missions/:id
Mission state: tasks, findings, theses, critiques, dossiers.
GET /api/dashboard?mission_id=…
Aggregated dashboard payload (the same one /app uses).
GET /api/stream?mission_id=…
Server-Sent Events feed of swarm events. Heartbeats every ~700ms.
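
A client for these endpoints could start from the sketch below. `BASE_URL` is a placeholder because the source names no host, both function names are hypothetical, and the shape of each event payload is not documented, so the parser returns raw `data` strings and follows only the generic Server-Sent Events line format.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3000"  # placeholder; the source gives no host


def start_mission_request(query, target_class=None, depth=None,
                          budget_cents=None, base_url=BASE_URL):
    """Build (but do not send) the POST /api/missions request."""
    body = {"query": query}
    optional = {"target_class": target_class, "depth": depth,
                "budget_cents": budget_cents}
    # Optional fields are omitted entirely when not supplied.
    body.update({key: value for key, value in optional.items() if value is not None})
    return urllib.request.Request(
        f"{base_url}/api/missions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_sse(lines):
    """Yield the data payload of each complete event from text lines."""
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event.
            if data:
                yield "\n".join(data)
                data = []
        elif line.startswith(":"):
            continue  # comment line, often used for heartbeats
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


if __name__ == "__main__":
    request = start_mission_request("example question", depth=2)
    print(request.get_method(), request.full_url, request.data)
    print(list(parse_sse([": heartbeat\n", "data: first\n", "\n"])))
```

To send the request, pass it to `urllib.request.urlopen` and read `mission_id` from the JSON body of the 202 response. Whether heartbeats arrive as comment lines or as regular events is not stated in the source.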

Reasoning

kie.ai integration.

Pepclaw's reasoning agents — Thesis Generator, Red Team and Synthesizer — call POST https://api.kie.ai/codex/v1/responses with model gpt-5-4. Reasoning effort is configurable per call (low → xhigh).

POST https://api.kie.ai/codex/v1/responses
authorization: Bearer $KIE_API_KEY
content-type: application/json

{
  "model": "gpt-5-4",
  "stream": false,
  "input": [
    { "role": "user", "content": [{ "type": "input_text", "text": "..." }] }
  ],
  "reasoning": { "effort": "medium" }
}

If KIE_API_KEY is not set, Pepclaw transparently falls back to deterministic templated reasoning so the swarm still ships dossiers end-to-end. The mission ledger marks any fallback runs explicitly so replays can distinguish “model output” from “deterministic stub”.
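
The fallback behavior can be sketched as follows. The request body mirrors the example above; everything else is an assumption: the function names, the `source` labels and the stub text are hypothetical, and because the response schema is not documented here the model reply is returned as raw parsed JSON.

```python
import json
import os
import urllib.request

KIE_URL = "https://api.kie.ai/codex/v1/responses"


def build_payload(prompt, effort="medium"):
    """Build the request body in the shape shown in the example above."""
    return {
        "model": "gpt-5-4",
        "stream": False,
        "input": [
            {"role": "user", "content": [{"type": "input_text", "text": prompt}]}
        ],
        "reasoning": {"effort": effort},
    }


def templated_reasoning(prompt):
    """Hypothetical stand-in for Pepclaw's deterministic templates."""
    return f"[deterministic stub] No model call was made for: {prompt}"


def reason(prompt, effort="medium", env=None):
    """Call the model if KIE_API_KEY is set, otherwise use the stub."""
    env = os.environ if env is None else env
    api_key = env.get("KIE_API_KEY")
    if not api_key:
        # Label the result so a ledger can tell stubs from model output.
        return {"source": "deterministic_stub", "output": templated_reasoning(prompt)}
    request = urllib.request.Request(
        KIE_URL,
        data=json.dumps(build_payload(prompt, effort)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return {"source": "model_output", "output": json.load(response)}


if __name__ == "__main__":
    print(reason("example prompt", env={}))
```

Passing `env` explicitly keeps the fallback path testable without touching the real environment or the network.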