A production API tuned for APT analysis, reverse engineering, vulnerability triage, and Web3 audits. Domain-fine-tuned on top of an unrestricted base model with a security-grade RAG corpus.
https://api.ai0day.com
· Anthropic Messages API compatible
· OpenAI Chat Completions compatible
Each mode invokes a dedicated retrieval pipeline (BGE-M3 + BM25 hybrid over a curated security corpus) before generation. The model is a domain-tuned LoRA on top of GLM-5.1, served on H200 GPUs with FP8 quantization and 16K context.
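The dense-plus-lexical fusion step can be sketched as a weighted combination of normalized scores. The 0.7/0.3 weights and min-max normalization below are illustrative assumptions, not the gateway's actual configuration:

```python
def normalize(scores):
    """Min-max normalize a list of scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_rank(dense_scores, bm25_scores, dense_weight=0.7):
    """Fuse per-document dense (e.g. BGE-M3) and BM25 scores, best first.

    The weight and normalization scheme are assumptions for illustration.
    """
    d = normalize(dense_scores)
    b = normalize(bm25_scores)
    fused = [dense_weight * x + (1 - dense_weight) * y for x, y in zip(d, b)]
    return sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)

# Example: doc 1 wins on the dense score, doc 0 on BM25.
print(hybrid_rank([0.2, 0.9, 0.4], [12.0, 3.0, 8.0]))  # [1, 2, 0]
```

Real hybrid retrievers often use reciprocal-rank fusion instead of score interpolation; either way, the point is that both retrieval signals contribute before generation starts.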
mode: apt_detection
Threat-actor TTPs, MITRE ATT&CK mapping, kill-chain reconstruction, lateral-movement detection, LOLBin hunting, C2 traffic patterns.
mode: reverse_analysis
Disassembly walkthroughs, packing identification, anti-debug bypasses, decompilation guidance, malware family attribution, custom obfuscator analysis.
mode: vuln_triage
CVE analysis, CVSS breakdown, exploit conditions, PoC outlines, kernel and userspace memory-corruption review, mitigation chains.
mode: web3_audit
Solidity audits (reentrancy, access control, oracle manipulation, flash-loan vectors), proxy upgrade safety, signature malleability, MEV/front-running surface.
Access is invite-only. Both trial and paid keys are issued manually after a short conversation about your use case. We do this to keep the platform fast for actual security researchers and free of generic abuse.
Keys are delivered as sk-ai0day-… via email. Store yours securely: keys are SHA-256 hashed at rest and cannot be recovered if lost.
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [{"role": "user", "content": "Audit reentrancy in withdraw() function."}],
"mode": "web3_audit",
"max_tokens": 800,
"temperature": 0.0
}'
GET https://api.ai0day.com/health returns {"gateway":"ok","vllm":"ok","rag":"ok"} when everything is up. Wire it into your monitoring before the integration goes live.
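A monitoring probe only needs to check that every component reports ok. A minimal sketch; the three component keys are taken from the response shape above:

```python
import json

HEALTH_COMPONENTS = ("gateway", "vllm", "rag")

def is_healthy(body: str) -> bool:
    """Return True only if every component in the /health payload is 'ok'."""
    try:
        status = json.loads(body)
    except json.JSONDecodeError:
        return False
    return all(status.get(c) == "ok" for c in HEALTH_COMPONENTS)

print(is_healthy('{"gateway":"ok","vllm":"ok","rag":"ok"}'))    # True
print(is_healthy('{"gateway":"ok","vllm":"down","rag":"ok"}'))  # False
```

Treating a malformed body as unhealthy (rather than raising) keeps the probe safe to run in a tight monitoring loop.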
One product, two simple tiers. We do not meter tokens. We meter access duration. This means you can deploy long-context RAG queries (up to 16K) without watching a per-token counter.
For evaluation. One key per organization, single device.
For production research workflows.
Billing cycle is per key. We do not auto-charge; renewal is opt-in each month. If your key expires, requests return HTTP 401 key_expired until you renew.
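Because an expired key surfaces as HTTP 401 with a key_expired code, clients can distinguish "renew the key" from other auth failures. A sketch; the surrounding JSON error shape is an assumption beyond the status code and code string documented above:

```python
import json

def classify_auth_failure(status_code: int, body: str) -> str:
    """Map a response to an action: 'renew' for key_expired, 'rotate_key'
    for any other 401, 'ok' otherwise.

    The {"error": ...} body shape is an assumption for illustration;
    only the 401 status and the key_expired code are documented.
    """
    if status_code != 401:
        return "ok"
    try:
        err = json.loads(body)
    except json.JSONDecodeError:
        return "rotate_key"
    return "renew" if "key_expired" in str(err.get("error", "")) else "rotate_key"

print(classify_auth_failure(401, '{"error":"key_expired"}'))  # renew
```

Since renewal is opt-in, a CI job that hits "renew" should fail loudly rather than retry, so a human sees it.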
Claude Code, Anthropic's CLI, can be redirected to AI0Day with two environment variables. Your existing Claude Code workflow (slash commands, MCP servers, hooks) continues to work — only the model and API endpoint change.
Set two environment variables. Claude Code sends every request to AI0Day instead of api.anthropic.com. The model is selected automatically by the gateway — you don’t have to choose.
# In ~/.zshrc / ~/.bashrc
export ANTHROPIC_BASE_URL="https://api.ai0day.com"
export ANTHROPIC_AUTH_TOKEN="sk-ai0day-…"
# Restart your shell, then verify:
claude --version
claude -p "Detect APT41 lateral movement signatures in a Linux env."
The gateway auto-detects which security mode to use (apt / reverse / vuln / web3) from the query content and routes to the LoRA-tuned model. Multi-turn conversation history is supported. Tool use (function calling) and token-by-token SSE streaming are on the roadmap.
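Client-side, you can pre-classify a query the same way before deciding whether to pin a mode explicitly. A keyword-matching sketch; the keyword lists are illustrative assumptions, not the gateway's actual routing logic:

```python
# Illustrative client-side mode pre-classification. The keyword lists are
# assumptions; the gateway's real routing logic is not documented here.
MODE_KEYWORDS = {
    "apt_detection": ["apt", "ttp", "lateral movement", "c2", "mitre"],
    "reverse_analysis": ["disassemble", "unpack", "obfuscat", "decompil"],
    "vuln_triage": ["cve", "cvss", "exploit", "memory corruption"],
    "web3_audit": ["solidity", "reentrancy", "flash loan", "oracle"],
}

def guess_mode(query: str, default: str = "vuln_triage") -> str:
    """Pick the mode whose keywords best match the query; fall back to default."""
    q = query.lower()
    scores = {m: sum(k in q for k in kws) for m, kws in MODE_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default

print(guess_mode("Audit reentrancy in withdraw()"))  # web3_audit
```

If your query spans domains (say, a CVE in a Solidity toolchain), explicit mode selection beats any auto-detector, client-side or gateway-side.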
If you prefer OpenAI client conventions (works with openai Python SDK, litellm, continue.dev, etc.):
export OPENAI_BASE_URL="https://api.ai0day.com/v1"
export OPENAI_API_KEY="sk-ai0day-…"
# Python
from openai import OpenAI

cli = OpenAI()
r = cli.chat.completions.create(
    model="ai0day",  # any non-empty string; gateway picks the right model
    messages=[{"role": "user", "content": "Triage CVE-2024-3400."}],
)
print(r.choices[0].message.content)
Mode auto-detected from query. To force a specific mode, append a suffix: model="ai0day-vuln_triage" (or -apt_detection / -reverse_analysis / -web3_audit).
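The suffix convention is easy to typo, so it is worth wrapping in a tiny helper; the function name is ours, the "ai0day-&lt;mode&gt;" format is from the convention above:

```python
VALID_MODES = {"apt_detection", "reverse_analysis", "vuln_triage", "web3_audit"}

def model_for(mode=None):
    """Build the model string: "ai0day" for auto-detection,
    "ai0day-<mode>" to force one of the four documented modes."""
    if mode is None:
        return "ai0day"
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    return f"ai0day-{mode}"

print(model_for())               # ai0day
print(model_for("vuln_triage"))  # ai0day-vuln_triage
```

Failing fast on an unknown mode turns a silent mis-route into an immediate error in your own code.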
For shell scripting, CI pipelines, or non-SDK environments:
curl https://api.ai0day.com/v1/chat \
-H "Authorization: Bearer sk-ai0day-…" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{"role": "user", "content": "Reverse engineer 0x55 0x48 0x89 0xe5 in x86_64 context."}
],
"mode": "reverse_analysis",
"max_tokens": 1024,
"temperature": 0.0
}'
The mode parameter is what triggers the security RAG retrieval pipeline. Without it, you get a generic LLM response. Always set mode to one of apt_detection / reverse_analysis / vuln_triage / web3_audit.
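In scripts, a small payload builder that refuses to serialize a request without a valid mode keeps this rule from regressing; the function name is ours, the body fields mirror the curl example above:

```python
import json

VALID_MODES = ("apt_detection", "reverse_analysis", "vuln_triage", "web3_audit")

def build_request(prompt: str, mode: str, max_tokens: int = 800) -> str:
    """Serialize a /v1/chat body; raises if mode is missing or invalid,
    so no request ever goes out without the RAG pipeline engaged."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return json.dumps({
        "messages": [{"role": "user", "content": prompt}],
        "mode": mode,
        "max_tokens": max_tokens,
        "temperature": 0.0,
    })

body = build_request("Triage CVE-2024-3400.", "vuln_triage")
```

json.dumps also handles quoting, so prompts containing quotes or newlines cannot break the request body.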
Concrete workflows our users run in production. All examples use Claude Code with ANTHROPIC_BASE_URL pointing at AI0Day.
$ claude
> /apt @./samples/loader.bin
> Disassemble the entry point and identify the unpacking routine.
> If you find a custom XOR loop, decode the embedded config block
> and tell me the C2 generation algorithm.
$ claude -p "Given these process-tree events from a Linux endpoint, \
which APT group's TTPs match best? Map to MITRE ATT&CK and \
suggest detection rules I can deploy in Falco."
$ claude
> Audit contracts/Vault.sol and contracts/Strategy.sol.
> Flag every reentrancy path, oracle dependency, and access-control
> bypass. Output a severity-sorted finding list with line refs.
#!/bin/bash
# .github/workflows/triage.sh
# Build the JSON body with jq so quotes/newlines in $CVE_DESC are escaped safely.
jq -n --arg desc "$CVE_DESC" \
  '{messages:[{role:"user",content:$desc}], mode:"vuln_triage", max_tokens:1500}' \
| curl -s https://api.ai0day.com/v1/chat \
  -H "Authorization: Bearer $AI0DAY_KEY" \
  -H "Content-Type: application/json" \
  -d @- \
| jq -r '.content' > triage_report.md
A LoRA-tuned variant of GLM-5.1 (abliterated base, FP8-quantized), running on 8×H200 with vLLM. The LoRA layer is rank-32, alpha-64, trained on a curated security corpus. The base model has the standard refusal layer disabled; the LoRA does not reintroduce it. Adversarial-set refusal rate is 96.7% (vs 93.3% for the base alone) — domain training nudges it slightly safer than the bare abliterated base.
Requests are processed in-memory only. The gateway logs request metadata (timestamp, key id, mode, latency, status) for billing and abuse detection. Request bodies and response content are not persisted. The RAG corpus is read-only public security data.
P50 around 12s, P95 around 30s for typical RAG queries (rag_k=3, max_tokens=512). Long-context queries (rag_k=8, max_tokens=1024) are ~14–22s. The latency budget is dominated by H200 generation time, not network.
The default rate limit is 5 concurrent globally, 2 per key. For higher concurrency contact us — we provision a dedicated GPU pool starting at $50K/month.
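With a 2-concurrent-per-key limit, a client-side semaphore keeps batch jobs from tripping rate-limit errors. A sketch using stdlib threading; the limit value comes from the docs above, the wrapper is ours:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

PER_KEY_LIMIT = 2  # documented per-key concurrency limit
_slots = threading.Semaphore(PER_KEY_LIMIT)

def with_limit(fn, *args, **kwargs):
    """Run fn while holding one of the per-key concurrency slots."""
    with _slots:
        return fn(*args, **kwargs)

# Example: fan out 5 queries but never exceed 2 in flight.
# fake_query stands in for a real API call.
def fake_query(i):
    return f"result-{i}"

with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(lambda i: with_limit(fake_query, i), range(5)))
print(results)
```

The semaphore sits inside the worker rather than capping the pool size, so the same pattern works if one process manages multiple keys with one semaphore per key.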
Not yet. The Anthropic and OpenAI compatible endpoints accept tools in the request body but ignore them in this release. Native tool use is on the roadmap.
Email us; we disable the key within minutes and reissue a new one. Device fingerprints are tracked per key: if more than 3 distinct devices use the same key within 24 hours, it is automatically flagged.