Project Lanterna

Detecting and cataloguing unknown JA4 TLS fingerprints from autonomous AI agents.

Illuminating what arrives from unknown waters.

3,562 TLS Signatures  |  149 Unique JA4  |  72 Novel Fingerprints  |  14 AI Bots Identified

First Results: AI Agent JA4 Fingerprint Discovery

Capture period: 3–4 April 2026 (31 hours)  |  Honeypot domains: counteragent.io, gptplugins.io, projectlanterna.com

Key Finding

ClaudeBot (Anthropic) and GPTBot (OpenAI) share the same JA4 TLS fingerprint:

t13d1011h2_61a7ad8aa9b6_3fcd1a44f3e3

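Both crawlers use HTTP/2 and produce TLS ClientHello parameters that JA4 cannot tell apart: the same TLS version, the same cipher suite and extension sets, and the same signature algorithms. (JA4 sorts cipher suites and extensions before hashing, so a match does not by itself establish identical ordering.) This strongly suggests they share the same underlying HTTP client library or TLS stack. Despite being competitors, their crawlers are indistinguishable at the TLS layer by JA4 alone.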

OAI-SearchBot (ChatGPT Search) produced two JA4 fingerprints, both different from GPTBot's, which indicates a separate client implementation within OpenAI.
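To make the shared fingerprint easier to read, here is a minimal sketch that splits a JA4 string into its fields. The field layout follows the public JA4 format description as understood here (transport, TLS version, SNI flag, cipher count, extension count, ALPN, then two truncated hashes). The names `JA4Parts` and `parse_ja4` are illustrative and not part of any Lanterna tooling.

```python
from dataclasses import dataclass

# Assumed code tables, based on the public JA4 format description.
TRANSPORTS = {"t": "TCP", "q": "QUIC"}
TLS_VERSIONS = {"10": "1.0", "11": "1.1", "12": "1.2", "13": "1.3"}


@dataclass(frozen=True)
class JA4Parts:
    transport: str        # "TCP" or "QUIC" (raw code if unrecognised)
    tls_version: str      # e.g. "1.3"
    sni_present: bool     # "d" = SNI domain present, "i" = none (IP)
    cipher_count: int
    extension_count: int
    alpn: str             # first and last character of the first ALPN value, "00" if none
    cipher_hash: str      # truncated hash over the sorted cipher suites
    extension_hash: str   # truncated hash over sorted extensions plus signature algorithms


def parse_ja4(fingerprint: str) -> JA4Parts:
    """Split a JA4 string of the form a_b_c into named fields."""
    parts = fingerprint.strip().split("_")
    if len(parts) != 3 or len(parts[0]) != 10:
        raise ValueError(f"not a JA4 string: {fingerprint!r}")
    a, cipher_hash, extension_hash = parts
    return JA4Parts(
        transport=TRANSPORTS.get(a[0], a[0]),
        tls_version=TLS_VERSIONS.get(a[1:3], a[1:3]),
        sni_present=(a[3] == "d"),
        cipher_count=int(a[4:6]),
        extension_count=int(a[6:8]),
        alpn=a[8:10],
        cipher_hash=cipher_hash,
        extension_hash=extension_hash,
    )


shared = parse_ja4("t13d1011h2_61a7ad8aa9b6_3fcd1a44f3e3")
print(shared.tls_version, shared.cipher_count, shared.extension_count, shared.alpn)
```

Read this way, the ClaudeBot/GPTBot fingerprint decodes to TLS 1.3 over TCP with SNI present, 10 cipher suites, 11 extensions, and ALPN h2.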

AI Bot Fingerprint Catalogue

Bot | Operator | Category | JA4s | HTTP | Key Insight
ClaudeBot | Anthropic | AI Crawler | 1 | H2 | Single consistent JA4 across all IPs
GPTBot | OpenAI | AI Crawler | 1 | H2 | Same JA4 as ClaudeBot (shared TLS library)
OAI-SearchBot | OpenAI | AI Search | 2 | H2 | Different JA4 from GPTBot, separate client
CensysInspect | Censys/Google | Scanner | 8 | H1 | Cycles TLS 1.0–1.3 + QUIC
okhttp | Square | HTTP Library | 16 | H2 | 16 TLS variants, 22 distributed IPs
HeadlessChrome | Chromium | Automation | 9 | H1/H2 | Each version (138–145) has distinct JA4
LeakIX l9scan | LeakIX | Vuln Scanner | 16 | H1 | Probes .env, trace.axd, GraphQL
curl | curl project | CLI Tool | 14 | H1/H2 | Shares some JA4s with l9scan
Python aiohttp | aiohttp | HTTP Library | 5 | H1 | Credential scanner traffic (.env probes)
Go-http-client | Go stdlib | HTTP Library | 4 | H1/H2 | Often paired with HeadlessChrome
InternetMeasurement | Academic | Research | 12 | H1 | 12 variants from 1 IP, tests every TLS config
MJ12bot | Majestic SEO | SEO Crawler | 2 | H1 | Reads robots.txt then .well-known
Python-urllib | Python stdlib | HTTP Library | 1 | H1 | Clean single fingerprint
Dalvik | Android | Runtime | 1 | H1 | ZTE device, TLS 1.2 only

Scanner Behaviour Patterns

TLS Version Cycling

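CensysInspect, LeakIX, okhttp, and InternetMeasurement produce multiple JA4 variants by connecting with different TLS versions (1.0, 1.1, 1.2, 1.3) and, in some cases, over QUIC. Because the transport and TLS version are encoded in the first three characters of a JA4, each combination yields a separate fingerprint. A single scanner can produce 8–16 distinct fingerprints, sometimes from one IP.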
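One way to surface this cycling behaviour in a capture log is to group fingerprints by source IP and look at how many transport/version prefixes each IP produced. The sketch below assumes the log has already been reduced to (ip, ja4) pairs. The function name, the `min_variants` threshold, and the sample fingerprints are hypothetical and not taken from the Lanterna dataset.

```python
from collections import defaultdict


def version_cyclers(records, min_variants=4):
    """Return {ip: sorted transport/version prefixes} for IPs that cycle TLS versions.

    records: iterable of (ip, ja4) pairs.
    min_variants: assumed threshold for how many distinct JA4s count as cycling.
    """
    by_ip = defaultdict(set)
    for ip, ja4 in records:
        by_ip[ip].add(ja4)

    cyclers = {}
    for ip, fingerprints in by_ip.items():
        # The first three characters carry transport and TLS version, e.g. "t12" or "q13".
        prefixes = {fp[:3] for fp in fingerprints}
        if len(fingerprints) >= min_variants and len(prefixes) > 1:
            cyclers[ip] = sorted(prefixes)
    return cyclers


# Synthetic placeholder records (documentation IP range, made-up hashes).
sample = [
    ("192.0.2.10", "t10d080500_aaaaaaaaaaaa_bbbbbbbbbbbb"),
    ("192.0.2.10", "t11d080500_aaaaaaaaaaaa_bbbbbbbbbbbb"),
    ("192.0.2.10", "t12d120700_cccccccccccc_dddddddddddd"),
    ("192.0.2.10", "q13d0310h3_eeeeeeeeeeee_ffffffffffff"),
    ("192.0.2.20", "t13d1011h2_111111111111_222222222222"),
]
print(version_cyclers(sample))
```

An IP with a single stable fingerprint, like the second one in the sample, is left out of the result.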

UA Rotation Defeated by JA4

The most active novel fingerprint rotates between 3 different Chrome User-Agent strings across 10 IPs. The JA4 remains constant throughout, showing that JA4 fingerprinting can link requests from the same client even when the User-Agent is spoofed.
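The same check can be expressed as a small grouping step: collect the User-Agent strings and source IPs seen for each JA4 and flag any fingerprint that appears under more than one User-Agent. This is a sketch under assumed field names (`ja4`, `ua`, `ip`). The function name and sample records are hypothetical.

```python
from collections import defaultdict


def ua_rotators(records, min_user_agents=2):
    """Return {ja4: {"user_agents": n, "ips": m}} for JA4s seen with several User-Agents.

    records: iterable of dicts with the assumed keys "ja4", "ua" and "ip".
    """
    user_agents = defaultdict(set)
    ips = defaultdict(set)
    for rec in records:
        user_agents[rec["ja4"]].add(rec["ua"])
        ips[rec["ja4"]].add(rec["ip"])

    return {
        ja4: {"user_agents": len(uas), "ips": len(ips[ja4])}
        for ja4, uas in user_agents.items()
        if len(uas) >= min_user_agents
    }


# Synthetic placeholder records, not real capture data.
sample = [
    {"ja4": "t13d1011h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "ua": "Chrome/120", "ip": "192.0.2.1"},
    {"ja4": "t13d1011h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "ua": "Chrome/121", "ip": "192.0.2.2"},
    {"ja4": "t13d1011h2_aaaaaaaaaaaa_bbbbbbbbbbbb", "ua": "Chrome/122", "ip": "192.0.2.2"},
    {"ja4": "t12d120700_cccccccccccc_dddddddddddd", "ua": "curl/8.0", "ip": "192.0.2.3"},
]
print(ua_rotators(sample))
```

A match here is only a signal: several real browsers behind one TLS-terminating proxy can also share a JA4, so the result is a candidate list for review rather than a verdict.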

Credential Scanning

Python aiohttp traffic probes /.env, /.env.local, /.env.prod, and /config/aws.yml, hunting for exposed credentials. JA4: t12d120700_d34a8e72043a_036209cd1ead
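A request log can be tagged for this behaviour with a simple path check. The sketch below matches only the paths listed above plus other `/.env.*` variants. The function name is illustrative, and a production rule set would likely cover many more paths.

```python
from urllib.parse import urlsplit

# Exact paths taken from the probes listed above.
EXACT_PATHS = {"/.env", "/config/aws.yml"}


def is_credential_probe(request_target: str) -> bool:
    """Return True if the request path looks like a credential-file probe."""
    # Drop any query string and normalise case before matching.
    path = urlsplit(request_target).path.lower()
    return path in EXACT_PATHS or path.startswith("/.env.")


for target in ["/.env", "/.env.prod", "/config/aws.yml", "/robots.txt"]:
    print(target, is_credential_probe(target))
```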

Active Exploitation

One novel fingerprint probed /@fs/etc/passwd, which targets the Vite dev server path traversal (CVE-2025-30208), and /.git/config. It spoofed 3 browser UAs from a single TLS fingerprint.

Traffic Summary

Domain | Requests | Share | Strategy
gptplugins.io | ~3,800 | 49% | Graveyard reclaim (dead ChatGPT plugin directory)
projectlanterna.com | ~2,300 | 29% | Project homepage
counteragent.io | ~1,700 | 22% | Hallucination trap (LLM token compound)

The graveyard domain attracted roughly 2.2 times the traffic of the hallucination trap (~3,800 vs ~1,700 requests), suggesting that domains already present in LLM training data generate more organic agent traffic than novel domains.
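The shares and the 2.2 ratio follow directly from the approximate request counts in the table. As a quick check, using those rounded counts as inputs:

```python
# Approximate request counts from the traffic summary table.
requests = {
    "gptplugins.io": 3800,
    "projectlanterna.com": 2300,
    "counteragent.io": 1700,
}

total = sum(requests.values())
# Percentage share per domain, rounded to whole percent.
shares = {domain: round(100 * count / total) for domain, count in requests.items()}
# Graveyard domain traffic relative to the hallucination trap.
ratio = requests["gptplugins.io"] / requests["counteragent.io"]

print(total, shares, round(ratio, 1))
```

Because the inputs are rounded to the nearest hundred, the ratio is good to about one decimal place at best.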

Methodology

Honeypot Design

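Two domains with distinct strategies (the third, projectlanterna.com, is the project homepage):

gptplugins.io: graveyard reclaim (dead ChatGPT plugin directory)
counteragent.io: hallucination trap (LLM token compound)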

Lure endpoints: ai-plugin.json, agent.json, openid-configuration, OpenAPI spec, fake REST API, robots.txt, sitemap.xml. Client-side JS beacon collecting 40+ browser signals.

Dataset

92 JA4-to-application mappings across 14 identified bots. Available for download: