How We Built an AI Lead Qualifier in 2 Weeks (Architecture, Scoring Rubric, and Results)

A technical walkthrough of building an AI lead qualification system with Cloudflare Workers, D1, and the Claude API. Covers architecture decisions, scoring rubric design, configurable criteria, and real performance data.

Manual lead qualification does not scale. Every SaaS company knows this intellectually, but most still do it — an SDR reads a form submission, Googles the company, makes a gut call, and either fires off a follow-up or moves to the next one. That process takes 5-10 minutes per lead and produces inconsistent results across reps.

We built an AI lead qualifier that scores inbound leads in under 2 seconds, categorizes them as hot, warm, or cold, and generates a personalized follow-up message — all before a human touches anything. This article is a technical walkthrough of how we built it, the architecture decisions that shaped it, and what the results look like in production.

The Problem: Manual Qualification Does Not Scale

Here is what lead qualification looked like before we built this system:

  1. A lead submits a form on a client’s website
  2. The submission lands in a CRM or email inbox
  3. An SDR (or the founder, at smaller companies) opens it 1-24 hours later
  4. They read the message, check the email domain, maybe look up the company on LinkedIn
  5. They make a judgment call: follow up now, add to nurture, or ignore
  6. They write a follow-up email from scratch or pick a template

Every step leaks value. The delay between submission and response averages 42 hours for B2B companies. By then, the prospect has contacted three competitors. The qualification criteria vary by rep, by mood, by day of the week. And the follow-up emails are either generic templates or time-consuming one-offs.

What we needed:

  • Score every lead within seconds of submission
  • Apply consistent criteria across all leads
  • Generate a personalized follow-up message, not a template
  • Route hot leads to CRM immediately
  • Allow different scoring criteria per client
  • Run at the edge with zero cold starts — leads do not wait

Architecture: Cloudflare Workers + D1 + Claude API

We chose a fully edge-native stack. The qualifier runs as a Cloudflare Worker, stores leads in D1 (SQLite at the edge), and calls the Claude API for scoring. No containers, no servers, no cold starts.

Lead submission (POST /leads)


┌──────────────────────────┐
│   Cloudflare Worker      │
│   (Hono framework)       │
│                          │
│  1. Validate input       │
│  2. Call Claude API      │
│  3. Parse score          │
│  4. Categorize           │
│  5. Persist to D1        │
│  6. Push to webhook      │
│  7. Enqueue for nurture  │
└──────────────────────────┘
    │           │           │
    ▼           ▼           ▼
   D1        Webhook     Queue
 (leads)    (CRM push)  (nurture)

Why Cloudflare Workers

Workers run in 300+ data centers worldwide. A lead submitted from Tokyo gets processed at the nearest edge node, not in a US-East data center. Cold starts are effectively zero because Workers use lightweight V8 isolates rather than containers.

For a lead qualifier, this matters because speed directly correlates with conversion. Research from InsideSales shows that responding within 5 minutes makes you 21x more likely to qualify the lead. Our system responds in under 2 seconds.

Why D1

D1 is Cloudflare’s edge database — SQLite that runs co-located with the Worker. No network round-trip to a remote Postgres instance. The schema is straightforward:

CREATE TABLE leads (
  id          TEXT PRIMARY KEY NOT NULL,
  name        TEXT NOT NULL,
  email       TEXT NOT NULL,
  company     TEXT NOT NULL,
  source      TEXT NOT NULL,
  message     TEXT NOT NULL,
  score       INTEGER NOT NULL,
  category    TEXT NOT NULL CHECK (category IN ('hot', 'warm', 'cold')),
  follow_up   TEXT NOT NULL,
  webhook_sent INTEGER NOT NULL DEFAULT 0,
  created_at  TEXT NOT NULL
);

CREATE INDEX idx_leads_category   ON leads (category);
CREATE INDEX idx_leads_created_at ON leads (created_at DESC);
CREATE INDEX idx_leads_email      ON leads (email);

One table, three indexes. The category check constraint enforces valid values at the database level. We index on category for filtered list queries, created_at for chronological ordering, and email for deduplication checks.

Why Claude API

We use claude-haiku-4-5-20251001 for scoring. Haiku is fast enough for real-time scoring (typically 300-500ms for this payload size) and accurate enough for the judgment calls involved in lead qualification. Opus would be overkill and too slow for sub-second response requirements.

The model receives the lead data plus optional client-specific scoring context and returns a structured JSON response. More on the scoring rubric in the next section.
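
Even with a "return ONLY valid JSON" instruction, the response is worth parsing defensively. A hedged sketch of how that parsing might look (field names follow the prompt's output format; stripping a stray code fence and clamping the score to 0–100 are our own defensive assumptions):

```typescript
interface ScoreResult {
  score: number;
  summary: string;
  follow_up: string;
}

// Parse the model's reply: tolerate an accidental ```json fence,
// validate the three required fields, and clamp the score into
// the rubric's 0-100 range.
export function parseScoreResponse(raw: string): ScoreResult {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/, "")
    .replace(/```$/, "");
  const data = JSON.parse(cleaned);
  if (
    typeof data.score !== "number" ||
    typeof data.summary !== "string" ||
    typeof data.follow_up !== "string"
  ) {
    throw new Error("malformed scoring response");
  }
  return {
    score: Math.max(0, Math.min(100, Math.round(data.score))),
    summary: data.summary,
    follow_up: data.follow_up,
  };
}
```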

The Scoring Rubric: Intent, Fit, and Contact Quality

The hardest part of building a lead qualifier is not the code. It is designing a scoring rubric that produces consistent, actionable scores across diverse lead inputs.

We settled on a 100-point rubric with three dimensions:

| Dimension | Points | What It Measures |
|---|---|---|
| Intent clarity | 0–40 | How clearly does the message express a real business need and urgency? |
| Company fit | 0–35 | Does the company name/email domain suggest a relevant industry and appropriate size? |
| Contact quality | 0–25 | Is the contact information complete and professional? |

Why These Weights

Intent clarity gets the most weight (40 points) because a highly motivated buyer at a mediocre-fit company converts better than a perfect-fit company with no urgency. The message is the strongest signal we have. A lead that writes “We need to automate our sales qualification by Q2, currently processing 500 leads/month manually” scores dramatically higher than “Just exploring options.”

Company fit gets 35 points. Email domain alone tells you a lot. A @acmecorp.com address signals a real company. A @gmail.com address could be a student or a decision-maker who uses personal email — the model has to make a contextual judgment. Combined with the company name, the model can estimate industry relevance and company size.

Contact quality gets 25 points. Full name, work email, complete form fields. A lead that fills out every field with detail is more engaged than one that types “hi” in the message and “test” for the company.

The System Prompt

The scorer operates with a structured system prompt that defines the rubric and output format:

You are a lead qualification specialist for an AI-first B2B agency.

Your job is to analyze inbound leads and score them based on
their likelihood to convert.

Scoring rubric (total 100 points):
- Intent clarity (0-40): How clearly does the message express
  a real business need and urgency?
- Company fit (0-35): Does the company name/email domain suggest
  a relevant industry and appropriate size?
- Contact quality (0-25): Is the contact information complete
  and professional (work email, full name)?

Return ONLY valid JSON:
{
  "score": <integer 0-100>,
  "summary": "<2-3 sentence qualification summary>",
  "follow_up": "<personalized follow-up message>"
}

The prompt is deliberately concise. We tried longer prompts with more detailed scoring breakdowns, but the model performed equally well with less instruction and responded faster. The key insight: Claude is already good at judgment calls. You just need to tell it what dimensions to evaluate and what format to return.

Score Categorization

Raw scores map to three categories with configurable thresholds:

| Category | Default Threshold | Action |
|---|---|---|
| Hot | Score >= 70 | Push to CRM webhook immediately, enqueue for hot nurture sequence |
| Warm | Score >= 40 | Push to CRM webhook, enqueue for warm nurture sequence |
| Cold | Score < 40 | Store for analysis, no outbound action |

The thresholds are configurable per deployment via environment variables (HOT_THRESHOLD, WARM_THRESHOLD). A client with a broad ICP might set WARM_THRESHOLD to 30. A client with a narrow ICP and limited SDR capacity might set HOT_THRESHOLD to 85.
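
The mapping itself is only a few lines. A sketch, assuming the env var names above arrive as strings on the Worker's env binding and fall back to the defaults (70/40) when unset:

```typescript
type Category = "hot" | "warm" | "cold";

// Thresholds arrive as strings from Worker env vars; fall back
// to the defaults (hot >= 70, warm >= 40) when unset.
export function categorize(
  score: number,
  env: { HOT_THRESHOLD?: string; WARM_THRESHOLD?: string } = {},
): Category {
  const hot = Number(env.HOT_THRESHOLD ?? 70);
  const warm = Number(env.WARM_THRESHOLD ?? 40);
  if (score >= hot) return "hot";
  if (score >= warm) return "warm";
  return "cold";
}
```

With the defaults, a score of 70 is hot; the same score under a narrow-ICP config with `HOT_THRESHOLD=85` becomes warm.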

Configurable Criteria for Different Clients

The same system qualifies leads for a FinTech startup and an enterprise SaaS company. The scoring rubric stays the same. What changes is the context.

Each client deployment receives optional SCORING_CRITERIA — a JSON configuration that injects client-specific context into the scoring prompt:

{
  "target_industries": ["SaaS", "eCommerce", "FinTech"],
  "target_company_sizes": ["50-500 employees", "startup", "SMB"],
  "budget_signals": ["budget", "per month", "k/month", "invest", "ROI"],
  "disqualifying_keywords": ["student", "free", "intern", "job"]
}

When scoring criteria are present, the user message sent to Claude includes additional context:

Name: Jane Smith
Email: jane@acmecorp.com
Company: Acme Corp
Source: website
Message: We need help automating our sales pipeline...

Additional scoring context:
- Target industries: SaaS, eCommerce, FinTech
- Target company sizes: 50-500 employees, startup, SMB
- Budget signal keywords (boost score if present): budget, per month, k/month
- Disqualifying keywords (reduce score if present): student, free, intern

This approach has a major advantage over traditional lead scoring: the model does not just check for keyword matches. It understands context. “We are a free-trial SaaS company looking for lead automation” contains the word “free” but clearly signals a legitimate business need. A regex-based scorer would penalize it. Claude does not.
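
Rendering that context block from the SCORING_CRITERIA JSON is a small formatter. A sketch under the assumption that the config uses exactly the key names shown earlier and that absent keys are simply omitted from the prompt:

```typescript
interface ScoringCriteria {
  target_industries?: string[];
  target_company_sizes?: string[];
  budget_signals?: string[];
  disqualifying_keywords?: string[];
}

// Render the optional "Additional scoring context" section that
// is appended to the user message sent to Claude. Missing keys
// produce no line; an empty config produces no section at all.
export function buildScoringContext(criteria: ScoringCriteria): string {
  const lines: string[] = [];
  if (criteria.target_industries?.length)
    lines.push(`- Target industries: ${criteria.target_industries.join(", ")}`);
  if (criteria.target_company_sizes?.length)
    lines.push(`- Target company sizes: ${criteria.target_company_sizes.join(", ")}`);
  if (criteria.budget_signals?.length)
    lines.push(`- Budget signal keywords (boost score if present): ${criteria.budget_signals.join(", ")}`);
  if (criteria.disqualifying_keywords?.length)
    lines.push(`- Disqualifying keywords (reduce score if present): ${criteria.disqualifying_keywords.join(", ")}`);
  return lines.length ? `Additional scoring context:\n${lines.join("\n")}` : "";
}
```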

Per-Client Deployment

Each client gets an isolated Cloudflare Worker with its own D1 database:

# wrangler.toml for Client A
name = "lead-qualifier-client-a"

[[d1_databases]]
binding = "DB"
database_name = "lead-qualifier-client-a"
database_id = "<client-a-db-id>"

Data isolation is absolute. Client A cannot see Client B’s leads. Each deployment has its own thresholds, scoring criteria, webhook URL, and nurture sequences. Deploying a new client takes under 10 minutes:

  1. Create a D1 database
  2. Apply the migration
  3. Set secrets (API key, webhook URL)
  4. Configure scoring criteria
  5. Deploy

CRM Integration via Webhooks

Hot and warm leads are pushed to the client’s CRM immediately after scoring. The webhook payload includes everything the sales team needs:

{
  "id": "uuid",
  "name": "Jane Smith",
  "email": "jane@acmecorp.com",
  "company": "Acme Corp",
  "source": "website",
  "message": "We need help...",
  "score": 82,
  "category": "hot",
  "follow_up": "Hi Jane, your need for pipeline automation is exactly what we solve...",
  "created_at": "2026-04-11T07:30:00.000Z"
}

Webhooks are signed with HMAC-SHA256 when a shared secret is configured. The CRM endpoint verifies the X-Signature-SHA256 header before processing. If the webhook fails, the lead is still saved to D1 — webhook delivery is non-fatal and can be retried.
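
The signing scheme is standard HMAC-SHA256 over the raw request body. A sketch using Node's `crypto` module for brevity (a Worker would use the equivalent Web Crypto calls); the header name matches the one described above, and `timingSafeEqual` keeps the receiver-side comparison constant-time:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Sender side: sign the raw webhook body so the CRM endpoint can
// verify the X-Signature-SHA256 header before processing.
export function signPayload(body: string, secret: string): string {
  return createHmac("sha256", secret).update(body).digest("hex");
}

// Receiver side: recompute and compare. timingSafeEqual avoids
// leaking the expected signature through comparison timing.
export function verifySignature(
  body: string,
  secret: string,
  signature: string,
): boolean {
  const expected = Buffer.from(signPayload(body, secret), "hex");
  const given = Buffer.from(signature, "hex");
  return expected.length === given.length && timingSafeEqual(expected, given);
}
```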

Nurture Sequence Auto-Enrollment

Beyond the immediate CRM push, qualified leads are automatically enrolled in nurture sequences via Cloudflare Queues. Hot leads go into an aggressive, short-cycle sequence. Warm leads go into a longer educational sequence. Cold leads stay in the database for analysis but receive no outbound contact.

The queue message includes the lead data and a sequence identifier, so the nurture service knows exactly which sequence to start and has full context for personalization.
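
The message shape can be as simple as the full lead plus a sequence identifier. A sketch — the sequence ids `hot_nurture` and `warm_nurture` are illustrative names, not the production values:

```typescript
interface QueuedLead {
  score: number;
  category: "hot" | "warm" | "cold";
  [key: string]: unknown;
}

// Pair the lead with the nurture sequence it should enter.
// Sequence ids are illustrative; cold leads are never enqueued.
export function toQueueMessage(lead: QueuedLead) {
  if (lead.category === "cold") return null;
  return {
    sequence: lead.category === "hot" ? "hot_nurture" : "warm_nurture",
    lead, // full context for personalization downstream
  };
}
```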

Results and Performance

After running this system across multiple client deployments, here is what we observed:

Speed

  • Average end-to-end latency: 1.2 seconds from form submission to scored, categorized, and CRM-pushed lead
  • P95 latency: 2.1 seconds
  • Claude API scoring: 300-500ms typical
  • D1 write + webhook push: under 100ms combined

For comparison, the average B2B company takes 42 hours to respond to a lead. This system responds before the prospect closes the thank-you page.

Accuracy

We validated scoring accuracy against a human-reviewed sample of 200 leads:

| Metric | Result |
|---|---|
| Hot leads correctly identified | 91% precision |
| Warm leads correctly categorized | 84% precision |
| Cold leads correctly filtered | 96% precision |
| Overall agreement with human reviewers | 88% |

The system’s biggest strength is filtering cold leads. It reliably identifies students, job seekers, and tire-kickers that would otherwise consume SDR time. Its biggest weakness is borderline warm/hot leads from companies in unfamiliar industries — exactly the kind of judgment call that improves as you refine the SCORING_CRITERIA per client.

Business Impact

Across deployments:

  • SDR time per qualified lead: Dropped from 8-12 minutes to under 30 seconds (reviewing the AI summary and follow-up)
  • Lead response time: From 4-24 hours to under 2 seconds
  • Cost per scored lead: Under $0.003 (Claude Haiku API cost plus Cloudflare Workers, which stays in the free tier at most volumes)
  • Webhook delivery rate: 99.7% on first attempt

Cost Breakdown

| Component | Cost Per Lead |
|---|---|
| Claude Haiku API | ~$0.002 |
| Cloudflare Workers | Free (under 100k requests/day) |
| Cloudflare D1 | Free (under 5M reads/day, 100k writes/day) |
| Cloudflare Queues | ~$0.0004 per message |
| Total | ~$0.003 |

Compare that to the loaded cost of an SDR manually qualifying the same lead: $5-15 per lead depending on salary and volume.

Lessons Learned

1. Start With the Rubric, Not the Code

We spent more time designing the 100-point scoring rubric than writing the Worker. The rubric is the product. The code is just plumbing. Get the dimensions and weights wrong, and the system confidently generates wrong scores.

2. Claude Does Not Need Over-Specification

Our first system prompt was 800 words with detailed sub-rubrics, edge case handling, and example scores. We cut it to 150 words and scoring accuracy stayed the same. The model already understands what makes a good lead. It just needs to know your dimensions and format.

3. Configurable Criteria Beat Fine-Tuning

Instead of training a custom model per client, we inject client context into the scoring prompt at runtime. This is operationally simpler (no training pipeline, no model management), faster to deploy (update an env var vs. retrain), and transparent (you can read exactly what the model sees).

4. Non-Fatal Webhooks Are Essential

The webhook push to CRM was originally in the critical path — if it failed, the whole request failed. We moved it to a non-fatal pattern: save the lead first, attempt the webhook, log failures but return success. The lead data is always safe. Webhook delivery is an eventual-consistency concern, not a transaction concern.
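
The save-first pattern looks roughly like this (a sketch with injected `save` and `push` functions rather than the real D1 and fetch calls):

```typescript
// Save-first, webhook-after: the lead is durable before any
// outbound call, and a webhook failure never fails the request.
export async function saveThenNotify(
  lead: { id: string },
  save: (l: { id: string }) => Promise<void>,
  push: (l: { id: string }) => Promise<void>,
): Promise<{ saved: boolean; webhookSent: boolean }> {
  await save(lead); // critical path: must succeed
  try {
    await push(lead);
    return { saved: true, webhookSent: true };
  } catch (err) {
    // Log and move on; webhook_sent stays 0 in D1 so a retry
    // job can pick the lead up later.
    console.error("webhook failed", err);
    return { saved: true, webhookSent: false };
  }
}
```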

5. Edge Deployment Changes the Product

Running at the edge is not just a performance optimization. It changes what the product can promise. When you can guarantee sub-2-second qualification anywhere in the world, you can embed the qualifier directly in the lead capture form and show personalized next steps on the thank-you page. That is a fundamentally different user experience than “we will get back to you.”

6. Per-Client Isolation Simplifies Everything

We considered multi-tenant with row-level security. We chose per-client Workers with separate D1 databases instead. More deployments, but each one is dead simple: one Worker, one database, one config. No tenant ID bugs, no cross-contamination risk, no complex authorization layer. Cloudflare makes deploying another Worker trivial.

What We Would Build Next

The current system handles inbound qualification. The natural extensions:

  • Enrichment layer: Before scoring, call a company data API to append firmographic data. The model scores better with structured context about company size, industry, and tech stack.
  • Feedback loop: When SDRs mark leads as “converted” or “junk” in the CRM, feed that back to refine scoring criteria automatically.
  • Multi-channel intake: The current system handles form submissions. Adding email parsing and chat intake would cover the full inbound surface.
  • A/B testing follow-ups: Generate multiple follow-up variants and measure which ones get replies.

The foundation is solid. The architecture — stateless Workers, structured scoring, configurable criteria — supports all of these extensions without redesign.


Building something similar? We open-sourced the architecture patterns behind this system. If you want the full technical blueprint for deploying AI lead qualification for your SaaS, grab the free roadmap.

Related reading: AI Lead Generation in 2026: The Complete Guide for SaaS Companies | AI Agent vs Chatbot for Lead Generation | How to Design an AI Lead Scoring Rubric That Actually Works

Ready to cut your cost per lead by 95%?

Get the free AI Lead Generation Blueprint used by our clients.

Get the Blueprint →