SEOCore.

An enterprise-grade, multi-threaded SEO crawler, rule engine, and link graph analyzer — built in TypeScript for speed, compliance, and deep site health audits.

Role: Design · Engineering · Architecture
Year: 2026 to Present
Platform: Node.js · CLI · SDK
Status: Live

SEO auditing tools are either too shallow or too opaque.

Most SEO tools on the market fall into two camps: lightweight checkers that barely scratch the surface, or bloated SaaS platforms that hide their logic behind a dashboard and a monthly subscription. Neither gives developers and SEO specialists the depth, control, or transparency they need to truly understand a site's health.

The real work of SEO — crawl analysis, redirect chain tracing, structured data validation, link authority scoring, AI visibility auditing — requires a tool that can go deep and explain itself. Existing CLI tools are fragmented, single-purpose scripts. No single open-source solution combined high-performance crawling with a declarative rule engine and production-ready reporting.

Build a crawler that thinks — then teach it to explain.

SEOCore was designed as a modular, tiered auditing platform. Every decision flows from two principles: depth should be configurable, and every finding should be traceable back to the rule that produced it.

  1. Execution Tiers: From Fast to Enterprise

    Four tiers drive crawl limits, rule selection, and scoring behavior. Fast runs core rules on a single page. Standard adds performance checks and raises the limit to 100 pages. Deep enables all modules, with Playwright rendering, at up to 500 pages. Enterprise unlocks plugins and Lighthouse sampling at up to 5,000 pages.

  2. Concurrent Crawler: Rate-Limited & Resilient

    Built on a custom HTTP engine with Bottleneck rate-limiting and p-queue concurrency. Respects robots.txt, extracts sitemaps automatically, and handles retries, backoff, and timeouts gracefully.

  3. Declarative Rules: Compiled & Traceable

    A modular rule system where each audit is a declarative rule with a clear pass/fail boundary. Rules are compiled, scored, and reported with full traceability — no black-box scoring.

  4. Graph Analysis: Link Authority & Orphan Detection

    Computes in-degree, out-degree, and PageRank-style authority scores across the crawl graph. Flags orphan pages, structural dead ends, and internal linking opportunities.
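
The graph metrics in the last item can be sketched in a few dozen lines. This is an illustration under stated assumptions, not SEOCore's implementation: the `LinkGraph` shape, the function names, the 0.85 damping factor, and the fixed iteration count are all hypothetical, and "orphan" is taken to mean a crawled page with no inbound internal links other than the start URL.

```typescript
// Hypothetical input shape: each crawled URL maps to the internal URLs it links to.
type LinkGraph = Map<string, string[]>;

interface PageMetrics {
  inDegree: number;
  outDegree: number;
  authority: number; // PageRank-style score; all scores sum to roughly 1
}

function analyzeGraph(graph: LinkGraph, damping = 0.85, iterations = 30): Map<string, PageMetrics> {
  const pages = [...graph.keys()];
  const n = pages.length;

  // Keep only unique links that point at other crawled pages.
  const outLinks = new Map<string, string[]>();
  const inDegree = new Map<string, number>();
  for (const page of pages) inDegree.set(page, 0);
  for (const page of pages) {
    const targets = [...new Set(graph.get(page) ?? [])].filter((t) => t !== page && graph.has(t));
    outLinks.set(page, targets);
    for (const t of targets) inDegree.set(t, (inDegree.get(t) ?? 0) + 1);
  }

  // Power iteration, starting from a uniform distribution.
  let rank = new Map<string, number>();
  for (const page of pages) rank.set(page, 1 / n);
  for (let i = 0; i < iterations; i++) {
    // Rank sitting on dead ends (no outgoing links) is shared by every page.
    let deadEndRank = 0;
    for (const page of pages) {
      if ((outLinks.get(page) ?? []).length === 0) deadEndRank += rank.get(page) ?? 0;
    }
    const base = (1 - damping) / n + (damping * deadEndRank) / n;
    const next = new Map<string, number>();
    for (const page of pages) next.set(page, base);
    for (const page of pages) {
      const targets = outLinks.get(page) ?? [];
      const share = (damping * (rank.get(page) ?? 0)) / targets.length;
      for (const t of targets) next.set(t, (next.get(t) ?? 0) + share);
    }
    rank = next;
  }

  const metrics = new Map<string, PageMetrics>();
  for (const page of pages) {
    metrics.set(page, {
      inDegree: inDegree.get(page) ?? 0,
      outDegree: (outLinks.get(page) ?? []).length,
      authority: rank.get(page) ?? 0,
    });
  }
  return metrics;
}

// Orphans: crawled pages with no inbound internal links, excluding the start URL.
function findOrphans(metrics: Map<string, PageMetrics>, startUrl: string): string[] {
  return [...metrics].filter(([url, m]) => m.inDegree === 0 && url !== startUrl).map(([url]) => url);
}

// Dead ends: pages that link to no other crawled page.
function findDeadEnds(metrics: Map<string, PageMetrics>): string[] {
  return [...metrics].filter(([, m]) => m.outDegree === 0).map(([url]) => url);
}
```

Spreading dead-end rank across all pages keeps the scores summing to 1, which makes authority values comparable between crawls of different sizes. A production analyzer would likely iterate until convergence rather than for a fixed count.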

A monorepo built for extensibility at scale.

Nx Monorepo over Single Package

Nine packages — cli, engine, crawler, analyzers, rules, scoring, config, sdk, reporter — each with clear boundaries. Independent versioning, shared types, and parallel builds.

Cheerio + Playwright over Single Parser

Cheerio for fast static HTML parsing. Playwright as an optional tier for client-rendered SPAs. The same analyzers run against both, producing comparable results.

Zod Schema Validation over Manual Checks

All configuration is validated through Zod schemas with clear error messages. Presets for each tier ship out of the box, but every knob is exposed for customization.
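
To make the tier presets concrete, here is a dependency-free sketch of how presets and overrides might combine. The page limits come from the tier descriptions; everything else is an assumption: the field names, the idea that each tier inherits the capabilities of the one below it, and the `resolveConfig` helper. The hand-written check stands in for the Zod schema only so the example runs without installing anything.

```typescript
type Tier = "fast" | "standard" | "deep" | "enterprise";

// Hypothetical config shape; the real schema is defined with Zod.
interface AuditConfig {
  tier: Tier;
  maxPages: number;            // crawl limit for the tier
  performanceChecks: boolean;  // added from Standard upward
  browserRendering: boolean;   // Playwright rendering, from Deep upward
  plugins: boolean;            // Enterprise only
  lighthouseSampling: boolean; // Enterprise only
}

type TierSettings = Omit<AuditConfig, "tier">;

const TIER_PRESETS: Record<Tier, TierSettings> = {
  fast: { maxPages: 1, performanceChecks: false, browserRendering: false, plugins: false, lighthouseSampling: false },
  standard: { maxPages: 100, performanceChecks: true, browserRendering: false, plugins: false, lighthouseSampling: false },
  deep: { maxPages: 500, performanceChecks: true, browserRendering: true, plugins: false, lighthouseSampling: false },
  enterprise: { maxPages: 5000, performanceChecks: true, browserRendering: true, plugins: true, lighthouseSampling: true },
};

// Start from the tier preset, apply user overrides, then validate the result.
function resolveConfig(tier: Tier, overrides: Partial<TierSettings> = {}): AuditConfig {
  const config: AuditConfig = { tier, ...TIER_PRESETS[tier], ...overrides };
  if (!Number.isInteger(config.maxPages) || config.maxPages < 1) {
    throw new Error(`maxPages must be a positive integer, got ${config.maxPages}`);
  }
  return config;
}

// Example: a Standard audit capped at 25 pages.
const exampleConfig = resolveConfig("standard", { maxPages: 25 });
```

Resolving presets and overrides into one fully populated object means the rest of the pipeline never has to ask which tier it is running in; it only reads the resolved settings.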

EventBus Reporting over Direct Output

A custom EventBus decouples crawling from reporting. Terminal, JSON, HTML, and SARIF reporters all consume the same event stream — add a new format without touching the engine.
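
A minimal sketch of that decoupling, assuming a typed publish/subscribe bus. The event names, payload shapes, and reporter functions here are hypothetical and chosen only to show two reporters consuming one stream.

```typescript
// Hypothetical event map: event name -> payload type.
interface AuditEvents {
  "page:crawled": { url: string; status: number };
  "finding": { ruleId: string; url: string; severity: "error" | "warning" | "info"; message: string };
  "audit:done": { pages: number };
}

type AnyHandler = (payload: any) => void;

class EventBus<E> {
  private handlers = new Map<keyof E, Set<AnyHandler>>();

  // Subscribe to an event; returns a function that removes the subscription.
  on<K extends keyof E>(event: K, handler: (payload: E[K]) => void): () => void {
    const set = this.handlers.get(event) ?? new Set<AnyHandler>();
    set.add(handler);
    this.handlers.set(event, set);
    return () => {
      set.delete(handler);
    };
  }

  // Deliver a payload to every handler registered for the event.
  emit<K extends keyof E>(event: K, payload: E[K]): void {
    this.handlers.get(event)?.forEach((handler) => handler(payload));
  }
}

// A JSON reporter: collects findings and serializes them on demand.
function attachJsonReporter(bus: EventBus<AuditEvents>): () => string {
  const findings: AuditEvents["finding"][] = [];
  bus.on("finding", (f) => {
    findings.push(f);
  });
  return () => JSON.stringify({ findings }, null, 2);
}

// A plain-text reporter: builds one line per finding plus a summary line.
function attachTextReporter(bus: EventBus<AuditEvents>): string[] {
  const lines: string[] = [];
  bus.on("finding", (f) => {
    lines.push(`[${f.severity}] ${f.ruleId} ${f.url}: ${f.message}`);
  });
  bus.on("audit:done", (d) => {
    lines.push(`Audited ${d.pages} page(s)`);
  });
  return lines;
}
```

The engine only ever calls `emit`; it has no knowledge of which reporters are attached. Adding a format means writing another `attach...` function against the same event map.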

Fifteen audit modules. One unified report.

AI Visibility Auditor

Evaluates brand visibility across AI crawlers. Audits robots.txt, sitemaps, and llms.txt rules for GPTBot, ClaudeBot, PerplexityBot, and Google-Extended.
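
The robots.txt part of this audit can be approximated as follows. This sketch is deliberately simplified and is not the auditor's actual logic: it covers robots.txt only (not sitemaps or llms.txt), uses plain prefix matching without `*` or `$` wildcard support, and the function names are hypothetical. The four crawler tokens are the ones named above.

```typescript
interface RobotsRule { allow: boolean; path: string }
interface RobotsGroup { agents: string[]; rules: RobotsRule[] }

const AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended"];

// Parse robots.txt into groups: consecutive User-agent lines share the rules that follow.
function parseRobots(text: string): RobotsGroup[] {
  const groups: RobotsGroup[] = [];
  let current: RobotsGroup | null = null;
  let lastWasAgent = false;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.replace(/#.*$/, "").trim(); // strip comments
    const sep = line.indexOf(":");
    if (sep < 0) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      if (!current || !lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === "allow" || field === "disallow") && current) {
      // An empty Disallow value restricts nothing, so no rule is recorded.
      if (value !== "") current.rules.push({ allow: field === "allow", path: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// Longest matching prefix wins; on a tie, Allow wins. No matching rule means allowed.
function isAllowed(groups: RobotsGroup[], agent: string, path = "/"): boolean {
  const token = agent.toLowerCase();
  const specific = groups.filter((g) => g.agents.includes(token));
  const applicable = specific.length > 0 ? specific : groups.filter((g) => g.agents.includes("*"));
  let best: RobotsRule | undefined;
  for (const group of applicable) {
    for (const rule of group.rules) {
      if (!path.startsWith(rule.path)) continue;
      const longer = !best || rule.path.length > best.path.length;
      const tieAllow = !!best && rule.path.length === best.path.length && rule.allow;
      if (longer || tieAllow) best = rule;
    }
  }
  return best ? best.allow : true;
}

// Report, per AI crawler, whether the given path may be fetched.
function auditAiCrawlers(robotsTxt: string, path = "/"): Record<string, boolean> {
  const groups = parseRobots(robotsTxt);
  const result: Record<string, boolean> = {};
  for (const agent of AI_CRAWLERS) result[agent] = isAllowed(groups, agent, path);
  return result;
}
```

A crawler with its own group ignores the `*` group entirely, which is why a site can block general bots while leaving a specific AI crawler unrestricted, or the reverse.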

Structured Data Graph

Compiles Schema.org JSON-LD, Microdata, and RDFa into an Entity Graph. Detects broken references, duplicate entities, and coverage gaps.

Mobile SEO Scorer

Evaluates mobile usability, simulated Core Web Vitals, responsive design quality, and mobile-first indexing readiness with strict verification guards.

E-E-A-T Analyzer

Scores Experience, Expertise, Authoritativeness, and Trustworthiness. Analyzes readability, content structure, keyword density, and AI citation readiness.

Image Audit

Site-wide image analysis for SEO, performance, and accessibility. Covers format, size, lazy-loading, alt text, CLS risk, and LCP image detection with byte-weighted scoring.

Technology Detection

Evidence-based stack detection with deterministic confidence scores. Identifies frameworks, CDNs, CMS packages, analytics, and rendering strategies.

JavaScript SEO Impact

Compares raw HTML against rendered DOM to detect SEO-relevant changes from client-side JavaScript. Flags metadata, heading, content, and structured data parity issues.
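
As a rough sketch of the parity check, assume the SEO-relevant fields have already been extracted twice, once from the raw HTML and once from the rendered DOM. The `SeoFields` shape and `compareParity` function are hypothetical; the extraction step itself is not shown.

```typescript
// Hypothetical set of SEO-relevant fields extracted from one version of a page.
interface SeoFields {
  title: string | null;
  metaDescription: string | null;
  canonical: string | null;
  h1: string[];
  jsonLdTypes: string[]; // e.g. ["Article", "BreadcrumbList"]
}

interface ParityIssue {
  field: keyof SeoFields;
  raw: string;      // JSON-encoded value found in the raw HTML
  rendered: string; // JSON-encoded value found after JavaScript ran
}

// Report every field whose value differs between raw HTML and rendered DOM.
function compareParity(raw: SeoFields, rendered: SeoFields): ParityIssue[] {
  const fields: (keyof SeoFields)[] = ["title", "metaDescription", "canonical", "h1", "jsonLdTypes"];
  const issues: ParityIssue[] = [];
  for (const field of fields) {
    const before = JSON.stringify(raw[field]);
    const after = JSON.stringify(rendered[field]);
    if (before !== after) issues.push({ field, raw: before, rendered: after });
  }
  return issues;
}
```

Comparing serialized values keeps the check order-sensitive, so a reordered heading list counts as a change. Whether that should be flagged is a policy choice a real analyzer would need to make explicitly.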

Audit Snapshots & Diff

Save audit snapshots with --save. Compare against previous runs with --diff. CI mode fails only on regressions, integrating cleanly into deployment pipelines.
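
One way to read "fails only on regressions" is sketched below. The snapshot shape, the choice to identify a finding by rule ID plus URL, and the function names are assumptions made for illustration; only the `--save` and `--diff` flags come from the tool itself.

```typescript
interface SnapshotFinding { ruleId: string; url: string; severity: "error" | "warning" | "info" }
interface Snapshot { createdAt: string; findings: SnapshotFinding[] }

interface SnapshotDiff {
  regressions: SnapshotFinding[]; // present now, absent in the previous run
  fixed: SnapshotFinding[];       // present in the previous run, absent now
  unchanged: SnapshotFinding[];   // present in both runs
}

// Assumed identity of a finding: the rule that fired plus the page it fired on.
const findingKey = (f: SnapshotFinding): string => `${f.ruleId}|${f.url}`;

function diffSnapshots(previous: Snapshot, current: Snapshot): SnapshotDiff {
  const before = new Set(previous.findings.map(findingKey));
  const after = new Set(current.findings.map(findingKey));
  return {
    regressions: current.findings.filter((f) => !before.has(findingKey(f))),
    fixed: previous.findings.filter((f) => !after.has(findingKey(f))),
    unchanged: current.findings.filter((f) => before.has(findingKey(f))),
  };
}

// CI mode: pre-existing findings do not fail the build, new ones do.
function ciExitCode(diff: SnapshotDiff): number {
  return diff.regressions.length > 0 ? 1 : 0;
}
```

Gating on regressions rather than on total findings lets a team adopt the audit on a site with existing issues without blocking every deploy until the backlog is cleared.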

Output that fits the workflow.

SEOCore exports findings in four formats: rich colored terminal tables for quick scans, structured JSON for programmatic consumption, styled HTML reports for stakeholder sharing, and SARIF for security and compliance tooling integration.

Every report includes severity levels, actionable recommendations, and direct references to the rule that triggered each finding. The --dry-run flag lets users preview the full audit configuration before a single request is made.
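
To illustrate how a finding can point back to the rule that produced it, here is a small sketch of a declarative rule and its evaluation. The rule IDs, the `PageFacts` shape, and the 50 to 160 character threshold are hypothetical examples, not SEOCore's actual rule set.

```typescript
type Severity = "error" | "warning" | "info";

// Hypothetical facts extracted from one crawled page.
interface PageFacts { url: string; title: string | null; metaDescription: string | null }

// A rule is data plus one predicate with a clear pass/fail boundary.
interface Rule {
  id: string;
  severity: Severity;
  description: string;
  recommendation: string;
  passes: (page: PageFacts) => boolean;
}

interface RuleFinding {
  ruleId: string; // traces the finding back to the rule that triggered it
  url: string;
  severity: Severity;
  message: string;
  recommendation: string;
}

const exampleRules: Rule[] = [
  {
    id: "title-present",
    severity: "error",
    description: "Page has a non-empty title",
    recommendation: "Add a unique, descriptive title element.",
    passes: (p) => p.title !== null && p.title.trim().length > 0,
  },
  {
    id: "meta-description-length",
    severity: "warning",
    description: "Meta description is between 50 and 160 characters",
    recommendation: "Write a meta description of roughly 50 to 160 characters.",
    passes: (p) => p.metaDescription !== null && p.metaDescription.length >= 50 && p.metaDescription.length <= 160,
  },
];

// Evaluate every rule against a page and emit one finding per failed rule.
function runRules(page: PageFacts, rules: Rule[]): RuleFinding[] {
  return rules
    .filter((rule) => !rule.passes(page))
    .map((rule) => ({
      ruleId: rule.id,
      url: page.url,
      severity: rule.severity,
      message: `Failed: ${rule.description}`,
      recommendation: rule.recommendation,
    }));
}
```

Because severity and recommendation live on the rule rather than in reporter code, every output format can show the same explanation for the same finding.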
