A Playwright-based analyzer for WordPress / Elementor legacy sites that maps them to AEM / Edge Delivery Services blocks. Originally built for pharma HCP portal migrations. Adaptable to other CMSes (Gutenberg, AEM classic, Drupal, custom) by extending the detection adapter in scripts/analyze_page.mjs — see WORKFLOW.md for the “CMS adapter” story.
⚠ This is a consultant’s personal toolkit, not a general-purpose framework. It was grown iteratively for one specific migration problem (pharma HCP portals from WordPress/Elementor to EDS) and shared with others who want to take inspiration from the approach — not to be adopted wholesale. Most of the value is in the ideas and patterns, not the specific implementation. Read §Honest review before investing time in any part of this.
Before reading anything else, open these three HTML files from the frozen example-desmoid/ worked example — they’re the actual deliverables the kit produces, rendered from real data (11 pages of desmoidtumormanagement.de, WordPress + Elementor → AEM/EDS):
| File | What it shows |
|---|---|
| example-desmoid/1-site-inventory/component-inventory.html | Every unique component type detected on the site, with classification (composite/standalone), variants, atoms, and anatomy thumbnails |
| example-desmoid/1-site-inventory/site-overview.html | Page templates, site tree, template×component matrix, per-page component breakdown |
| example-desmoid/2-eds-mapping/component-mapping.html | Each legacy component mapped to an EDS block with status (direct/indirect/gap), a summary overview table, and detailed side-by-side comparison sections |
Plus per-page Playwright output under example-desmoid/1-site-inventory/pages/<slug>/: clean screenshot, annotated screenshot with colored overlays, components.json with bounding rects, and an interactive review.html.
Open these files in a browser first. Visual output answers “is this approach worth investigating?” faster than any amount of documentation — if the HTMLs don’t match what you’re looking for, you can close the tab and move on.
Terminology matters because this kit does two separate mappings and they get conflated easily:
| Phase | Mapping | Where it lives | Who maintains it |
|---|---|---|---|
| A: Detection | Raw DOM element (Elementor widget, CSS class, shape-based heuristic) → generic component type (hero-section, card, accordion, etc.) | scripts/analyze_page.mjs (WIDGET_MAP near the top + inline heuristics for composites + ANATOMY_DEFS for sub-components) | Developers, who edit code when detection is wrong or a new CMS is introduced |
| B: EDS mapping | Generic component type → EDS block (hero, cards, accordion, contact-form, etc.) | 2-eds-mapping/component-mapping.json (hand-edited via /map-to-eds) | Migration consultant, one row per (component type, variant) pair |
Between the two sits 1-site-inventory/taxonomy.json — the vocabulary of generic component types this project recognizes. It’s hand-authored per project and extended whenever analysis discovers a new type. Classifications (composite/standalone), variants, and atoms (the sub-elements inside composites) are all declared there.
So the pipeline is:
Legacy HTML ─(Phase A: WIDGET_MAP + heuristics)─► taxonomy types ─(Phase B: component-mapping.json)─► EDS blocks
Each stage, left to right, lives in: analyze_page.mjs (Phase A), taxonomy.json (the shared vocabulary), /map-to-eds (Phase B).
The two phases live in different places and are maintained differently. Phase A is code — the detection rules are JavaScript inside scripts/analyze_page.mjs, edited by developers when a pattern is missed or misclassified. Phase B is data — one row per (type, variant) pair in 2-eds-mapping/component-mapping.json, edited by the migration consultant via /map-to-eds.
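The two lookups can be sketched in a few lines. This is a minimal illustration, not the kit's code: the WIDGET_MAP entries, the MAPPING_ROWS data, and the function names detectType, mapToEds, and resolve are all hypothetical, and it assumes the `site` field of a mapping row holds the generic component type. Only the row fields (site, variant, eds, match, notes) come from the mapping schema this README describes.

```javascript
// Phase A (code): raw widget identifier -> generic taxonomy type.
// Hypothetical entries; the real WIDGET_MAP lives in scripts/analyze_page.mjs.
const WIDGET_MAP = new Map([
  ['accordion.default', 'accordion'],
  ['image-box.default', 'card'],
]);

// Phase B (data): one row per (type, variant) pair.
// Hypothetical rows; the real ones live in 2-eds-mapping/component-mapping.json.
const MAPPING_ROWS = [
  { site: 'accordion', variant: 'default', eds: 'accordion', match: 'direct', notes: '' },
  { site: 'card', variant: 'default', eds: 'cards', match: 'direct', notes: '' },
];

// Phase A lookup: returns the generic type, or null when nothing matches.
function detectType(widgetName) {
  return WIDGET_MAP.has(widgetName) ? WIDGET_MAP.get(widgetName) : null;
}

// Phase B lookup: a type with no mapping row is reported as a gap.
function mapToEds(type, variant = 'default') {
  const row = MAPPING_ROWS.find((r) => r.site === type && r.variant === variant);
  return row ? { eds: row.eds, match: row.match } : { eds: null, match: 'gap' };
}

// Chains both phases for a single legacy widget.
function resolve(widgetName, variant) {
  const type = detectType(widgetName);
  if (type === null) return { type: null, eds: null, match: 'undetected' };
  return { type, ...mapToEds(type, variant) };
}

console.log(resolve('image-box.default'));
```

Keeping the two lookups separate mirrors the ownership split: a missed widget is fixed in the Phase A table, while a wrong EDS target is fixed in the Phase B rows.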
This kit was grown iteratively for one specific problem and shared for inspiration, not adoption. Some parts are worth stealing outright; some are under-designed and need rewriting before use. The short version is below — read it before committing time to any part of the project.
Worth stealing:

- scripts/lib/render.mjs is ~25 lines. It replaces inline string concatenation (a hack) and full template engines (overkill) with a middle ground that keeps HTML readable and data logic clean. It applies to any project that generates HTML reports.
- The skills (/start-analysis, /analyze-page, /map-to-eds, /capture-eds) turn a CLI tool into a conversational workflow: ask questions, fill templates, run scripts, report. The pattern is reusable with any LLM.
- example-desmoid/ is more valuable than all the documentation combined for "what does good output look like". Keeping a fully populated example checked in (not auto-regenerated) is a pattern more projects should adopt.

Needs rewriting before use:

- scripts/analyze_page.mjs is a 1,640-line monolith. The detection heuristics grew organically (page by page, not from a spec) and the file shows it. A more serious version would split into lib/detectors/hero-section.mjs, card.mjs, etc., each a small module with a detect(dom) → result interface. Without this split, the "CMS adapter" story is aspirational rather than real.
- The component-mapping.json schema grew muddled. It started clean ({site, variant, eds, match, notes}) and accreted edsCapture, edsAuthoring, edsSample, and matchStyles as needs emerged. A stricter split into multiple files would be cleaner; the current single-file design is pragmatic but under-designed.
- The EDS capture step (capture_eds_blocks.mjs + /capture-eds) is speculative. It was designed around how the previous manual workflow felt, but has never been run against a real EDS dev server end-to-end. It is plausible but untested: treat it as a design sketch, not working code.
- WORKFLOW.md is 300+ lines, too long for inspiration reading. A 50-line "design decisions" document would be more useful for a reader who isn't planning to adopt the kit.

If you fix one thing: split analyze_page.mjs into detector modules with unit tests. That single refactor would unlock multi-CMS support, make detection regressions visible, and turn the "CMS adapter" concept from aspirational to real. Everything else listed above is secondary.
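As a rough picture of what the detector split could look like, here is a minimal sketch of the detect(dom) → result contract. Everything in it is an assumption for illustration: the node shape ({ tag, classes, children, rect }), the two detectors and their rules, and the runDetectors helper are hypothetical and are not taken from analyze_page.mjs.

```javascript
// Hypothetical detector contract: detect(dom) returns a result object or null.
// `dom` is a plain object standing in for whatever node handle a real analyzer would use.
const accordionDetector = {
  type: 'accordion',
  detect(dom) {
    // Illustrative rule: a CSS class named "accordion" marks the component.
    if (!dom.classes.includes('accordion')) return null;
    return { type: 'accordion', variant: 'default', rect: dom.rect };
  },
};

const cardDetector = {
  type: 'card',
  detect(dom) {
    // Illustrative shape-based rule: an image plus a heading as direct children.
    const hasImage = dom.children.some((c) => c.tag === 'img');
    const hasHeading = dom.children.some((c) => /^h[1-6]$/.test(c.tag));
    if (!hasImage || !hasHeading) return null;
    return { type: 'card', variant: 'default', rect: dom.rect };
  },
};

// Order matters: the first detector that returns a result wins.
const DETECTORS = [accordionDetector, cardDetector];

function runDetectors(dom, detectors = DETECTORS) {
  for (const detector of detectors) {
    const result = detector.detect(dom);
    if (result) return result;
  }
  return null;
}

// A small fixture, the kind a unit test for cardDetector might use.
const cardFixture = {
  tag: 'div',
  classes: ['elementor-widget'],
  rect: { x: 0, y: 0, width: 300, height: 200 },
  children: [
    { tag: 'img', classes: [], children: [] },
    { tag: 'h3', classes: [], children: [] },
  ],
};

console.log(runDetectors(cardFixture));
```

Because each detector is a small object with one method, it can be tested against a fixture in isolation, and supporting another CMS would mean supplying a different DETECTORS list rather than editing one large file.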
Four deliverables per project:
- component-inventory.html: every unique component type, its classification (composite/standalone), variants, atoms, and an anatomy thumbnail
- site-overview.html: page templates, site tree, template×component matrix, per-page breakdown
- component-mapping.html: each legacy component mapped to an EDS block with match classification (direct/split/indirect/gap); when sample EDS blocks are captured, also includes side-by-side detailed comparison sections
- eds-authoring-guide.md: a markdown deliverable for content editors: for each mapped block, the field structure, example values, and Google Doc table layout

Plus per-page Playwright output: clean + annotated screenshots, components.json with bounding rects, and an interactive review.html.
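The HTML deliverables are generated from templates. One way to get a small renderer that sits between string concatenation and a full template engine is plain placeholder substitution, sketched below. This is not the contents of scripts/lib/render.mjs: the `{{name}}` and `{{&name}}` placeholder syntax and the escapeHtml and render functions are assumptions made for illustration.

```javascript
// Escapes the characters that matter inside HTML text and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical syntax: {{name}} inserts an escaped value,
// {{&name}} inserts pre-built HTML without escaping.
function render(template, data) {
  return template.replace(/\{\{(&?)\s*(\w+)\s*\}\}/g, (match, raw, key) => {
    // Unknown keys are left in place so missing data is easy to spot in the output.
    if (!Object.prototype.hasOwnProperty.call(data, key)) return match;
    return raw ? String(data[key]) : escapeHtml(data[key]);
  });
}

// Data logic stays in JavaScript; the template stays readable HTML.
const rows = ['hero-section', 'card']
  .map((type) => render('<li>{{type}}</li>', { type }))
  .join('');

const page = render('<h1>{{title}}</h1><ul>{{&rows}}</ul>', {
  title: 'Components & variants',
  rows,
});

console.log(page);
```

The template never contains loops or conditionals; lists are built in code and passed in as a pre-rendered fragment, which keeps the renderer itself tiny.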
# 1. Install Playwright
npm install
npx playwright install chromium
# 2. Bootstrap a new project from the kit (interactive — asks URL, CMS, auth, pages)
/start-analysis
# 3. Analyze each page (runs Playwright analyzer + review-spec detection check)
cd <your-project>
/analyze-page home
/analyze-page about
/analyze-page products
# ... one per slug
# 4. Build inventory + overview
node ../component-mapping-kit/scripts/build_inventory.mjs
node ../component-mapping-kit/scripts/build_site_overview.mjs
# 5. Interactively map components to EDS blocks
# (also collects field structures for Phase 3)
/map-to-eds
# 6. Author the sample EDS blocks (conversational — done inside the EDS repo)
# Open your local EDS dev server, author each mapped block on a sample page.
# 7. Capture screenshots + regenerate detailed mapping + authoring guide
/capture-eds
# 8. Open the outputs
open 1-site-inventory/component-inventory.html
open 1-site-inventory/site-overview.html
open 2-eds-mapping/component-mapping.html
open 2-eds-mapping/eds-authoring-guide.md
component-mapping-kit/
├── README.md ← you are here
├── WORKFLOW.md ← detailed methodology + manual review loop
├── scripts/ ← analyze_page, capture_eds_blocks, build_*, lib/
├── templates/ ← blank config + HTML/markdown templates
├── skills/ ← /start-analysis, /analyze-page, /map-to-eds, /capture-eds
└── example-desmoid/ ← a full worked example (frozen, read-only)
├── GOAL.md
├── 1-site-inventory/
└── 2-eds-mapping/
Each project you analyze gets its own folder with two phase subfolders:
<project>/
├── GOAL.md
├── 1-site-inventory/ ← Phase 1: what's on the legacy site
│ ├── site-config.json
│ ├── taxonomy.json
│ ├── pages/<slug>/ ← per-page Playwright output
│ ├── component-inventory.html (generated)
│ └── site-overview.html (generated)
└── 2-eds-mapping/ ← Phase 2 + 3: mapping + EDS authoring
├── component-mapping.json ← mapping rules + edsCapture + edsAuthoring (edit via /map-to-eds)
├── component-mapping.html (generated)
├── eds-screenshots/ ← filled by /capture-eds
└── eds-authoring-guide.md (generated by build_authoring_guide.mjs)
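- To start a new project, run /start-analysis and let it walk you through the bootstrap.
- For the detailed methodology and manual review loop, see WORKFLOW.md.
- To analyze a single page, run /analyze-page <slug>.
- For the "CMS adapter" story, see WORKFLOW.md.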