Data Flow Integrity & CMS Architecture

Where does your content live? How does it get to the browser? The answers determine whether your site scales or breaks the moment you need to change anything.

Mpheroane Harrison · 10 min read

Most web projects start with design. A designer creates mockups, a developer converts them to HTML, and someone fills in the content later. This process produces sites where the content is structurally dependent on the design — change the design, and the content breaks. Change the content, and the design breaks.

System intelligence reverses this. The data architecture is designed first, the components consume that data, and the visual design is the result of that consumption. Content becomes independent of layout. Layout becomes independent of content source. The system is resilient.

The data flow question

For every piece of content on your site, you should be able to answer:

  1. Where is it stored? (CMS, markdown files, database, hardcoded in HTML)
  2. What format is it in? (JSON, markdown, HTML, YAML)
  3. How does it reach the browser? (server-rendered, API-fetched, statically generated)
  4. Who can edit it? (developer, client, nobody)
  5. What happens if the source changes? (rebuild required, instant update, manual update needed)

If you can't answer these five questions for every content type on your site, your data flow has integrity gaps. Those gaps become rebuilds.
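One way to make the five questions concrete is to keep an inventory with one record per content type and flag any record that leaves a question unanswered. The sketch below is illustrative only: the type names, field names, and the `integrityGaps` helper are assumptions made for this example, not part of any tool.

```typescript
// Hypothetical inventory types; each field mirrors one of the five questions.
type Storage = "cms" | "markdown" | "database" | "hardcoded";
type Format = "json" | "markdown" | "html" | "yaml";
type Delivery = "server-rendered" | "api-fetched" | "static";
type Editor = "developer" | "client" | "nobody";
type OnChange = "rebuild" | "instant" | "manual";

interface ContentFlow {
  contentType: string;
  storage?: Storage;   // 1. Where is it stored?
  format?: Format;     // 2. What format is it in?
  delivery?: Delivery; // 3. How does it reach the browser?
  editor?: Editor;     // 4. Who can edit it?
  onChange?: OnChange; // 5. What happens if the source changes?
}

const QUESTIONS = ["storage", "format", "delivery", "editor", "onChange"] as const;

// Returns the names of the questions that have no answer for this content type.
function integrityGaps(flow: ContentFlow): string[] {
  return QUESTIONS.filter((q) => flow[q] === undefined);
}

const inventory: ContentFlow[] = [
  { contentType: "service", storage: "cms", format: "json", delivery: "static", editor: "client", onChange: "rebuild" },
  { contentType: "testimonial", storage: "hardcoded", format: "html" },
];

const report = inventory.map((flow) => ({
  contentType: flow.contentType,
  gaps: integrityGaps(flow),
}));
```

In this sample inventory the testimonial record leaves three questions unanswered, which is the kind of gap the audit is meant to surface.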

Three content architectures compared

Architecture A: Hardcoded (fragile)

All content is written directly into HTML files. To change a testimonial, you edit the HTML. To add a new service, you create a new HTML file and manually add navigation links.

Integrity: None. Content is inseparable from markup. Any change requires a developer.

Common in: R3 000–R8 000 "brochure sites."

Architecture B: Coupled CMS (better, but theme-bound)

Content lives in a CMS such as WordPress or Drupal, which generates the HTML on each request. Content is editable through an admin panel, but the output is tightly coupled to the theme: changing the theme can break content formatting.

Integrity: Moderate. Content is separated from markup, but the rendering layer is fragile and theme-dependent.

Common in: R10 000–R30 000 WordPress builds.

Architecture C: Decoupled/Static (resilient)

Content lives in a CMS or markdown files. At build time, a static site generator fetches the content via API or reads the files, injects it into components, and outputs pure HTML. The result is deployed to a CDN.

Integrity: High. Content is independent of rendering. Swapping the CMS means rewriting only the code that fetches content, not the components. Swapping the frontend leaves the CMS untouched.

Common in: System-intelligent builds at any price point.

The ideal data flow

Here's the data flow I use for every system-intelligent build:

Content Source (CMS / Markdown) → API Layer (REST / GraphQL) → Build Step (Astro / Eleventy) → Static HTML (CDN)

Let me break down why each stage matters:

Content Source

This is where the client or editor actually writes content. Options include a headless CMS (Sanity and Strapi are two examples) or markdown/JSON files.

The key: the content source knows nothing about the frontend. It stores data, not markup.

API Layer

The content source exposes its data through a clean API. For a headless CMS, this is built in. For markdown/JSON files, the static site generator reads them directly (which is effectively a local API).

This layer means you could replace Sanity with Strapi, or markdown files with a database, and the rest of the system wouldn't change. The API contract is the stable interface.
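A minimal sketch of that stable interface, under some loud assumptions: `ContentSource`, `cmsSource`, `fileSource`, and the raw `CmsDocument` shape are invented for illustration and do not match any particular vendor's API. Network and disk access are left out, so each adapter receives data that has already been loaded.

```typescript
// The shape every adapter must return, whatever the backing store is.
interface ServiceSummary {
  slug: string;
  title: string;
  tagline: string;
}

// The stable contract. Consumers depend on this and nothing else.
interface ContentSource {
  listServices(): ServiceSummary[];
}

// Hypothetical raw document shape from a headless CMS (not a real vendor schema).
interface CmsDocument {
  slug: { current: string };
  heading: string;
  subheading: string;
}

// Adapter 1: normalises CMS documents into the contract.
function cmsSource(docs: CmsDocument[]): ContentSource {
  return {
    listServices: () =>
      docs.map((d) => ({ slug: d.slug.current, title: d.heading, tagline: d.subheading })),
  };
}

// Adapter 2: normalises JSON files (file name -> file contents) into the contract.
function fileSource(files: Record<string, string>): ContentSource {
  return {
    listServices: () =>
      Object.keys(files).map((name) => {
        const data = JSON.parse(files[name]) as { title: string; tagline: string };
        return { slug: name.replace(/\.json$/, ""), title: data.title, tagline: data.tagline };
      }),
  };
}

// A consumer that only knows the contract, not the source behind it.
function serviceLinks(source: ContentSource): string[] {
  return source.listServices().map((s) => `/services/${s.slug}`);
}

const fromCms = serviceLinks(
  cmsSource([
    {
      slug: { current: "web-design" },
      heading: "Web Design & Development",
      subheading: "Custom websites built as systems, not pages.",
    },
  ]),
);

const fromFiles = serviceLinks(
  fileSource({
    "web-design.json": JSON.stringify({
      title: "Web Design & Development",
      tagline: "Custom websites built as systems, not pages.",
    }),
  }),
);
```

Both calls go through the same `serviceLinks` function. Only the adapter differs, which is the property the API layer is meant to protect.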

Build Step

The static site generator fetches content from the API, maps it to components, and outputs HTML. This happens at deploy time, not request time. The result is a set of static HTML files that contain all the content — no API calls in the browser.
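Stripped of framework detail, the build step is a function from content to files. The sketch below is not how Astro or Eleventy work internally. It is a hand-rolled illustration with hypothetical names (`Page`, `renderPage`, `build`), and it returns the output as an in-memory map rather than writing to disk or deploying anything.

```typescript
interface Page {
  slug: string;
  title: string;
  body: string;
}

// Escape text before placing it into markup, so content cannot inject HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Map one content record onto a complete HTML document.
function renderPage(page: Page): string {
  return [
    "<!doctype html>",
    '<html lang="en">',
    `<head><title>${escapeHtml(page.title)}</title></head>`,
    `<body><h1>${escapeHtml(page.title)}</h1><p>${escapeHtml(page.body)}</p></body>`,
    "</html>",
  ].join("\n");
}

// Build: content in, output path -> HTML out. Runs once per deploy, never per request.
function build(pages: Page[]): Record<string, string> {
  const output: Record<string, string> = {};
  for (const page of pages) {
    output[`/${page.slug}/index.html`] = renderPage(page);
  }
  return output;
}

const site = build([
  {
    slug: "web-design",
    title: "Web Design & Development",
    body: "Custom websites built as systems, not pages.",
  },
]);
```

The generated document contains the content itself and no script that fetches it, which is the "no API calls in the browser" property described above.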

This is what gives you sub-second load times (as discussed in the performance architecture post) while still having a proper content management layer.

Static HTML on CDN

The final output: plain HTML files served from edge nodes. No server-side rendering, no database queries, no PHP execution. Just files. With nothing to compute per request, there is very little that can be slow or fail.

Content modelling before visual design

Before any component or layout work begins, I model every content type as a data structure. Here's an example for a "Service" content type:

{
  "slug": "web-design",
  "title": "Web Design & Development",
  "tagline": "Custom websites built as systems, not pages.",
  "description": "Full-service web design including strategy...",
  "icon": "laptop-code",
  "image": "/img/services/web-design.webp",
  "imageAlt": "Web design project on laptop screen",
  "priceRange": "R15 000 – R40 000",
  "features": [
    "System-intelligent architecture",
    "Performance-optimised delivery",
    "SEO built into every page",
    "Conversion-focused design"
  ],
  "process": [
    { "step": 1, "title": "Discovery", "description": "..." },
    { "step": 2, "title": "Architecture", "description": "..." },
    { "step": 3, "title": "Design", "description": "..." },
    { "step": 4, "title": "Development", "description": "..." },
    { "step": 5, "title": "Launch", "description": "..." }
  ],
  "faq": [
    { "question": "How long?", "answer": "2-4 weeks" },
    { "question": "What's included?", "answer": "..." }
  ],
  "relatedServices": ["seo", "conversion-design"],
  "ctaText": "Get a web design quote",
  "ctaLink": "/contact?service=web-design"
}

Once you have this data model, the component work is mechanical: map each field to a component prop. The design decisions become about how to render the data, not what data to render.
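As a rough sketch of that mapping, the model above can be written as a TypeScript interface, with small functions that pick out the props each component needs. The `Service` fields come from the example above. The component prop shapes (`HeroProps`, `CtaProps`) and the mapper names are hypothetical.

```typescript
interface ProcessStep {
  step: number;
  title: string;
  description: string;
}

interface FaqItem {
  question: string;
  answer: string;
}

// Mirrors the "Service" content type field for field.
interface Service {
  slug: string;
  title: string;
  tagline: string;
  description: string;
  icon: string;
  image: string;
  imageAlt: string;
  priceRange: string;
  features: string[];
  process: ProcessStep[];
  faq: FaqItem[];
  relatedServices: string[];
  ctaText: string;
  ctaLink: string;
}

// Hypothetical component props; a real design system would define its own.
interface HeroProps {
  heading: string;
  subheading: string;
  image: string;
  imageAlt: string;
}

interface CtaProps {
  label: string;
  href: string;
}

// Each mapper declares exactly which fields it reads from the model.
function toHeroProps(s: Pick<Service, "title" | "tagline" | "image" | "imageAlt">): HeroProps {
  return { heading: s.title, subheading: s.tagline, image: s.image, imageAlt: s.imageAlt };
}

function toCtaProps(s: Pick<Service, "ctaText" | "ctaLink">): CtaProps {
  return { label: s.ctaText, href: s.ctaLink };
}

const webDesign = {
  title: "Web Design & Development",
  tagline: "Custom websites built as systems, not pages.",
  image: "/img/services/web-design.webp",
  imageAlt: "Web design project on laptop screen",
  ctaText: "Get a web design quote",
  ctaLink: "/contact?service=web-design",
};

const hero = toHeroProps(webDesign);
const cta = toCtaProps(webDesign);
```

Using `Pick` keeps each mapper honest about its inputs: if a field is renamed in the content model, the compiler points at every component mapping that depended on it.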

Headless vs. coupled CMS

For South African businesses, I recommend a headless CMS in these scenarios:

I recommend markdown/JSON files in these scenarios:

I rarely recommend a coupled CMS (WordPress with themes) anymore. The performance penalty, security surface, and vendor lock-in aren't justified when better alternatives exist.

The SA scalability problem

South African businesses have a predictable growth pattern:

  1. Month 0: "We just need a simple website."
  2. Month 6: "Can we add a blog?"
  3. Month 12: "Can we add online bookings/payments?"
  4. Month 18: "We need to support English and Afrikaans/Zulu."
  5. Month 24: "Can we add a client portal?"

If your data architecture is hardcoded HTML, each of these requests requires a rebuild. If it's a coupled WordPress theme, each requires increasingly hacky plugin combinations. If it's a decoupled architecture with proper content modelling, each is an extension — adding a new content type, a new API route, or a new component — not a reconstruction.
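To illustrate what "extension, not reconstruction" can look like, here is a simplified sketch in which each content type is one entry in a registry. Everything in it (`ContentTypeDef`, `buildSite`, the sample entries) is hypothetical and much thinner than a real build, but it shows the Month 6 blog request arriving as an addition rather than a rewrite.

```typescript
interface Entry {
  slug: string;
  title: string;
}

interface ContentTypeDef {
  name: string;        // key into the content store
  routePrefix: string; // where pages of this type are published
  render: (entry: Entry) => string;
}

// Builds every registered content type into a path -> HTML map.
function buildSite(
  types: ContentTypeDef[],
  content: Record<string, Entry[]>,
): Record<string, string> {
  const pages: Record<string, string> = {};
  for (const type of types) {
    for (const entry of content[type.name] || []) {
      pages[`${type.routePrefix}/${entry.slug}`] = type.render(entry);
    }
  }
  return pages;
}

const content: Record<string, Entry[]> = {
  services: [{ slug: "web-design", title: "Web Design" }],
  posts: [{ slug: "hello", title: "Hello" }],
};

// Month 0: a single content type.
const serviceType: ContentTypeDef = {
  name: "services",
  routePrefix: "/services",
  render: (e) => `<h1>${e.title}</h1>`,
};
const month0 = buildSite([serviceType], content);

// Month 6: "Can we add a blog?" becomes one more registry entry.
const blogType: ContentTypeDef = {
  name: "posts",
  routePrefix: "/blog",
  render: (e) => `<article><h1>${e.title}</h1></article>`,
};
const month6 = buildSite([serviceType, blogType], content);
```

The service pages in `month6` are byte-for-byte the same as in `month0`. The new content type adds pages without touching the existing ones.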

The 5 data integrity rules

  1. Content must be editable without touching code. If the client needs a developer to change a testimonial, your data flow is broken.
  2. Content must be portable. You should be able to export all content as JSON/markdown and import it into a different system without data loss.
  3. Content must validate. Every field should have type constraints, required/optional flags, and sensible defaults. A missing image shouldn't crash the build.
  4. Content must be version-controlled. Every change should be trackable. Markdown files in Git handle this automatically. Most headless CMSes offer versioning.
  5. Content must be schema-aware. The content model should map directly to your structured data schema (as discussed in the structured data post). If your Service content type has a priceRange field, that same field should feed both the visible page and the Service schema markup.
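Rule 3 is the easiest of these to show in code. The following is a minimal hand-written validator, assuming a cut-down Service record with two required fields and two optional ones. The `validateService` function and the `FALLBACK_IMAGE` path are illustrative assumptions. A real project would more likely lean on the CMS's own schema or a validation library.

```typescript
interface ValidService {
  slug: string;
  title: string;
  image: string;             // optional in the source, defaulted here
  priceRange: string | null; // optional, no sensible default
}

const FALLBACK_IMAGE = "/img/placeholder.webp"; // hypothetical default asset

interface ValidationResult {
  value: ValidService | null; // null whenever errors is non-empty
  errors: string[];
}

function validateService(input: Record<string, unknown>): ValidationResult {
  const errors: string[] = [];

  // Required fields: must be present and a non-empty string.
  const required = (field: string): string => {
    const v = input[field];
    if (typeof v === "string" && v.trim() !== "") return v;
    errors.push(`${field}: required, must be a non-empty string`);
    return "";
  };

  // Optional fields: a missing value falls back to the default, a wrong type is an error.
  const optional = <F extends string | null>(field: string, fallback: F): string | F => {
    const v = input[field];
    if (v === undefined || v === null) return fallback;
    if (typeof v === "string") return v;
    errors.push(`${field}: must be a string when present`);
    return fallback;
  };

  const value: ValidService = {
    slug: required("slug"),
    title: required("title"),
    image: optional("image", FALLBACK_IMAGE),
    priceRange: optional("priceRange", null),
  };

  if (errors.length > 0) return { value: null, errors };
  return { value, errors };
}

// A record with no image still validates and picks up the default.
const good = validateService({ slug: "web-design", title: "Web Design & Development" });

// A broken record is reported with every problem listed, rather than crashing the build.
const bad = validateService({ slug: "", image: 42 });
```

Collecting all errors before returning means an editor sees every problem with a record in a single build report.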

Next: the final post in this series — Putting It All Together: The SA Web Development Blueprint — where every pillar converges into one actionable framework.