Pipeline architecture separates parsing, transformation, and output concerns cleanly.
The unified mental model
The unified ecosystem provides a processor architecture for content formats. Within it, remark is the Markdown-focused toolchain. The basic pipeline has three stages: parse, transform, and stringify (also called compile). Source Markdown becomes an AST, plugins transform the AST, and the result is serialized as Markdown, HTML, or another output format.
This architecture is powerful because each stage has a clear responsibility. The parser handles syntax. Plugins handle structured changes. The compiler handles output. Instead of mixing string manipulation, rendering, and validation in one step, unified encourages composable processing.
For production Markdown systems, this separation is a major advantage. You can add a plugin to validate headings without changing the parser. You can rewrite links before rendering. You can extract metadata for search indexing. You can generate a table of contents from heading nodes. Each operation works on the tree rather than raw text.
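The three stages can be sketched in a few lines. This is a toy model, not the unified API: the node shapes only loosely mirror mdast, the parser understands nothing beyond `#` headings and plain paragraphs, and names such as `demoteHeadings` and `run` are hypothetical.

```typescript
// Toy parse -> transform -> compile pipeline (NOT the unified API).
type Heading = { type: "heading"; depth: number; text: string };
type Paragraph = { type: "paragraph"; text: string };
type Root = { type: "root"; children: Array<Heading | Paragraph> };
type Transform = (tree: Root) => Root;

// Stage 1: syntax only. Blocks are separated by blank lines.
function parse(source: string): Root {
  const children = source
    .split(/\n{2,}/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block): Heading | Paragraph => {
      const match = /^(#{1,6})\s+(.*)$/.exec(block);
      return match
        ? { type: "heading", depth: match[1].length, text: match[2] }
        : { type: "paragraph", text: block };
    });
  return { type: "root", children };
}

// Stage 2: a structured change that works on nodes, never on raw text.
const demoteHeadings: Transform = (tree) => ({
  ...tree,
  children: tree.children.map((node) =>
    node.type === "heading"
      ? { ...node, depth: Math.min(node.depth + 1, 6) }
      : node
  ),
});

function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Stage 3: output only.
function compile(tree: Root): string {
  return tree.children
    .map((node) =>
      node.type === "heading"
        ? `<h${node.depth}>${escapeHtml(node.text)}</h${node.depth}>`
        : `<p>${escapeHtml(node.text)}</p>`
    )
    .join("\n");
}

function run(source: string, transforms: Transform[]): string {
  const tree = transforms.reduce((acc, transform) => transform(acc), parse(source));
  return compile(tree);
}

const html = run("# Title\n\nHello <world>", [demoteHeadings]);
console.log(html);
```

Adding a second transform means appending to the array passed to `run`; neither `parse` nor `compile` changes.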
remark and Markdown parsing
remark parses Markdown into mdast. From there, plugins can inspect or modify nodes. Some plugins lint. Some transform. Some bridge Markdown to HTML through hast, the HTML AST used in the same ecosystem. This makes it possible to build sophisticated pipelines while keeping each tool focused.
For example, a documentation site might parse Markdown, validate frontmatter, check links, add heading slugs, transform admonition syntax, sanitize HTML, render to HTML, and collect search metadata. In a unified pipeline, these steps can be separate plugins with predictable order.
Order matters. A plugin that rewrites links must run before rendering. A plugin that checks heading IDs may need to run after slug generation. A sanitizer belongs at the HTML (hast) stage, after Markdown has been converted and before the output is serialized. Treat the pipeline as architecture, not a bag of plugins.
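A small sketch makes the ordering dependency concrete. The two plugins below are hypothetical and do not use the unified API; they only show that a heading-ID check produces different results depending on whether slug generation has already run.

```typescript
// Hypothetical plugins on a simplified tree (NOT the unified API).
type HeadingNode = { type: "heading"; text: string; id?: string };
type Tree = { children: HeadingNode[] };
type Plugin = { name: string; run: (tree: Tree, messages: string[]) => void };

// Writes: heading.id
const addSlugs: Plugin = {
  name: "add-slugs",
  run: (tree) => {
    for (const node of tree.children) {
      node.id = node.text
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, "-")
        .replace(/^-|-$/g, "");
    }
  },
};

// Reads: heading.id, so it depends on add-slugs having run first.
const checkHeadingIds: Plugin = {
  name: "check-heading-ids",
  run: (tree, messages) => {
    const seen = new Set<string>();
    for (const node of tree.children) {
      if (!node.id) messages.push(`heading "${node.text}" has no id`);
      else if (seen.has(node.id)) messages.push(`duplicate heading id "${node.id}"`);
      else seen.add(node.id);
    }
  },
};

function runPlugins(plugins: Plugin[], tree: Tree): string[] {
  const messages: string[] = [];
  for (const plugin of plugins) plugin.run(tree, messages);
  return messages;
}

// Fresh tree per run, because add-slugs mutates its input.
const makeTree = (): Tree => ({
  children: [
    { type: "heading", text: "Install" },
    { type: "heading", text: "Install" },
  ],
});

const wrongOrder = runPlugins([checkHeadingIds, addSlugs], makeTree());
const rightOrder = runPlugins([addSlugs, checkHeadingIds], makeTree());
console.log(wrongOrder); // two "has no id" messages: the check ran too early
console.log(rightOrder); // one duplicate-id message: the real problem
```

In the wrong order the check reports a misleading problem and misses the duplicate entirely, which is why the plugin list deserves the same review as any other dependency graph.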
Plugin design
A good plugin has a narrow responsibility. It should document what nodes it reads, what nodes it writes, and which options affect behavior. It should avoid surprising global side effects. If it introduces syntax, it should clearly state whether that syntax is handled at the Markdown (parser) level, the mdast (tree) level, or the HTML level.
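As an illustration of that contract, here is a hypothetical `markExternalLinks` plugin written as a plain function factory rather than against the unified API. The header comment states what it reads, what it writes, and its single option, and the function returns a new tree instead of mutating its input.

```typescript
// Hypothetical plugin (NOT the unified API; node shapes loosely mirror mdast).
// Reads:   link nodes (url)
// Writes:  link.data.external (boolean); nothing else
// Options: internalHosts, the hostnames to treat as internal
type Text = { type: "text"; value: string };
type Link = { type: "link"; url: string; data?: { external?: boolean }; children: Text[] };
type Paragraph = { type: "paragraph"; children: Array<Text | Link> };
type Root = { type: "root"; children: Paragraph[] };

interface MarkExternalLinksOptions {
  internalHosts: string[];
}

function isExternal(url: string, internal: Set<string>): boolean {
  // URLs without "scheme://host" are relative and therefore internal.
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^/:?#]+)/i.exec(url);
  return match !== null && !internal.has(match[1].toLowerCase());
}

function markExternalLinks(options: MarkExternalLinksOptions) {
  const internal = new Set(options.internalHosts);
  // No module-level state: everything the plugin needs lives in this closure.
  return (tree: Root): Root => ({
    ...tree,
    children: tree.children.map((paragraph) => ({
      ...paragraph,
      children: paragraph.children.map((node) => {
        if (node.type !== "link") return node;
        return { ...node, data: { ...node.data, external: isExternal(node.url, internal) } };
      }),
    })),
  });
}

const input: Root = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "link", url: "/guide", children: [{ type: "text", value: "Guide" }] },
        { type: "link", url: "https://example.com/x", children: [{ type: "text", value: "Example" }] },
        { type: "link", url: "https://docs.example.org/y", children: [{ type: "text", value: "Docs" }] },
      ],
    },
  ],
};

const output = markExternalLinks({ internalHosts: ["docs.example.org"] })(input);
console.log(JSON.stringify(output, null, 2));
```

Because the only thing written is `data.external`, a later stage can decide how to render external links without this plugin knowing anything about HTML.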
Plugins can also create portability risks. If your Markdown source depends on a custom plugin, another renderer may not understand it. This is acceptable for controlled systems, but it should be documented. A plugin-based extension is still an extension to the authoring contract.
For long-term content storage, prefer plugins that transform explicit, documented syntax. Avoid hidden transformations that make source meaning depend on private code no one remembers later.
Validation pipelines
The unified ecosystem and remark are well suited to validation. A pipeline can reject documents that violate product rules before they are published. For example, a tutorial platform can require a title, summary, one top-level heading, at least three internal links, reference URLs from approved domains, and no raw HTML. These rules can be expressed structurally, as checks on tree nodes rather than on raw text.
Validation should report precise errors. If a plugin detects an image without alt text, it should include file and position. If a link is invalid, it should identify the target. Author experience matters. Strong validation with poor messages feels hostile; strong validation with clear fixes builds trust.
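The sketch below shows one way to attach file and position to every finding. It is self-contained and does not use the unified or vfile APIs; the node shape borrows the `position.start.line` and `position.start.column` fields that mdast nodes carry, and the rule names, file name, and approved host are made up for the example.

```typescript
// Hypothetical validator on mdast-like nodes (NOT the unified/vfile API).
type Point = { line: number; column: number };
type Position = { start: Point; end: Point };
type Node = { type: string; position?: Position; children?: Node[]; url?: string; alt?: string | null };
type Diagnostic = { file: string; line: number; column: number; rule: string; message: string };

function visit(node: Node, fn: (node: Node) => void): void {
  fn(node);
  for (const child of node.children ?? []) visit(child, fn);
}

function hostOf(url: string): string | null {
  const match = /^[a-z][a-z0-9+.-]*:\/\/([^/:?#]+)/i.exec(url);
  return match ? match[1].toLowerCase() : null;
}

function validate(tree: Node, file: string, approvedHosts: string[]): Diagnostic[] {
  const diagnostics: Diagnostic[] = [];
  const report = (node: Node, rule: string, message: string): void => {
    // Fall back to 1:1 for nodes without a position (for example, generated nodes).
    const start = node.position?.start ?? { line: 1, column: 1 };
    diagnostics.push({ file, line: start.line, column: start.column, rule, message });
  };
  visit(tree, (node) => {
    if (node.type === "image" && !node.alt) {
      report(node, "image-alt", `image "${node.url ?? ""}" has no alt text; add a short description`);
    }
    if (node.type === "html") {
      report(node, "no-raw-html", "raw HTML is not allowed; use Markdown syntax instead");
    }
    if (node.type === "link" && node.url) {
      const host = hostOf(node.url);
      if (host !== null && !approvedHosts.includes(host)) {
        report(node, "approved-domains", `link target "${node.url}" is not on an approved domain`);
      }
    }
  });
  return diagnostics;
}

const format = (d: Diagnostic): string => `${d.file}:${d.line}:${d.column} ${d.rule}: ${d.message}`;
const at = (line: number, column: number): Position => ({
  start: { line, column },
  end: { line, column },
});

const tree: Node = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "image", url: "diagram.png", alt: null, position: at(12, 1) },
        { type: "link", url: "https://unknown.example/x", position: at(20, 5) },
      ],
    },
  ],
};

const messages = validate(tree, "guide.md", ["docs.example.org"]).map(format);
console.log(messages.join("\n"));
```

Each message names the file, the location, the rule, the offending target, and a suggested fix, which is the difference between a hostile check and a helpful one.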
Transform pipelines and migrations
Remark pipelines are also useful for migrations. Suppose a site changes from /docs/guide to /tutorial/guide. A transform can update internal Markdown links safely. Suppose a style guide changes heading depths. A transform can adjust heading nodes. Suppose raw HTML callouts need to become custom directive syntax. A transform can rewrite them consistently.
Migrations should be tested with fixtures and reviewed like code. AST transforms are safer than regex, but they can still encode wrong assumptions. Always run transformations on representative documents and compare rendered output.
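A link migration and its fixture can be sketched as follows. The transform is hypothetical and works on simplified mdast-like nodes rather than through the unified API. It rewrites only `link` and `definition` URLs, leaving prose and inline code untouched, and it checks the character after the prefix so that `/docs/guidelines` is not mistaken for `/docs/guide`.

```typescript
// Hypothetical migration transform on mdast-like nodes (NOT the unified API).
type Node = { type: string; url?: string; value?: string; children?: Node[] };

function matchesPrefix(url: string, from: string): boolean {
  if (!url.startsWith(from)) return false;
  // Only match at a path boundary: end of string, "/", "#", or "?".
  const next = url.charAt(from.length);
  return next === "" || next === "/" || next === "#" || next === "?";
}

function rewriteLinkPrefix(node: Node, from: string, to: string): Node {
  const result: Node = { ...node };
  const hasTarget = node.type === "link" || node.type === "definition";
  if (hasTarget && node.url !== undefined && matchesPrefix(node.url, from)) {
    result.url = to + node.url.slice(from.length);
  }
  if (node.children) {
    result.children = node.children.map((child) => rewriteLinkPrefix(child, from, to));
  }
  return result;
}

// Fixture covering the cases the migration must and must not touch.
const fixture: Node = {
  type: "paragraph",
  children: [
    { type: "text", value: "See /docs/guide for details." },
    { type: "link", url: "/docs/guide", children: [{ type: "text", value: "guide" }] },
    { type: "link", url: "/docs/guide#setup", children: [] },
    { type: "link", url: "/docs/guidelines", children: [] },
    { type: "inlineCode", value: "/docs/guide" },
  ],
};

const migrated = rewriteLinkPrefix(fixture, "/docs/guide", "/tutorial/guide");
const targets = (migrated.children ?? []).map((node) => node.url ?? node.value);
console.log(targets);
```

A regex over the raw source would also have rewritten the prose sentence and the inline code span. Whether those should change is an editorial decision, and the fixture forces that decision to be made explicitly.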
Rendering and sanitization
When Markdown becomes HTML, sanitization becomes important if content is untrusted. unified pipelines can include sanitization steps, but the policy must be explicit. Sanitization is not the same as parsing. A Markdown parser can preserve or render raw HTML; a sanitizer decides what output is allowed.
For user-generated content, never assume a Markdown renderer makes HTML safe by default. Design the pipeline so security is a named stage with tests.
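To show what "a named stage with an explicit policy" can look like, here is a deliberately small allowlist sanitizer over hast-like nodes. It is an illustration only, not a vetted security control and not the API of any real package; in a real unified pipeline a maintained sanitizer such as rehype-sanitize would normally fill this role. The `Policy` shape and all names are assumptions.

```typescript
// Illustrative allowlist sanitizer on hast-like nodes.
// NOT production-grade and NOT the API of any real package.
type HText = { type: "text"; value: string };
type HElement = {
  type: "element";
  tagName: string;
  properties: Record<string, string>;
  children: HNode[];
};
type HNode = HText | HElement;

interface Policy {
  tags: string[]; // elements that may appear in output
  attributes: Record<string, string[]>; // allowed attributes per tag
  protocols: string[]; // allowed URL schemes for href
}

function hasAllowedProtocol(url: string, protocols: string[]): boolean {
  // Strip whitespace and control characters so "java\tscript:" cannot hide a scheme.
  const compact = url.replace(/[\u0000-\u0020]/g, "");
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(compact);
  // No scheme means a relative URL, which this sketch allows.
  return match === null || protocols.includes(match[1].toLowerCase());
}

function sanitize(nodes: HNode[], policy: Policy): HNode[] {
  const out: HNode[] = [];
  for (const node of nodes) {
    if (node.type === "text") {
      out.push(node);
      continue;
    }
    // Unknown elements are dropped together with their subtree.
    if (!policy.tags.includes(node.tagName)) continue;
    const allowed = policy.attributes[node.tagName] ?? [];
    const properties: Record<string, string> = {};
    for (const [name, value] of Object.entries(node.properties)) {
      if (!allowed.includes(name)) continue;
      if (name === "href" && !hasAllowedProtocol(value, policy.protocols)) continue;
      properties[name] = value;
    }
    out.push({ ...node, properties, children: sanitize(node.children, policy) });
  }
  return out;
}

const policy: Policy = {
  tags: ["p", "a", "em", "strong"],
  attributes: { a: ["href", "title"] },
  protocols: ["http", "https", "mailto"],
};

const untrusted: HNode[] = [
  {
    type: "element",
    tagName: "p",
    properties: {},
    children: [
      { type: "text", value: "Click " },
      {
        type: "element",
        tagName: "a",
        properties: { href: "java\tscript:alert(1)", onclick: "steal()" },
        children: [{ type: "text", value: "here" }],
      },
    ],
  },
  { type: "element", tagName: "script", properties: {}, children: [{ type: "text", value: "alert(1)" }] },
];

const safe = sanitize(untrusted, policy);
console.log(JSON.stringify(safe, null, 2));
```

The policy object is data, so it can be reviewed, versioned, and covered by tests like the hostile input above, independently of whichever parser produced the tree.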
Pipeline observability
Large Markdown pipelines should be observable. Log plugin versions, expose meaningful build errors, and keep fixtures for each stage. When output changes, teams need to know whether the parser, a transform plugin, the sanitizer, or the compiler caused the difference.
Observability also helps authors. A pipeline that fails with “build error” is frustrating. A pipeline that says “external link in heading section failed validation at line 32” is actionable. Good content tooling treats diagnostics as part of the user experience.
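One lightweight way to answer "which stage changed the output" is to snapshot the tree around each stage and record the stage name and version alongside a changed flag. The runner below is a hypothetical sketch, not part of unified; the stage names and version strings are invented for the example.

```typescript
// Hypothetical stage runner with a per-stage change report (NOT the unified API).
type Tree = { type: string; value?: string; children?: Tree[] };
type Stage = { name: string; version: string; run: (tree: Tree) => Tree };
type StageReport = { name: string; version: string; changed: boolean };

function runWithReport(stages: Stage[], input: Tree): { tree: Tree; report: StageReport[] } {
  const report: StageReport[] = [];
  let tree = input;
  for (const stage of stages) {
    // Serialize before running so in-place mutation is detected as well.
    const before = JSON.stringify(tree);
    tree = stage.run(tree);
    report.push({
      name: stage.name,
      version: stage.version,
      changed: JSON.stringify(tree) !== before,
    });
  }
  return { tree, report };
}

// Example transform: trims whitespace in every node that has a value.
function trimText(tree: Tree): Tree {
  const next: Tree = { ...tree };
  if (next.value !== undefined) next.value = next.value.trim();
  if (next.children) next.children = next.children.map(trimText);
  return next;
}

const stages: Stage[] = [
  { name: "lint-headings", version: "1.0.0", run: (tree) => tree },
  { name: "trim-text", version: "2.3.1", run: trimText },
];

const { tree: finalTree, report } = runWithReport(stages, {
  type: "root",
  children: [{ type: "text", value: "  hello " }],
});

const lines = report.map((r) => `${r.name}@${r.version} changed=${r.changed}`);
console.log(lines.join("\n"));
```

Storing these report lines with each build gives a quick first answer when rendered output differs after an upgrade: compare the two reports before diffing any HTML.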
Keeping pipelines maintainable
Avoid adding plugins casually. Every plugin becomes part of the content contract and upgrade surface. Prefer a small number of well-understood transforms over a large stack of overlapping plugins. Version plugin configuration with the project, and test rendered output after upgrades.
When a pipeline grows, document the order of operations in plain language. Authors and engineers should know whether validation happens before link rewriting, whether raw HTML is sanitized before rendering, and whether generated headings are available to later plugins.
FAQ
What is unified?
unified is a processor ecosystem for parsing, transforming, and compiling content through syntax trees.
What is remark?
remark is the Markdown-focused part of unified, commonly used to parse Markdown into mdast and run plugins.
Why use a pipeline for Markdown?
A pipeline separates parsing, validation, transformation, and rendering, making complex workflows easier to maintain.
Are remark plugins portable?
Plugin behavior is portable only within systems that run the same plugin. Custom plugin syntax should be documented as an extension.
Can unified help with security?
It can be part of a secure pipeline, but sanitization must be explicitly configured and tested.
Related tutorials
- Start with Markdown AST with mdast.
- Review parser constraints in Markdown Parser Implementation Theory.
- Check link transforms in Markdown Links, Images, and Reference Definitions.
- Compare dialect inputs in Markdown Dialects Compared.
- Design a full workflow in Designing Production Markdown Systems.
Continue with Markdown Dialects Compared.