
Markdown Code Spans, Fenced Code Blocks, and Raw HTML

Step 8 • Intermediate

Dissect code-related syntax and HTML interactions from parsing and security perspectives.

Code and raw HTML behavior define many rendering and sanitization boundaries.

Code as literal content

Markdown is widely used for technical writing because it makes code examples easy to include. Code syntax has a special role in the language: it marks content that should be interpreted literally rather than parsed as Markdown. This protects examples from being transformed accidentally. If you write *emphasis* inside a code span or code block, the asterisks should remain visible rather than becoming italic text.

CommonMark defines inline code spans, indented code blocks, and fenced code blocks. Each has different parsing behavior and authoring tradeoffs. Inline code spans are for short fragments inside prose. Code blocks are for multi-line examples. Fenced code blocks are usually preferred in modern technical documentation because they are explicit, support info strings, and avoid indentation ambiguity.

Inline code spans

An inline code span is delimited by backticks. The simplest form is `code`, but code spans can use longer backtick sequences when the content itself contains a backtick, as long as the opening and closing sequences have the same length. For example, a code span can be wrapped in double backticks to include a single literal backtick inside it. This is a small feature, but it matters in documentation about Markdown, shells, templates, and programming languages.
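The delimiter rule above can be sketched as a small helper for tools that generate Markdown. This is a minimal illustration, not part of any library: the function name wrap_code_span is hypothetical, and it assumes non-empty content without line breaks.

```python
import re


def wrap_code_span(text: str) -> str:
    """Wrap text in a backtick delimiter that cannot collide with its content.

    Hypothetical helper; assumes text is non-empty and has no line breaks.
    """
    # Find the longest run of backticks inside the content.
    runs = re.findall(r"`+", text)
    longest = max((len(run) for run in runs), default=0)
    # The delimiter must be longer than any run it encloses.
    delimiter = "`" * (longest + 1)

    # Pad with one space on each side when the content touches the delimiter
    # with a backtick, or when it has its own leading and trailing spaces
    # (a parser strips one space from each side in that case).
    starts_and_ends_with_space = (
        text.startswith(" ") and text.endswith(" ") and text.strip(" ") != ""
    )
    if text.startswith("`") or text.endswith("`") or starts_and_ends_with_space:
        text = f" {text} "
    return f"{delimiter}{text}{delimiter}"
```

Choosing a delimiter one backtick longer than the longest run in the content is the simplest safe choice; any length that does not occur in the content would also work.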

Code span contents are normalized according to CommonMark rules. Line endings inside a code span are converted to spaces. If the content both begins and ends with a space and is not made up entirely of spaces, one space is removed from each side, which is what lets a span start or end with a backtick. The key idea is that Markdown syntax inside the span is not parsed. A link-looking string remains literal. Emphasis markers remain literal. Entity references and backslash escapes are not interpreted either; they appear as typed.
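As a rough sketch of that normalization, following the rule as stated in recent versions of the CommonMark spec, the function below takes the raw text between the backtick delimiters and returns the literal content. The name normalize_code_span is an assumption made for illustration.

```python
def normalize_code_span(raw: str) -> str:
    """Normalize the raw text found between code span delimiters.

    Hypothetical helper sketching the CommonMark rule: line endings become
    spaces, then one space is stripped from each side if both are present
    and the content is not all spaces.
    """
    # Step 1: convert every kind of line ending to a single space.
    text = raw.replace("\r\n", " ").replace("\r", " ").replace("\n", " ")

    # Step 2: strip exactly one space from each side, but only when the
    # content both begins and ends with a space and is not entirely spaces.
    begins_and_ends_with_space = (
        len(text) >= 2 and text[0] == " " and text[-1] == " "
    )
    if begins_and_ends_with_space and text.strip(" ") != "":
        text = text[1:-1]
    return text
```

Interior spaces are left alone here; only the line endings and the single outer pair of spaces are touched.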

Inline code helps clarify exact syntax. Readers can visually distinguish commands, function names, media types, filenames, and punctuation from prose. Use inline code for exact tokens such as text/markdown, Content-Type, mdast, or cmark.

Raw HTML policy becomes more important when code examples and rendered HTML appear in the same document. If a tutorial explains HTML blocks, fenced examples should protect sample markup from being interpreted as live HTML. This connects to GitHub Flavored Markdown behavior, where raw HTML rules and post-processing are part of the platform contract.

Indented code blocks

Indented code blocks are part of original Markdown. A block indented by four or more spaces becomes code, and one level of indentation is removed in the rendered content. This is readable in plain text, but it can create accidental code blocks when prose is indented unintentionally.
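The "one level of indentation is removed" step can be illustrated with a deliberately simplified helper. It assumes indentation uses spaces only (tab handling is ignored), and the name dedent_code_line is hypothetical.

```python
def dedent_code_line(line: str) -> str:
    """Remove one level (four spaces) of indentation from a code block line.

    Simplified sketch: assumes space indentation only, no tabs.
    """
    if line.startswith("    "):
        # Deeper indentation is preserved as part of the code content.
        return line[4:]
    if line.strip(" ") == "":
        # Blank lines inside the block may carry fewer than four spaces.
        return ""
    # Anything else is not an indented code line; return it unchanged.
    return line
```

Because only the first four spaces are removed, nested indentation inside the example (for instance a Python function body) survives in the rendered output.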

Indented code blocks also interact with lists. A code block inside a list item often needs additional indentation so it belongs to the list item. This is one reason technical documentation teams often prefer fenced code blocks. Fences make the code boundary obvious even when nested.

Indented code blocks have no info string. That means they cannot directly indicate a programming language for syntax highlighting. A renderer may still guess, but guessing is less reliable than explicit metadata. In modern docs, indented code blocks are best reserved for compatibility or simple examples.

Fenced code blocks

Fenced code blocks use at least three backticks or tildes. The opening and closing fence must use the same character, and the closing fence must be at least as long as the opening fence. This allows authors to include shorter fence sequences inside longer fenced blocks when documenting Markdown itself.
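The matching rule between opening and closing fences can be sketched with two small functions. This is an illustration under assumptions, not a full parser: the function names are hypothetical, fences are allowed up to three spaces of indentation, and container blocks such as lists and blockquotes are ignored.

```python
import re

# An opening fence: up to three spaces, then 3+ backticks or 3+ tildes,
# then an optional info string.
OPENING_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")
# A closing fence: up to three spaces, a run of one fence character,
# then only spaces or tabs.
CLOSING_FENCE = re.compile(r"^ {0,3}(`+|~+)[ \t]*$")


def parse_opening_fence(line: str):
    """Return (fence_char, fence_length) if the line opens a fence, else None."""
    match = OPENING_FENCE.match(line)
    if not match:
        return None
    fence, info = match.group(1), match.group(2)
    # The info string of a backtick fence may not contain backticks.
    if fence[0] == "`" and "`" in info:
        return None
    return fence[0], len(fence)


def is_closing_fence(line: str, fence_char: str, open_length: int) -> bool:
    """Check whether the line closes a fence opened with fence_char."""
    match = CLOSING_FENCE.match(line)
    if not match:
        return False
    fence = match.group(1)
    # Same character as the opener, and at least as long.
    return fence[0] == fence_char and len(fence) >= open_length
```

With these rules, a block opened by four backticks is not closed by a line of three, which is how a Markdown example containing its own three-backtick fence can be shown intact.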

The optional info string after the opening fence is commonly used for language identification. For example, a fence with js may render with JavaScript highlighting. CommonMark does not mandate exactly how the info string becomes HTML classes, but many renderers follow similar conventions. Tooling often uses the first word as the language identifier.
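One common convention, used in the CommonMark spec's own examples but not mandated by it, is to turn the first word of the info string into a class name with a language- prefix. A minimal sketch of that convention follows; the function name language_class is an assumption.

```python
import html


def language_class(info_string: str):
    """Derive a CSS class from a fence info string, or None if it is empty.

    Sketch of a widespread convention, not a requirement of CommonMark.
    """
    words = info_string.strip().split()
    if not words:
        return None
    # Only the first word identifies the language; the rest is tool-specific.
    # Escape it because it ends up inside an HTML attribute.
    return "language-" + html.escape(words[0], quote=True)
```

Anything after the first word (titles, line-highlight ranges, and similar extras) is left for the surrounding tooling to interpret.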

Fenced code blocks can interrupt paragraphs and do not require blank lines in all cases, but style guides should still put blank lines around them. This keeps source readable and avoids edge-case confusion. For tutorial content, code fences are the most maintainable way to show commands, Markdown examples, HTML output, AST fragments, and configuration.

Raw HTML in Markdown

Raw HTML is one of Markdown’s most powerful and controversial features. Original Markdown allowed HTML because Markdown was designed for web publishing and intentionally covered only a subset of HTML. If Markdown could not express something, authors could drop down to HTML.

From a parser perspective, raw HTML introduces different block and inline behavior. An HTML block is passed through as raw content, and Markdown syntax is not processed inside it until the block's end condition is met. Inline HTML may appear inside paragraphs, where the surrounding text is still parsed as Markdown. CommonMark defines seven kinds of HTML block, each with its own start and end conditions, to make this behavior precise.

From a security perspective, raw HTML is risky in user-generated content. A renderer that converts Markdown to HTML without sanitization can expose users to script injection, unsafe attributes, tracking pixels, or layout-breaking markup. Many platforms sanitize or disable raw HTML. GitHub, for example, performs additional post-processing and sanitization after rendering.

Production guidance

A production Markdown system should define a raw HTML policy. The options are: allow it for trusted authors, sanitize it, escape it, strip it, or reject it. The right choice depends on the product. A private technical note app may allow more than a public multi-user publishing platform. A file storage app that previews private Markdown still needs care because previews run in a browser context.
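To make the policy options concrete, here is a minimal sketch of a policy switch applied to a raw HTML fragment that a parser has already identified. The enum and function names are hypothetical. The sanitize option is deliberately left out, because real sanitization needs a dedicated allowlist-based HTML sanitizer rather than a few lines of string handling.

```python
import html
from enum import Enum


class RawHtmlPolicy(Enum):
    ALLOW = "allow"    # trusted authors only
    ESCAPE = "escape"  # show the markup as visible text
    STRIP = "strip"    # drop the fragment silently
    REJECT = "reject"  # refuse the whole document


def apply_raw_html_policy(fragment: str, policy: RawHtmlPolicy) -> str:
    """Apply a raw HTML policy to one fragment found by the parser."""
    if policy is RawHtmlPolicy.ALLOW:
        return fragment
    if policy is RawHtmlPolicy.ESCAPE:
        # html.escape turns &, <, >, and quotes into character references.
        return html.escape(fragment, quote=True)
    if policy is RawHtmlPolicy.STRIP:
        return ""
    # REJECT: surface the problem to the caller instead of rendering.
    raise ValueError("raw HTML is not allowed in this document")
```

Keeping the policy as an explicit value makes it easy to use one setting for trusted authors and a stricter one for public input.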

For code blocks, prefer fenced code with explicit info strings. Decide which language identifiers are supported. If syntax highlighting is used, make sure unknown languages degrade gracefully. Avoid using indentation-based code in complex nested content unless the style guide requires it.
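Graceful degradation for unknown languages can be as simple as an allowlist with a plain-text fallback. In the sketch below, the SUPPORTED_LANGUAGES table, the "plaintext" fallback name, and the function resolve_language are all illustrative assumptions; a real system would take its list from whatever highlighter it uses.

```python
# Hypothetical allowlist mapping accepted identifiers to canonical names.
SUPPORTED_LANGUAGES = {
    "js": "javascript",
    "javascript": "javascript",
    "py": "python",
    "python": "python",
    "html": "html",
}

FALLBACK_LANGUAGE = "plaintext"


def resolve_language(info_string: str) -> str:
    """Map a fence info string to a supported language, or fall back."""
    words = info_string.strip().split()
    if not words:
        # No info string at all: render without highlighting.
        return FALLBACK_LANGUAGE
    # Unknown identifiers degrade to plain text instead of raising.
    return SUPPORTED_LANGUAGES.get(words[0].lower(), FALLBACK_LANGUAGE)
```

Falling back instead of failing means a typo in an info string costs only the highlighting, not the rendered page.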

FAQ

What is the difference between inline code and a code block?

Inline code appears inside a paragraph for short tokens. A code block is a multi-line block for examples, commands, or source code.

Why are fenced code blocks preferred?

They are explicit, support language info strings, and avoid many indentation-related mistakes.

Can Markdown syntax work inside code blocks?

No. Code block content is treated as literal text, so Markdown markers are not parsed as formatting.

Is raw HTML part of Markdown?

Many Markdown dialects support raw HTML, and CommonMark specifies HTML block behavior. Some products disable or sanitize it for security.

Should user-generated Markdown allow raw HTML?

Only with a clear security policy. Public or collaborative systems should sanitize, escape, strip, or reject unsafe HTML.

Revisit container boundaries in Markdown Lists, Blockquotes, and Container Blocks.
Continue inline literal handling in Markdown Inline Semantics.
Compare GFM HTML policy in GitHub Flavored Markdown.
Test fences and HTML blocks with CommonMark Test Suite and Dingus.
Design safe preview behavior in Designing Production Markdown Systems.

Continue with Inline Semantics: Emphasis, Escaping, and Entities.
