HTML — HyperText Markup Language
Status: 🟩 COMPLETE
Last updated: 2026-06-20
Plain-English tagline: The language web pages are written in. It describes what’s on the page and how it’s structured, not how it looks.
In plain English
HTML is a markup language. That means it doesn’t do things the way a programming language does — it just describes things. Specifically, it describes the structure and content of a web page: this is a heading, this is a paragraph, this is a list, this is a link, this is an image.
If a web page were a magazine article, HTML would be the equivalent of writing it with editorial markup — “this line is the headline, this is a sub-heading, this is body text, this is a quote, this is a photo caption.” The browser’s job is to read that markup and turn it into something humans can look at.
HTML was invented by Tim Berners-Lee in 1989–1993 at CERN, originally as a way for scientists to share documents. It is one of three core languages of the web, alongside CSS (which controls how things look) and JavaScript (which makes things interactive). HTML alone gives you the structure; the other two add the visual style and the behaviour.
The current version is often called HTML5, but the specification is now a living standard maintained by the WHATWG. It is updated continuously, so there is no longer a “version number” like HTML 4 or HTML 5.1; there is just “HTML.”
Why it matters
HTML is the substrate of the web. Every website you’ve ever visited — Wikipedia, Google, your bank, this encyclopedia rendered in a markdown viewer — ultimately becomes HTML before it reaches a browser. Frameworks like React and Next.js eventually compile down to HTML for the browser to render.
Understanding HTML matters because:
- Semantics affect accessibility and SEO. Using <button> vs <div onclick=...> is the difference between a screen reader announcing “button, submit” and announcing only the text, with no hint that it can be activated. Using <h1> correctly helps Google’s crawler understand what your page is about.
- It’s the lowest layer where you can actually look. When something looks broken, opening DevTools and reading the rendered HTML is the most reliable debugging move.
- React and JSX look like HTML. If you understand HTML, you understand 80% of JSX. (The other 20% is the gotchas; see below.)
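To make the first point concrete, here is a minimal sketch comparing a native button with a clickable div. The handler name saveDraft and the page text are invented for illustration, and the exact words a screen reader speaks vary by product.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Button vs div (illustrative)</title>
</head>
<body>
  <!-- Native button: reachable with Tab, activates with Enter or Space,
       and is exposed to assistive technology with the "button" role. -->
  <button type="button" onclick="saveDraft()">Save draft</button>

  <!-- Clickable div: responds to mouse clicks only. It has no button role,
       is skipped by Tab, and ignores Enter and Space unless you add
       all of that behaviour yourself. -->
  <div onclick="saveDraft()">Save draft</div>

  <script>
    // Hypothetical handler, defined here only so both controls do something.
    function saveDraft() {
      document.title = "Draft saved";
    }
  </script>
</body>
</html>
```

Both controls run the same function on a mouse click, which is why the problem is easy to miss: the difference only shows up for keyboard and screen reader users.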
How it works
Elements and tags
An HTML document is made of elements. Each element is written with tags that wrap content:
<p>This is a paragraph.</p>
- <p> is the opening tag.
- </p> is the closing tag.
- “This is a paragraph.” is the element’s content.
Some elements are void — they have no content and no closing tag (e.g. <br>, <img>, <input>).
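As a small sketch of the difference, the fragment below mixes ordinary and void elements. The file path and the field name are placeholders.

```html
<!-- Ordinary element: opening tag, content, closing tag.
     The <br> inside it is void: it forces a line break and holds no content. -->
<p>First line<br>second line of the same paragraph.</p>

<!-- Void elements: everything they need is carried in attributes. -->
<img src="/photo.jpg" alt="A black cat sitting on a windowsill">

<!-- <label> is an ordinary element wrapping a void <input>. -->
<label>Nickname <input type="text" name="nickname"></label>
```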
Attributes
Tags can carry attributes, which are extra information about the element:
<a href="https://example.com" target="_blank">Visit example</a>
<img src="/photo.jpg" alt="A black cat sitting on a windowsill" />
- href says where a link points.
- src says where an image file lives.
- alt describes the image for screen readers and when the image fails to load.
The page skeleton
Every HTML document has the same skeletal shape:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>My page</title>
</head>
<body>
<h1>Hello, world</h1>
<p>This is my first web page.</p>
</body>
</html>
Breaking that down line by line:
- <!DOCTYPE html>: tells the browser “this is HTML5.” Must be the first line.
- <html lang="en">: the root element of the whole document. lang helps screen readers and translation tools.
- <head>: metadata about the page that the user doesn’t see directly: title, encoding, viewport settings, stylesheets, scripts.
- <meta charset="utf-8">: the character encoding. Without this, accented characters like “é” can break. See Text encodings & UTF-8.
- <meta name="viewport" ...>: tells mobile browsers how to scale the page. Essential for responsive design.
- <title>: the text shown in the browser tab.
- <body>: everything the user actually sees.
The DOM tree
When a browser reads HTML, it builds a tree-shaped data structure in memory called the Document Object Model (DOM). Each element is a node; nested elements are children. This is what JavaScript reads and modifies when it changes a page after it loads.
html
├── head
│   ├── meta (charset)
│   ├── meta (viewport)
│   └── title
└── body
    ├── h1
    └── p
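The sketch below shows JavaScript reading and modifying that tree after the page loads. It reuses the skeleton page; the id "greeting" and the text of the added paragraph are assumptions made for the example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>My page</title>
</head>
<body>
  <h1 id="greeting">Hello, world</h1>
  <p>This is my first web page.</p>

  <!-- The script sits at the end of <body> so the nodes above already exist when it runs. -->
  <script>
    // Read: look up the h1 node in the DOM tree by its id.
    const heading = document.getElementById("greeting");

    // Modify: change the text of an existing node.
    heading.textContent = "Hello, DOM";

    // Add: create a new p node and attach it as the last child of body.
    const note = document.createElement("p");
    note.textContent = "This paragraph was added by JavaScript.";
    document.body.appendChild(note);
  </script>
</body>
</html>
```

“View Source” still shows the original markup, while the Elements panel in DevTools shows the modified tree. That gap is the difference between the HTML file and the live DOM.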
Block vs inline elements
Elements have a default display behaviour:
- Block elements take up the full width of their container and start on a new line. Examples: <div>, <p>, <h1>–<h6>, <section>, <article>, <nav>, <header>, <footer>, <ul>, <ol>, <li>.
- Inline elements flow within the surrounding text. Examples: <span>, <a>, <strong>, <em>, <img>, <code>.
You can change this behaviour with CSS (display: block, display: inline, display: flex, etc.). The HTML default is just a starting point.
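A minimal sketch of both the defaults and the override follows. The class names stacked and in-a-row are made up for this example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <title>Block vs inline (illustrative)</title>
  <style>
    /* Hypothetical class names, used only in this example. */
    .stacked  { display: block; }   /* an inline element now starts on its own line */
    .in-a-row { display: inline; }  /* a block element now flows with the text */
  </style>
</head>
<body>
  <!-- Defaults: each p starts on a new line; the spans sit inside the text. -->
  <p>A paragraph with <span>one span</span> and <span>another span</span>.</p>
  <p>A second paragraph.</p>

  <!-- Overridden: links are inline by default, but these stack vertically. -->
  <a class="stacked" href="/">Home</a>
  <a class="stacked" href="/about">About</a>

  <!-- Overridden: divs are block by default, but these share one line. -->
  <div class="in-a-row">Left</div>
  <div class="in-a-row">Right</div>
</body>
</html>
```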
A concrete example
A small but realistic page:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>George's Cat Blog</title>
</head>
<body>
<header>
<h1>George's Cat Blog</h1>
<nav>
<a href="/">Home</a>
<a href="/about">About</a>
</nav>
</header>
<main>
<article>
<h2>Why my cat is the best</h2>
<p>Published <time datetime="2026-06-19">19 June 2026</time></p>
<p>My cat is named <strong>Biscuit</strong>. He sleeps 18 hours a day.</p>
<img src="/biscuit.jpg" alt="An orange tabby cat curled up on a blue blanket" />
</article>
</main>
<footer>
<p>© 2026 George</p>
</footer>
</body>
</html>
Notice the semantic elements: <header>, <nav>, <main>, <article>, <footer>. These don’t change how things look; they tell browsers, search engines, and screen readers what each chunk of the page means. A <nav> is announced as “navigation”; a <main> lets a screen reader user jump straight to the content; an <article> tells Google “this is a self-contained piece of content.”
You could write the same page using only <div>s and, with a little CSS to restore the heading sizes, it would look the same. But the semantic version is more accessible, more discoverable, and easier to read for the next person (or LLM) who works on it.
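For comparison, here is a sketch of the body of that page rebuilt with only <div> and <span>. The class names are invented; they stand in for whatever CSS hooks such a page would need.

```html
<!-- Same content as the semantic version, but every region is a generic container. -->
<div class="header">
  <div class="title">George's Cat Blog</div>
  <div class="nav">
    <a href="/">Home</a>
    <a href="/about">About</a>
  </div>
</div>
<div class="main">
  <div class="article">
    <div class="article-title">Why my cat is the best</div>
    <div>Published 19 June 2026</div>
    <div>My cat is named <span class="bold">Biscuit</span>. He sleeps 18 hours a day.</div>
    <img src="/biscuit.jpg" alt="An orange tabby cat curled up on a blue blanket" />
  </div>
</div>
<div class="footer">
  <div>© 2026 George</div>
</div>
```

The class names hint at the intent to a human reader, but browsers and assistive technology do not interpret them: there are no headings to navigate by, no navigation or main landmark, and no machine-readable date.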
Common HTML elements — the working set
A short list of the elements that come up daily. Memorize this set and you can read 90% of any HTML page:
| Element | Purpose |
|---|---|
| <h1> to <h6> | Headings, biggest to smallest. Use one <h1> per page. |
| <p> | Paragraph |
| <a href="..."> | Anchor / link |
| <img src="..." alt="..."> | Image |
| <ul> / <ol> / <li> | Unordered (bullet) and ordered (numbered) lists; <li> is each item |
| <div> | Generic block container, for when no semantic element fits |
| <span> | Generic inline container |
| <strong> / <em> | Bold (semantically: strong importance) / italic (semantically: emphasis) |
| <button> | A button. Use this, not a clickable <div> |
| <input> | A form field |
| <form> | Wraps inputs that submit together |
| <label> | A label tied to an input (essential for accessibility) |
| <header> / <main> / <footer> / <nav> / <article> / <section> / <aside> | Semantic page regions |
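
The form-related rows work together. The fragment below is a minimal sketch of a label tied to an input inside a form; the /subscribe URL and the field name are placeholders, not a real endpoint.

```html
<form action="/subscribe" method="post">
  <!-- for="email" ties the label to the input with id="email":
       clicking the label focuses the field, and screen readers
       announce the label text when the field receives focus. -->
  <label for="email">Email address</label>
  <input id="email" name="email" type="email" required />

  <!-- Submits every named input inside this form. -->
  <button type="submit">Subscribe</button>
</form>
```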
Common gotchas
- HTML is forgiving, sometimes too forgiving. Browsers will try to render even badly-formed HTML. This makes errors easy to miss. Use the W3C validator when something looks off but you can’t tell why.
- alt is not optional. Every <img> should have one. If the image is decorative, set alt="" (an empty string) explicitly. Missing alt text breaks screen readers and hurts SEO.
- Don’t use headings as styling. <h2> is not “make this medium-sized.” It’s “this is the second-level heading of the document.” If you want big bold text, use CSS, not the wrong heading level.
- Self-closing tags differ between HTML and JSX. In plain HTML5, void elements can be written <br> or <br />. In JSX (React), you must write <br />; the slash is required. Forgetting this is a common error when moving between the two.
- Tag case sensitivity. HTML tag names are case-insensitive (<DIV> works), but JSX is case-sensitive: lowercase = HTML, capitalized = React component. So <button> is an HTML button; <Button> is a React component called Button. Mixing these up confuses both Claude and humans.
- Inline style attributes vs CSS classes. Both work, but inline styles override stylesheets and are hard to override later. Prefer CSS classes (or Tailwind utility classes) unless you have a specific reason.
- <button> defaults to type="submit" inside a form. That can accidentally submit the form. Add type="button" to any button that isn’t meant to submit.
- HTML comments leak. <!-- comment --> is visible in “View Source.” Don’t put secrets there.
- The <head> is not the same as <header>. <head> is the invisible metadata zone. <header> is the visible top section of a page or article. They are unrelated.
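Three of these gotchas are easiest to see in markup. The fragment below is a sketch only: the /search URL, the image path, and the small-heading class are all hypothetical.

```html
<form action="/search" method="get">
  <label for="q">Search</label>
  <input id="q" name="q" type="search" />

  <!-- type="button": clicking this clears the field and does NOT submit the form. -->
  <button type="button" onclick="document.getElementById('q').value = ''">Clear</button>

  <!-- type="submit" is also what a button with no type attribute gets inside a form. -->
  <button type="submit">Search</button>
</form>

<!-- Decorative image: the empty alt tells screen readers to skip it. -->
<img src="/divider.png" alt="" />

<!-- Heading level chosen for document structure; its size is set with a class,
     not by picking a different heading level. -->
<h2 class="small-heading">Related posts</h2>
```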
See also
- CSS 🟩 — the styling layer
- JavaScript 🟩 — the interactivity layer
- The DOM 🟩
- Accessibility (a11y) 🟩
- Responsive design 🟩
- React 🟩 — where JSX (HTML-flavored markup) comes in
- Next.js 🟩 🟦 — the framework that turns React into a webapp
- Glossary: DOM, HTML
Sources
- MDN — HTML reference — the canonical reference, free, maintained by Mozilla
- WHATWG HTML Living Standard — the actual spec
- W3C HTML Validator — paste in your HTML to check for errors
- web.dev — Learn HTML — Google’s free course