HTML Tutorial

Introduction to HTML

HTML - HyperText Markup Language - is the standard language for building web pages, and it's been the backbone of the web since the early 1990s even though it barely resembles what it was then. It's a markup language: you wrap content in tags to tell the browser what kind of thing each piece is - this is a heading, this is a paragraph, this is an image - and the browser uses those labels to decide how to display everything. Every webpage you've ever loaded is built on HTML underneath whatever CSS styling and JavaScript are running on top of it.

What is HTML?

  • HyperText - Text that can link to other text - the hyper part refers to the non-linear nature of following links, which was a genuinely new idea when the web was invented.
  • Markup Language - Uses tags to annotate content within a document, defining structure and meaning rather than appearance.
  • HTML Document - A file with a .html extension containing HTML that browsers read and render as a visual page.

HTML Document Structure: The Foundation of Everything

Every HTML file follows the same basic skeleton: four required pieces (the DOCTYPE, the html root, the head, and the body) that tell the browser what kind of document it is, where the metadata lives, and where the visible content goes. Getting this structure right before adding anything else is the most important habit to build early. Browsers are forgiving enough to render broken structure, which can make it seem optional, but wrong structure can cause subtle rendering inconsistencies, accessibility failures, and SEO problems that become harder to diagnose the more complex the page gets.

html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Your Descriptive Page Title Here</title>
    <link rel="stylesheet" href="styles.css">
</head>
<body>
    <h1>My Main Heading</h1>
    <p>My first paragraph.</p>
    <script src="scripts.js"></script>
</body>
</html>

Key Structure Elements

  • DOCTYPE html - Tells the browser to render in standards mode rather than quirks mode. Must be the absolute first line, nothing before it.
  • html lang - The root container for everything. The lang attribute tells screen readers and search engines what language the content is in.
  • head - Contains metadata - page title, character encoding, CSS links, viewport settings. Never visible on the page.
  • body - Contains everything users actually see and interact with. All page content goes here.

HTML Elements and Tags

HTML elements are the individual labeled pieces of content that make up a page - headings, paragraphs, images, links - and each one is defined by its tags. The opening tag tells the browser the element is starting, the closing tag says it ends, and the content lives between them. Some elements are void elements like img and br that have no content and therefore no closing tag. Elements can also be nested inside each other, which is how you build the hierarchical structure that makes a page more than a flat list of content.

html
<!-- Element with opening and closing tags -->
<p>This is a paragraph element</p>

<!-- Void element (no content, no closing tag) -->
<img src="image.jpg" alt="Description">

<!-- Nested elements -->
<div>
    <h2>Section Heading</h2>
    <p>Paragraph inside a div</p>
</div>

Element Structure

  • Opening Tag - Marks the beginning of an element: the tag name inside angle brackets.
  • Closing Tag - Marks the end of an element: same as opening tag but with a forward slash before the name.
  • Content - Whatever lives between opening and closing tags - text, other elements, or both.
  • Element - The complete package: opening tag, content, and closing tag together.
  • Void Elements - Elements with no content that don't have or need a closing tag: img, br, hr, input, meta, link.
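As a small illustration of the terms above, the fragment below mixes ordinary elements with three of the listed void elements. The text and the field name nickname are placeholders invented for this sketch.

```html
<!-- Ordinary element: opening tag, content, closing tag -->
<!-- br is void: it breaks the line and has nothing to wrap -->
<p>First line of an address<br>
Second line after the break</p>

<!-- hr is void: it marks a thematic break and has no content -->
<hr>

<!-- input is void: everything it needs is carried by attributes -->
<label for="nickname">Nickname:</label>
<input type="text" id="nickname" name="nickname">
```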

HTML Attributes

Attributes are configuration options that go inside the opening tag and change what an element does or how it behaves. The href on a link says where the link goes. The src on an image says which file to load. The class on a div says which CSS styles apply. Most follow the same syntax: attribute name, equals sign, value in quotes. Double quotes are the convention, though single quotes are also valid. Boolean attributes such as required and controls are the exception: their presence alone switches the behavior on, so they take no value. An element can have multiple attributes in any order.

html
<!-- Attributes provide additional information -->
<a href="https://example.com" title="Visit Example">Click Here</a>

<img src="photo.jpg" alt="A beautiful landscape" width="500" height="300">

<div class="container" id="main-content" data-category="tutorial">
    Content here
</div>

Common Attributes

  • href - The destination URL for links. Used on anchor and link elements.
  • src - The file path or URL for media to load. Used on img, script, audio, and video.
  • alt - Alternative text for images. Screen readers read this aloud; browsers show it if the image fails to load.
  • id - Unique identifier for one element. No two elements on the same page should share an id.
  • class - One or more class names for CSS styling. Multiple elements can share a class.
  • style - Inline CSS written directly on the element. Useful for testing but CSS files are better for real projects.
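A few of these attributes behave in ways the list can only hint at, so here is a short sketch: several class names on one element, an inline style, and boolean attributes, which take no value at all. The class names, id values, and text are made up for illustration.

```html
<!-- Several class names on one element, separated by spaces -->
<p class="note warning" id="disk-space-note">Disk space is running low.</p>

<!-- Inline style: handy for a quick test, better moved to a CSS file later -->
<p style="color: darkred;">Temporary styling while experimenting.</p>

<!-- Boolean attributes such as checked and disabled take no value: -->
<!-- their presence alone switches the behavior on -->
<input type="checkbox" id="updates" name="updates" checked disabled>
<label for="updates">Email me about updates</label>
```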

Headings and Paragraphs

Headings and paragraphs are the most used text elements - you'll have them on essentially every page. Headings h1 through h6 create a hierarchy from most to least important. The level should reflect the document structure rather than how big you want the text to look - sizing is CSS's job - so avoid skipping levels just to get a smaller font. One h1 per page is the standard recommendation because it gives search engines and screen reader users a single clear statement of what the page is about.

html
<!-- Headings define hierarchy -->
<h1>Main Title (Most Important)</h1>
<h2>Subheading</h2>
<h3>Section Heading</h3>
<h4>Sub-section Heading</h4>
<h5>Minor Heading</h5>
<h6>Least Important Heading</h6>

<!-- Paragraphs for text content -->
<p>This is a paragraph of text. It can contain multiple sentences and displays as a block with default spacing above and below.</p>

<p>Another paragraph. Each gets automatic spacing from the browser's default styles.</p>
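To make the point about structure concrete, here is a sketch of how heading levels might map onto the outline of a real page. The topic and headings are invented for illustration; what matters is that each level nests under the one above it and none is skipped.

```html
<h1>Growing Tomatoes at Home</h1>
<p>One h1 that names the whole page.</p>

<h2>Choosing a Variety</h2>
<p>An h2 for each major section.</p>

<h3>Cherry Tomatoes</h3>
<p>An h3 for a subsection of "Choosing a Variety".</p>

<h3>Beefsteak Tomatoes</h3>
<p>A sibling subsection at the same level.</p>

<h2>Planting and Watering</h2>
<p>Back up to h2 for the next major section.</p>
```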

Text Formatting Tags

HTML has a set of inline elements for formatting text within paragraphs. The difference within each pair matters here: strong marks content as important and em marks stress emphasis, while b and i only set text apart stylistically (keywords, product names, foreign phrases, technical terms) without implying importance. By default b and strong both render bold and i and em both render italic, so the difference is in meaning rather than appearance. It's worth learning early because choosing the right element is part of writing accessible HTML.

html
<!-- Text formatting examples -->
<p>This text contains <b>bold</b>, <strong>strong</strong>, <i>italic</i>, and <em>emphasized</em> text.</p>

<p>You can also <mark>highlight text</mark>, show <small>smaller text</small>, or mark <del>deleted text</del>.</p>

<p>Chemical and mathematical formulas: H<sub>2</sub>O and E = mc<sup>2</sup>.</p>

<p><ins>Inserted text</ins> and <u>underlined text</u> have different semantic meanings.</p>
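Since b and strong (and i and em) look identical by default, it can help to see each one where it arguably fits best. The sentences below are invented examples, offered as a rough guide rather than hard rules.

```html
<!-- strong: the warning is important -->
<p><strong>Warning:</strong> unplug the device before opening the case.</p>

<!-- em: stress that changes the meaning of the sentence -->
<p>I asked for the <em>blue</em> folder, not the red one.</p>

<!-- i: a term set apart from the surrounding text, with no extra emphasis -->
<p>The house cat, <i>Felis catus</i>, sleeps most of the day.</p>

<!-- b: draws the eye to a keyword without marking it as important -->
<p>Add the <b>flour</b>, then fold in the <b>butter</b>.</p>
```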

HTML Lists

HTML has three list types. Unordered lists are for collections where order doesn't matter. Ordered lists are for sequences where order is meaningful - steps, rankings, instructions. Description lists are the most overlooked and are specifically for term-description pairs like glossaries, FAQs, and product specs. Unordered and ordered lists use li for each item; description lists instead use dt for terms and dd for descriptions.

html
<!-- Unordered list (bulleted) -->
<ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
</ul>

<!-- Ordered list (numbered) -->
<ol>
    <li>First item</li>
    <li>Second item</li>
    <li>Third item</li>
</ol>

<!-- Description list -->
<dl>
    <dt>HTML</dt>
    <dd>HyperText Markup Language</dd>

    <dt>CSS</dt>
    <dd>Cascading Style Sheets</dd>
</dl>

HTML Tables

Tables are for tabular data - information with genuine row-and-column relationships where both directions of the grid carry meaning. They got a bad reputation from years of being misused for page layout, which is now done with CSS flexbox and grid, but for actual data like comparison charts, schedules, and specifications they're the right tool. The basic structure is table wrapping tr rows, with th for header cells and td for data cells.

html
<!-- Basic table structure -->
<table border="1">
    <tr>
        <th>Name</th>
        <th>Age</th>
        <th>Country</th>
    </tr>
    <tr>
        <td>John</td>
        <td>25</td>
        <td>USA</td>
    </tr>
    <tr>
        <td>Maria</td>
        <td>32</td>
        <td>Spain</td>
    </tr>
</table>

<!-- Table with colspan and rowspan -->
<table border="1">
    <tr>
        <th colspan="2">Name</th>
        <th>Age</th>
    </tr>
    <tr>
        <td>John</td>
        <td>Doe</td>
        <td rowspan="2">25</td>
    </tr>
    <tr>
        <td>Jane</td>
        <td>Smith</td>
    </tr>
</table>
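The tables above use only table, tr, th, and td. When both directions of the grid carry meaning, HTML also provides caption, thead, tbody, and the scope attribute to spell out which headers belong to columns and which to rows. The sketch below goes slightly beyond the basics covered here, and the schedule data is invented.

```html
<table>
    <!-- caption gives the table a title that assistive technology can announce -->
    <caption>Workshop schedule (sample data)</caption>
    <thead>
        <tr>
            <th scope="col">Session</th>
            <th scope="col">Monday</th>
            <th scope="col">Tuesday</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <!-- scope="row" marks this cell as the header for its row -->
            <th scope="row">Morning</th>
            <td>HTML basics</td>
            <td>Forms</td>
        </tr>
        <tr>
            <th scope="row">Afternoon</th>
            <td>Lists and tables</td>
            <td>Semantic HTML</td>
        </tr>
    </tbody>
</table>
```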

HTML Forms

Forms are how websites collect user input - login fields, search boxes, registrations, checkouts. The three elements that do most of the work are form (the container that knows where to send data), input (the field that collects it), and label (the text that tells users what each field is for). Every input should have a connected label via matching for and id attributes - skipping labels breaks screen reader accessibility and removes the expanded click target on checkboxes and radio buttons.

html
<!-- Basic form structure -->
<form action="/submit-form" method="POST">
    <label for="name">Name:</label>
    <input type="text" id="name" name="name" required>

    <label for="email">Email:</label>
    <input type="email" id="email" name="email" required>

    <label for="message">Message:</label>
    <textarea id="message" name="message" rows="4"></textarea>

    <label for="country">Country:</label>
    <select id="country" name="country">
        <option value="usa">USA</option>
        <option value="canada">Canada</option>
        <option value="uk">UK</option>
    </select>

    <button type="submit">Send Message</button>
</form>
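The form above has no checkboxes or radio buttons, which is where a connected label makes the most visible difference: clicking the label text toggles the control. A minimal sketch follows; the /preferences URL and all field names are hypothetical.

```html
<form action="/preferences" method="POST">
    <!-- fieldset and legend group related radio buttons under one question -->
    <fieldset>
        <legend>Preferred contact method</legend>

        <!-- Radio buttons share a name so only one can be selected -->
        <input type="radio" id="contact-email" name="contact" value="email" checked>
        <label for="contact-email">Email</label>

        <input type="radio" id="contact-phone" name="contact" value="phone">
        <label for="contact-phone">Phone</label>
    </fieldset>

    <!-- Wrapping the input inside the label also connects them, no for/id needed -->
    <label>
        <input type="checkbox" name="newsletter" value="yes">
        Subscribe to the newsletter
    </label>

    <button type="submit">Save</button>
</form>
```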

Semantic HTML

Semantic HTML means using elements that describe what the content is, not just how to display it. Before HTML5 most pages were built with div elements everywhere - a coworker once described my early code as a div lasagna, which was accurate. HTML5 gave us header, nav, main, article, section, aside, and footer, which tell browsers, screen readers, and search engines what role each chunk of content plays. The visual difference for a sighted user is zero, but the accessibility and SEO benefit is real.

html
<!-- Semantic HTML structure -->
<header>
    <h1>Website Title</h1>
    <nav>
        <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/about">About</a></li>
            <li><a href="/contact">Contact</a></li>
        </ul>
    </nav>
</header>

<main>
    <article>
        <header>
            <h2>Article Title</h2>
            <p>Published on <time datetime="2023-05-15">May 15, 2023</time></p>
        </header>

        <section>
            <h3>Introduction</h3>
            <p>Article content goes here...</p>
        </section>

        <aside>
            <h4>Related Content</h4>
            <p>Additional information related to the article</p>
        </aside>
    </article>
</main>

<footer>
    <p>&copy; 2023 Company Name. All rights reserved.</p>
</footer>

HTML5 Features

HTML5 ended the era of needing Flash for multimedia - video, audio, and canvas are all native now and work without plugins. The input types added in HTML5 are underused: type email validates format and shows the email keyboard on mobile, type date gives a date picker, type range gives a slider - all free without JavaScript. Canvas is the one that surprises people most: it's a blank drawing surface that JavaScript can paint on, which is enough to build games and charts entirely in the browser.

html
<!-- HTML5 audio element -->
<audio controls>
    <source src="audio.mp3" type="audio/mpeg">
    Your browser does not support the audio element.
</audio>

<!-- HTML5 video element -->
<video width="320" height="240" controls>
    <source src="movie.mp4" type="video/mp4">
    Your browser does not support the video tag.
</video>

<!-- HTML5 canvas element -->
<canvas id="myCanvas" width="200" height="100"></canvas>

<!-- HTML5 form input types -->
<input type="email" placeholder="Enter your email">
<input type="date">
<input type="color" value="#ff0000">
<input type="range" min="0" max="100" value="50">
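On its own the canvas element is an empty rectangle; the drawing happens in JavaScript. The page below is a minimal sketch of that idea, reusing the myCanvas id from the example above. The shapes, colors, and text are arbitrary choices for illustration.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Canvas sketch</title>
</head>
<body>
    <canvas id="myCanvas" width="200" height="100">
        Your browser does not support the canvas element.
    </canvas>

    <script>
        // Look up the canvas and ask for its 2D drawing context
        const canvas = document.getElementById("myCanvas");
        const ctx = canvas.getContext("2d");

        // Filled rectangle: x, y, width, height
        ctx.fillStyle = "steelblue";
        ctx.fillRect(10, 10, 80, 50);

        // Circle outline: center x, center y, radius, start angle, end angle
        ctx.beginPath();
        ctx.arc(150, 50, 30, 0, Math.PI * 2);
        ctx.stroke();

        // Text drawn directly onto the canvas
        ctx.font = "14px sans-serif";
        ctx.fillStyle = "black";
        ctx.fillText("Hello, canvas", 10, 90);
    </script>
</body>
</html>
```

Saving this as an .html file and opening it in a browser should show a blue rectangle, a circle outline, and a line of text, all drawn by the script rather than by markup.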

Validation and Best Practices

Validating HTML means running it through a tool like the W3C validator to find structural errors. Browsers are very forgiving and silently repair a lot of broken HTML, which makes validation feel unnecessary - but the repaired structure isn't always the one you intended, and screen readers and search engines work from whatever structure the parser ended up building. The habits that matter most: always include the full head with charset and viewport meta tags, use semantic elements instead of divs where something more specific fits, write descriptive alt text on every image that conveys information, put scripts at the bottom of the body or use the defer attribute, and write title tags carefully since they're the text that shows up in search results and browser tabs.

html
<!-- Example of well-structured HTML -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="Free HTML tutorial for beginners">
    <title>Complete HTML Tutorial</title>
    <link rel="stylesheet" href="styles.css">
</head>
<body>
    <header>
        <h1>HTML Tutorial</h1>
    </header>

    <main>
        <article>
            <h2>Introduction to HTML</h2>
            <p>HTML is the standard markup language for creating web pages.</p>
        </article>
    </main>

    <footer>
        <p>&copy; 2023 WebDev Education</p>
    </footer>

    <script src="script.js"></script>
</body>
</html>

Frequently Asked Questions

What is the difference between HTML and XHTML?

XHTML is a stricter XML-based version of HTML that requires every tag to be closed, attributes to be quoted, and element names to be lowercase. HTML5 is more forgiving about syntax errors and is what virtually everyone writes today. XHTML was the direction the web was heading in the early 2000s but HTML5 won, and XHTML is rarely written now except in legacy codebases.
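The syntax difference is easiest to see side by side. In the sketch below the same two elements are written first in the looser HTML5 style and then in XHTML style; the field name q is a placeholder.

```html
<!-- HTML5 style: void elements need no slash, boolean attributes need no value -->
<input type="text" name="q" required>
<br>

<!-- XHTML style: every element explicitly closed, every attribute given a quoted value -->
<!-- (HTML5 parsers accept this form too) -->
<input type="text" name="q" required="required" />
<br />
```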

Why should I use semantic HTML?

Semantic HTML tells browsers, screen readers, and search engines what your content means rather than just how to display it. A div and an article element both hold content, but the article signals to search engines that it's a self-contained piece of content, and tells screen reader users they're in an article they can navigate. It also makes your own code much easier to read when you return to it later.

What are void elements in HTML?

Void elements are HTML elements that can't contain content and don't need or have a closing tag. The ones you'll use constantly are img, br, hr, input, meta, and link. Writing a closing tag on a void element is technically invalid HTML - browsers will usually handle it but it's worth getting right.

How do I make my HTML accessible?

The highest-impact things: write descriptive alt text on every image, use semantic elements instead of generic divs, connect every label to its form input with matching for and id attributes, make sure all functionality is reachable by keyboard, and test with an actual screen reader. ARIA attributes handle more complex interactive components but for most standard page content, correct semantic HTML gets you most of the way there.
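Two of those points are sketched below: alt text that describes what the image actually shows, and a native button compared with a div styled to look like one. The file name, figures, and class name are invented for illustration.

```html
<!-- Descriptive alt text says what the image shows, not just "chart" -->
<img src="visitors.png" alt="Bar chart of monthly visitors rising from 2,000 in January to 5,000 in June">

<!-- A real button can be reached with Tab and activated with Enter or Space by default -->
<button type="button">Show details</button>

<!-- A div styled as a button gets none of that keyboard support for free -->
<div class="fake-button">Show details</div>
```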

What is the purpose of the DOCTYPE declaration?

DOCTYPE html tells the browser to render in standards mode rather than quirks mode. Quirks mode is a legacy compatibility setting that mimics old Internet Explorer behavior - CSS box models work differently, and the same code produces different results in different browsers. One line at the top of every file prevents all of that.

Can I create an HTML file without the full structure?

Browsers will try to render content even without proper html, head, and body structure - this forgiving behavior is intentional. But a page missing pieces like the DOCTYPE or title is invalid HTML and may render differently across browsers, have poor SEO without proper title and meta tags, and fail accessibility checks. Always use the full structure. It costs nothing and prevents a category of subtle problems.

What happens if I put a visible element like a p tag inside the head?

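It usually does get rendered, just not from where you wrote it. The head section is only for metadata, so when the HTML parser meets a p tag there it implicitly closes the head and starts the body at that point. The paragraph shows up at the top of the page, and any metadata written after it (a later meta or link, for example) ends up inside the body instead of the head. The result is invalid HTML and the kind of problem that's easy to miss because the page still appears to work.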
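A sketch of the implicit-close case, assuming a parser that follows the HTML standard's parsing rules. The first part is the markup as written; the comment after it describes roughly the tree the browser builds from it. The page content is invented.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <title>Misplaced paragraph</title>
    <p>This paragraph is in the wrong place.</p>
    <meta name="description" content="Sample page">
</head>
<body>
    <h1>Heading</h1>
</body>
</html>

<!--
Roughly the structure the parser builds:

html
  head
    title
  body
    p      (the head was closed as soon as the p tag appeared)
    meta   (now inside the body, not the head)
    h1
-->
```

Opening the page and inspecting it in the browser's developer tools is a quick way to check this on your own markup.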

Why is JavaScript usually placed at the bottom of the body?

Browsers process HTML top to bottom. A plain script tag in the head makes the browser stop parsing while the script downloads and runs, so nothing below it can appear until it finishes, which makes the page feel slow. Putting scripts just before the closing body tag lets the visible content be parsed first. The defer attribute on an external script (one with a src) in the head achieves a similar result without moving the script: the file downloads in parallel and runs once the document has been parsed.
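Both approaches are sketched in one page below. The file names app.js and widgets.js are hypothetical; a real page would normally pick one approach per script rather than mixing them.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Script loading sketch</title>

    <!-- Option 1: external script in the head with defer. -->
    <!-- It downloads while the page is parsed and runs once parsing is done. -->
    <script src="app.js" defer></script>
</head>
<body>
    <h1>Content appears without waiting for scripts</h1>
    <p>Neither script below or above blocks this paragraph from being parsed.</p>

    <!-- Option 2: script just before the closing body tag. -->
    <!-- Everything above it has already been parsed when it runs. -->
    <script src="widgets.js"></script>
</body>
</html>
```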

Do I need to close all HTML tags?

Most elements need a closing tag - div, a, h1 through h6, and so on. The exceptions are void elements like img, br, hr, input, meta, and link, which have no content to wrap and don't need closing tags. A handful of elements such as p and li technically have optional closing tags, but closing them explicitly is the safer habit. Browsers will try to repair other unclosed tags, but it's invalid HTML and the browser's guesses aren't always what you intended.

What text editor should I use to write HTML?

VS Code is the most widely used and is free. It has HTML syntax highlighting, tag auto-completion, bracket matching, and an integrated terminal. The Live Server extension shows changes in the browser instantly, and Prettier handles auto-formatting. Notepad technically works but you'll be much faster in a proper code editor.