HTML Tutorial

Introduction to HTML

HTML - which stands for HyperText Markup Language - is the thing that every single webpage on the internet is built on, and I remember the first time I actually understood what it was doing I felt a bit stupid because it's genuinely not complicated once it clicks. It's a markup language, meaning you're just wrapping text in tags to tell the browser what each piece of content is, kind of like those sticky tabs your teacher put on folders in primary school to tell you what's inside - the tag is just a label the browser reads and acts on. The "hyper" in HyperText just means the text can link to other text, which still felt pretty revolutionary when HTML first appeared in 1991 and somehow remains the backbone of everything we do online.

What is HTML?

  • HyperText - Text that contains links to other texts, not limited to linear reading
  • Markup Language - Uses tags to annotate text within a document to define structure and layout
  • HTML Document - File with .html extension containing HTML code that browsers can render

Basic HTML Document Structure

Every HTML file has the same skeleton, and I genuinely think learning it first saves you so much pain later - I skipped this part when I was starting out and spent a week wondering why my pages looked broken in certain browsers. The whole thing starts with DOCTYPE, then your html tag, then head and body, and that's basically it. The head tag is for stuff the browser needs to know about but doesn't actually show on the page - like the title that appears on the browser tab, or the character encoding, which is easy to forget until your apostrophes start showing up as weird symbols. Body is where everything visible goes. Think of the head and body like the back of a restaurant versus the dining room - customers only see one of them but both matter.

html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Document Title</title>
</head>
<body>
    <!-- Content goes here -->
    <h1>My First Heading</h1>
    <p>My first paragraph.</p>
</body>
</html>

Key Structural Elements

  • <!DOCTYPE html> - Declares the document as HTML5 and tells the browser to render it in standards mode
  • <html> - Root element that wraps all content on the page
  • <head> - Contains meta-information about the document
  • <body> - Contains all visible content of the web page

HTML Elements and Tags

Okay so this is where people get a little tangled up and honestly I was one of them. An HTML element is the whole thing - the opening tag, the content inside, and the closing tag together - not just the tag itself, even though everyone including me says 'tag' when they mean 'element' and nobody seems to care enough to correct it. The opening tag starts the element and the closing tag ends it, and the only difference between the two is the forward slash before the name. I once spent 45 minutes debugging a page where a paragraph wasn't closing properly because I typed <p/> instead of </p> like some kind of confused person. Then there are void elements (often loosely called self-closing elements) like img and br that can't have any content inside them and therefore don't take a closing tag at all - if you're wondering why images don't need closing tags, it's because there's nothing to wrap around.

html
<!-- Element with opening and closing tags -->
<p>This is a paragraph element</p>

<!-- Void element (no closing tag) -->
<img src="image.jpg" alt="Description">

<!-- Nested elements -->
<div>
    <h2>Section Heading</h2>
    <p>Paragraph inside a div</p>
</div>

Element Structure

  • Opening Tag - Marks the beginning of an element: <tagname>
  • Closing Tag - Marks the end of an element: </tagname>
  • Content - The actual content between the opening and closing tags
  • Element - The complete set: opening tag + content + closing tag
  • Void (Empty) Elements - Elements that cannot have content and therefore take no closing tag
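
To make the void element idea concrete, here is a minimal sketch. The file name and the field name are placeholders made up for illustration. It also shows why a stray slash on a normal element doesn't help: on a non-void element like p, the browser treats <p/> as a plain opening tag.

```html
<!-- Void elements: they cannot hold content, so there is no closing tag -->
<p>First line<br>Second line</p>
<hr>
<input type="text" name="nickname">
<img src="logo.png" alt="Site logo">

<!-- A trailing slash is allowed on void elements but changes nothing -->
<br />

<!-- On a non-void element the slash is ignored: this only OPENS a paragraph -->
<p/>This text ends up inside the paragraph that was just opened.</p>
```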

HTML Attributes

Attributes give elements the extra information they need to actually do something useful - an anchor tag without an href attribute is just text, it doesn't go anywhere, which is a bit useless for a link. They always go inside the opening tag, usually come in name="value" pairs (boolean attributes like required are just the name on its own), and the order doesn't matter, which I found weirdly liberating when I learned it. The ones you'll use constantly are href for links, src for images, alt for image descriptions (please use these - screen readers depend on them and also Google reads them), id for targeting specific elements, and class for grouping elements together so you can style them. I used to wonder why the attribute for links is called "href" and not just "url" - it's short for "hypertext reference", and the web is full of naming decisions made in the early 90s that we're all just living with now.

html
<!-- Attributes provide additional information -->
<a href="https://example.com" title="Visit Example">Click Here</a>

<img src="photo.jpg" alt="A beautiful landscape" width="500" height="300">

<div class="container" id="main-content" data-category="tutorial">
    Content here
</div>

Common Attributes

  • href - Specifies the URL for links
  • src - Specifies the source path for images and media
  • alt - Provides alternative text for images
  • id - Specifies a unique identifier for an element
  • class - Specifies one or more class names for an element
  • style - Specifies inline CSS styles for an element
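
The list above mentions style and multiple class names, neither of which the earlier example shows, so here is a rough sketch of both. It also includes a boolean attribute, which is written as a bare name with no value. The class names and text are invented for the example.

```html
<!-- style: inline CSS applied to this one element only -->
<p style="color: navy; font-weight: bold;">Styled with an inline style attribute.</p>

<!-- class can hold several names separated by spaces -->
<div class="card featured">A div with two classes</div>

<!-- Boolean attribute: its presence alone switches the behaviour on -->
<button type="button" disabled>Can't click me</button>
```

Inline styles are handy for quick experiments, but for anything you plan to keep, a class plus a stylesheet is usually easier to maintain.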

Headings and Paragraphs

Headings in HTML go from h1 to h6 and they're not just about making text bigger - they create a document hierarchy that screen readers and search engines actually navigate through, which means if you're using h2s and h3s randomly because they look a certain size, you're doing it wrong and so was I for my first several months of building pages. The h1 is your main page title and there should really only be one of them per page, and then h2s are your major sections, h3s are sub-sections within those, and honestly most pages never need to go past h3. Paragraphs are the p tag and there's not much to say about them except they add automatic spacing above and below which is genuinely nice, and if you want to style that spacing differently you'll do it in CSS anyway. The spec actually says paragraph elements can't contain other block elements inside them, which is one of those rules that browsers will silently fix for you and you'll never know you broke it.

html
<!-- Headings define hierarchy -->
<h1>Main Title (Most Important)</h1>
<h2>Subheading</h2>
<h3>Section Heading</h3>
<h4>Sub-section Heading</h4>
<h5>Minor Heading</h5>
<h6>Least Important Heading</h6>

<!-- Paragraphs for text content -->
<p>This is a paragraph of text. It can contain multiple sentences and will be displayed as a block of text with some default spacing.</p>

<p>Another paragraph with more content. Paragraphs are essential for organizing textual content on web pages.</p>
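
As a sketch of what a sensible hierarchy looks like on a real page (the bread topic is just an invented example), the outline below uses one h1, h2s for the major sections, and h3s nested under the h2 they belong to. The last few lines illustrate the paragraph rule: putting a block element inside a p is invalid, and the browser quietly restructures it.

```html
<!-- One h1, h2s for major sections, h3s nested under their h2 -->
<h1>Baking Bread at Home</h1>

<h2>Ingredients</h2>
<h3>Dry ingredients</h3>
<h3>Wet ingredients</h3>

<h2>Method</h2>
<h3>Mixing</h3>
<h3>Proofing</h3>

<!-- Invalid: a block element (div) inside a paragraph -->
<p>Intro text <div>Boxed note</div></p>
<!-- The parser closes the p before the div, roughly as if you had written: -->
<!-- <p>Intro text </p><div>Boxed note</div><p></p> -->
```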

Text Formatting Tags

This one trips people up because some of these tags look like they do the same thing but they technically don't. The b and strong tags both make text bold visually, but strong carries semantic weight - it means the text is actually important, not just visually emphasized - and screen readers will sometimes announce strong text differently, so it's worth using the right one. Same deal with i and em - they both italicize but em means emphasis. I used b and i for everything for a long time and my code still worked fine, which is sort of the frustrating thing about HTML, it lets you get away with being a bit wrong. Sub and sup are genuinely useful for science content and footnotes, mark is that yellow highlight tag that nobody really uses but exists, and del and ins are for showing changes in a document kind of like tracked changes - I once used them to show price changes on a product page and it looked really clean actually.

html
<!-- Text formatting examples -->
<p>This text contains <b>bold</b>, <strong>strong</strong>, <i>italic</i>, and <em>emphasized</em> text.</p>

<p>You can also <mark>highlight text</mark>, show <small>smaller text</small>, or <del>delete text</del>.</p>

<p>Chemical and mathematical formulas: H<sub>2</sub>O and E = mc<sup>2</sup>.</p>

<p><ins>Inserted text</ins> and <u>underlined text</u> have different semantic meanings.</p>
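
The price-change idea mentioned above might look something like this. It's only a sketch: the prices, dates, and wording are made up, and the datetime attribute is optional.

```html
<!-- Old price struck through with del, new price marked with ins -->
<p>Price: <del>$49.99</del> <ins>$39.99</ins></p>

<!-- The optional datetime attribute records when the change was made -->
<p>Shipping:
    <del datetime="2023-05-15">5-7 days</del>
    <ins datetime="2023-05-15">2-3 days</ins>
</p>
```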

HTML Lists

Lists. There are three of them. Unordered lists are bulleted, ordered lists are numbered, and description lists are for term-definition pairs like a glossary. The one people forget exists is dl - the description list - and it's actually really good for FAQs, glossaries, and anywhere you have a label paired with an explanation, which comes up more than you'd think. The first two use li for list items, while dl uses dt for the term and dd for the definition, and yes I still look that up sometimes. Ordered lists can be customized with a start attribute if you want them to begin from a number other than 1, which sounds niche but is useful when you split a numbered list across sections. I once made a navigation menu out of an unordered list and it felt wrong for about a week before I realized that's actually the correct semantic way to do it and basically every nav on the internet is a ul underneath the styling.

html
<!-- Unordered list (bulleted) -->
<ul>
    <li>Item 1</li>
    <li>Item 2</li>
    <li>Item 3</li>
</ul>

<!-- Ordered list (numbered) -->
<ol>
    <li>First item</li>
    <li>Second item</li>
    <li>Third item</li>
</ol>

<!-- Description list -->
<dl>
    <dt>HTML</dt>
    <dd>HyperText Markup Language</dd>

    <dt>CSS</dt>
    <dd>Cascading Style Sheets</dd>
</dl>
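
Two things from the paragraph above aren't in that example, so here is a small sketch of each: an ordered list that picks up at 4 using the start attribute, and a navigation menu built from a ul. The link targets are placeholders.

```html
<!-- Ordered list continuing a numbered sequence from step 4 -->
<ol start="4">
    <li>Fourth step</li>
    <li>Fifth step</li>
</ol>

<!-- Navigation menu: a ul of links inside nav -->
<nav>
    <ul>
        <li><a href="/">Home</a></li>
        <li><a href="/blog">Blog</a></li>
    </ul>
</nav>
```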

HTML Tables

Tables are for tabular data - rows and columns of related information - and not for page layout, even though that's exactly what people used them for throughout most of the 2000s, which honestly explains why so many old websites looked like spreadsheets. The basic structure is table wrapping tr elements (table rows), and inside each row you have either th for header cells or td for regular data cells. The colspan and rowspan attributes are where it gets genuinely confusing because they change how many columns or rows a cell covers, and my first attempt at a merged-cell table had 4 different colspan errors and I had to draw it on paper before it made sense. A properly structured table should also have a thead, tbody, and optionally tfoot to divide it into sections. Browsers handle tables fine without them, but they're better for accessibility and make the code way easier to read a month later when you've forgotten what you were doing.

html
<!-- Basic table structure -->
<table border="1">
    <tr>
        <th>Name</th>
        <th>Age</th>
        <th>Country</th>
    </tr>
    <tr>
        <td>John</td>
        <td>25</td>
        <td>USA</td>
    </tr>
    <tr>
        <td>Maria</td>
        <td>32</td>
        <td>Spain</td>
    </tr>
</table>

<!-- Table with colspan and rowspan -->
<table border="1">
    <tr>
        <th colspan="2">Name</th>
        <th>Age</th>
    </tr>
    <tr>
        <td>John</td>
        <td>Doe</td>
        <td rowspan="2">25</td>
    </tr>
    <tr>
        <td>Jane</td>
        <td>Smith</td>
    </tr>
</table>
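
The thead, tbody, and tfoot sections described above could be laid out roughly like this. The items and prices are invented purely for illustration.

```html
<!-- Table divided into header, body, and footer sections -->
<table>
    <thead>
        <tr>
            <th>Item</th>
            <th>Price</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>Notebook</td>
            <td>$4</td>
        </tr>
        <tr>
            <td>Pen</td>
            <td>$1</td>
        </tr>
    </tbody>
    <tfoot>
        <tr>
            <td>Total</td>
            <td>$5</td>
        </tr>
    </tfoot>
</table>
```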

HTML Forms

Forms collect user input. There are a lot of tags involved and it can feel overwhelming but you'll end up using the same handful of them on basically every project. The form tag wraps everything and has an action (where the data goes) and a method (usually GET or POST), and then inside you've got input for text fields, email fields, checkboxes and radio buttons, textarea for multi-line text, select and option for dropdowns, and a submit button which can be an input with type="submit" or a regular button element. The label tag connects to its input via the for attribute matching the input's id, which matters for accessibility because clicking the label then focuses the input - sounds small but users actually rely on this, especially on mobile. I used to skip labels when I was learning and just put text next to the inputs and my forms technically worked but were kind of a nightmare to use. The required attribute does basic client-side validation for free which is nice, but you still need server-side validation too because anyone who knows what they're doing can just bypass browser validation.

html
<!-- Basic form structure -->
<form action="/submit-form" method="POST">
    <label for="name">Name:</label>
    <input type="text" id="name" name="name" required>

    <label for="email">Email:</label>
    <input type="email" id="email" name="email" required>

    <label for="message">Message:</label>
    <textarea id="message" name="message" rows="4"></textarea>

    <label for="country">Country:</label>
    <select id="country" name="country">
        <option value="usa">USA</option>
        <option value="canada">Canada</option>
        <option value="uk">UK</option>
    </select>

    <input type="submit" value="Send Message">
</form>
```
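
Checkboxes, radio buttons, and the button element are mentioned above but don't appear in that form, so here is a rough sketch of them. The action URL, field names, and values are all assumptions made up for the example.

```html
<form action="/subscribe" method="POST">
    <!-- Checkbox: the label's for attribute matches the input's id -->
    <input type="checkbox" id="newsletter" name="newsletter" value="yes">
    <label for="newsletter">Send me the newsletter</label>

    <!-- Radio buttons share a name, so only one of them can be selected -->
    <input type="radio" id="plan-free" name="plan" value="free" checked>
    <label for="plan-free">Free</label>
    <input type="radio" id="plan-pro" name="plan" value="pro">
    <label for="plan-pro">Pro</label>

    <!-- A button element used as the submit control -->
    <button type="submit">Subscribe</button>
</form>
```

Clicking either label text toggles its control, which is the same for/id link described above.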

Semantic HTML

Semantic HTML means using tags that actually describe what the content is, not just what it looks like, and this is one of those things that sounds fussy until you realize how much it matters for accessibility and search engines reading your page. Before HTML5 most people just slapped everything in divs - I had a coworker once who called my code a "div lasagna" because it was just divs inside divs inside divs with no meaning attached to any of it. Now we have header, nav, main, article, section, aside, and footer, all of which tell the browser and assistive technology what role each chunk of content plays on the page. The difference between article and section trips people up: an article should make sense on its own if you lifted it out of the page (like a blog post or a news story), while a section is just a thematic grouping within the page. Aside is for content that's related to the main content but not central to it, like a sidebar or a callout box, and it's one of those tags that when you start using it correctly you wonder how you went without it.

html
<!-- Semantic HTML structure -->
<header>
    <h1>Website Title</h1>
    <nav>
        <ul>
            <li><a href="/">Home</a></li>
            <li><a href="/about">About</a></li>
            <li><a href="/contact">Contact</a></li>
        </ul>
    </nav>
</header>

<main>
    <article>
        <header>
            <h2>Article Title</h2>
            <p>Published on <time datetime="2023-05-15">May 15, 2023</time></p>
        </header>

        <section>
            <h3>Introduction</h3>
            <p>Article content goes here...</p>
        </section>

        <aside>
            <h4>Related Content</h4>
            <p>Additional information related to the article</p>
        </aside>
    </article>
</main>

<footer>
    <p>&copy; 2023 Company Name. All rights reserved.</p>
</footer>

HTML5 Features

Okay HTML5 is genuinely cool and I think it doesn't get talked about enough considering how much it changed what you could do without reaching for a plugin or a JavaScript library. Before HTML5 you needed Flash or third-party plugins to embed video and audio, which was a whole thing, and now you just write a video tag and point it at an mp4 and it works. HTML5 came out officially in 2014 but browsers were implementing pieces of it years earlier which is very on-brand for the web. The canvas element is the one that surprises people the most - it's basically a blank rectangle you can draw on with JavaScript, and people have built games, data visualizations, and image editors with just canvas. The new input types are also genuinely useful and underused: type="email" validates email format for free, type="date" gives you a date picker, type="color" gives you a color picker, and type="range" gives you a slider, all without writing a single line of JavaScript or importing any library, which feels like magic the first time you try it.

html
<!-- HTML5 audio element -->
<audio controls>
    <source src="audio.mp3" type="audio/mpeg">
    Your browser does not support the audio element.
</audio>

<!-- HTML5 video element -->
<video width="320" height="240" controls>
    <source src="movie.mp4" type="video/mp4">
    Your browser does not support the video tag.
</video>

<!-- HTML5 canvas element -->
<canvas id="myCanvas" width="200" height="100"></canvas>

<!-- HTML5 form input types -->
<input type="email" placeholder="Enter your email">
<input type="date" value="2023-05-15">
<input type="color" value="#ff0000">
<input type="range" min="0" max="100" value="50">
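
A canvas on its own is just an empty rectangle, so here is a minimal sketch of drawing on one with JavaScript. The id demoCanvas, the colors, and the coordinates are arbitrary choices for the example; getContext, fillRect, and fillText are standard parts of the canvas 2D API.

```html
<canvas id="demoCanvas" width="200" height="100"></canvas>

<script>
    // Look up the canvas and ask for its 2D drawing context
    const canvas = document.getElementById("demoCanvas");
    const ctx = canvas.getContext("2d");

    // Draw a filled blue rectangle: x, y, width, height
    ctx.fillStyle = "#3366cc";
    ctx.fillRect(20, 20, 120, 60);

    // Draw white text on top of the rectangle
    ctx.fillStyle = "#ffffff";
    ctx.font = "16px sans-serif";
    ctx.fillText("Hello canvas", 30, 55);
</script>
```

The script sits after the canvas element so that the element already exists when getElementById runs.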

Validation and Best Practices

Validating your HTML means running it through something like the W3C validator to check for structural errors, and I'll be honest I didn't do this for years because my pages looked fine in Chrome so I assumed they were fine - but browsers are incredibly forgiving and will silently repair a lot of broken HTML, and the repaired version isn't always what you meant or what assistive technology needs, so validation catches things you'd never spot visually. Best practices basically come down to: always declare your charset (UTF-8) and viewport meta tags, use semantic elements instead of generic divs wherever something more specific fits, write descriptive alt text for every image, use lowercase for all tags and attributes (technically you don't have to but everyone does and it's more readable), and keep your JavaScript at the bottom of the body or use the defer attribute so it doesn't block the page from loading. The meta description tag in your head section is worth writing carefully because that's often what shows up as the snippet under your page title in Google results - it doesn't directly affect your ranking but it does affect whether people click.

html
<!-- Example of well-structured HTML -->
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta name="description" content="Free HTML tutorial for beginners">
    <meta name="keywords" content="HTML, CSS, JavaScript, web development">
    <meta name="author" content="WebDev Team">
    <title>Complete HTML Tutorial</title>
    <link rel="stylesheet" href="styles.css">
</head>
<body>
    <header>
        <h1>HTML Tutorial</h1>
    </header>

    <main>
        <article>
            <h2>Introduction to HTML</h2>
            <p>HTML is the standard markup language for creating web pages.</p>
        </article>
    </main>

    <footer>
        <p>&copy; 2023 WebDev Education</p>
    </footer>

    <script src="script.js"></script>
</body>
</html>
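
The example above puts the script at the bottom of the body. The other option mentioned, the defer attribute, would look roughly like this, assuming the same hypothetical script.js file.

```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Deferred script example</title>
    <!-- defer: the file downloads while the page is parsed and runs once parsing finishes -->
    <script src="script.js" defer></script>
</head>
<body>
    <p>This content is parsed without waiting for script.js to run.</p>
</body>
</html>
```

Either approach keeps the script from blocking the page. Which one you pick is mostly a matter of where you prefer your script tags to live.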

Frequently Asked Questions

What is the difference between HTML and CSS?

HTML handles structure and content - it tells the browser what's on the page. CSS handles presentation - it tells the browser what everything looks like. You write HTML to say 'this is a heading' and CSS to say 'that heading should be blue and 32 pixels tall'. They're separate files and separate concerns and keeping them that way makes both easier to change later.

What does DOCTYPE html mean?

The DOCTYPE declaration tells the browser to render the page in standards mode using HTML5. Without it browsers fall back to quirks mode, which is basically a compatibility setting that makes them behave like old browsers from the late 90s - pages can render differently and some CSS stops working correctly. Always put it on the very first line.

Why do HTML tags need to be closed?

Elements need closing tags so the browser knows where they end. If you open a b tag and never close it, everything after that point becomes bold, which is one of those bugs that makes you feel like you're losing your mind. Browsers will often guess where you meant to close something, but their guess isn't always what you intended and older browsers don't all guess the same way.
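
Here is a small sketch of that runaway-bold bug, with made-up text. The b element is opened in the first paragraph and never closed, so the browser keeps the bold formatting going in what follows.

```html
<!-- The <b> below is never closed, so the bold carries into what follows -->
<p>This is <b>important, and so is the rest of this sentence.</p>
<p>This paragraph usually renders bold too.</p>

<!-- Fix: close the element where the emphasis should end, i.e. <b>important</b> -->
```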

What HTML tags don't need closing tags?

Void elements don't have closing tags because they can't contain any content. The ones you'll see constantly are img, br, hr, input, meta, and link. They're self-contained - there's nothing to wrap around, so there's no need for a closing tag. In older HTML you'd sometimes see them written with a trailing slash like <br /> but in HTML5 that slash is optional and most people just write <br>.

Why is my HTML not showing in the browser?

Most of the time it's a file path issue - the browser can't find the file, especially with images or linked stylesheets. Check that your file is actually saved with a .html extension (not .html.txt which is a text editor trap), make sure your image paths match the actual folder structure exactly including capitalization, and open the browser developer tools console to see if there are any errors being reported.

Why should I use semantic HTML?

Semantic HTML tells browsers, screen readers, and search engines what your content actually means, not just how it looks. A div with text in it and an article element with the same text look identical in a browser but one tells Google it's an article worth indexing and tells a screen reader user it's a standalone piece of content. It also makes your code way easier to read when you come back to it six months later.

Can I learn HTML in a day?

You can learn enough HTML to build a real page in a few hours - the core tags are not complicated and there aren't that many of them. What takes longer is developing the instincts for structure, knowing which semantic element fits a situation, and understanding how HTML, CSS, and JavaScript work together. But basic HTML for beginners is genuinely one of the more approachable starting points in web development.

What is the difference between HTML and XHTML?

XHTML is a stricter XML-based version of HTML that requires every tag to be properly closed, every attribute to be quoted, and all element names to be lowercase. HTML is more forgiving - browsers will fix a lot of your mistakes silently. XHTML was a direction the web was heading in the early 2000s but HTML5 basically won and XHTML isn't really what anyone's writing today.