Special Characters and HTML Encoding Standards
Sometimes, you need to display characters that have a special meaning in HTML syntax (like < or >), or characters that are not on a standard keyboard (like the copyright symbol © or arrows). To handle this, HTML uses Character Entities.
1. Why Character Entities?
If you write <p>This is a <p> tag.</p> inside an HTML document, the browser will not display the second <p> as text. It treats it as the start of a new paragraph, which also implicitly closes the first one, so the literal text "<p>" never appears on the page.
To fix this, you must replace reserved characters with their respective character entity names or numbers.
- Format: All entities start with an ampersand (&) and end with a semicolon (;).
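As a minimal sketch of the fix, the paragraph from the example above can be written with entities in place of the angle brackets. The first line uses entity names; the other two show the same characters as numeric references, assuming the standard code points 60 and 62 for < and >.

```html
<!-- Named entities: the browser displays "This is a <p> tag." -->
<p>This is a &lt;p&gt; tag.</p>

<!-- Decimal numeric references for the same characters -->
<p>This is a &#60;p&#62; tag.</p>

<!-- Hexadecimal numeric references -->
<p>This is a &#x3C;p&#x3E; tag.</p>
```

All three paragraphs should render identically. Entity names are usually easier to read, while numeric references work for any character, including ones that have no name.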
2. Common HTML Character Entities
Here are the most frequently used entities in web development:
| Real Character | Entity Name | Description |
| < | &lt; | Less Than (Opening Bracket) |
| > | &gt; | Greater Than (Closing Bracket) |
| & | &amp; | Ampersand |
| " | &quot; | Double Quotation Mark |
| ' | &apos; | Single Quotation Mark |
| (space) | &nbsp; | Non-Breaking Space |
| © | &copy; | Copyright |
| ® | &reg; | Registered Trademark |
| ™ | &trade; | Trademark |
| ¥ | &yen; | Yen Currency Symbol |
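To see how a few of these entities might look in practice, here is a short sketch using &amp;amp;, &amp;lt;, &amp;yen;, &amp;copy;, &amp;trade;, and &amp;quot;. The company name, price, and image file name are placeholders invented for illustration.

```html
<!-- Displays: Fish & Chips costs < ¥1000 -->
<p>Fish &amp; Chips costs &lt; &yen;1000</p>

<!-- Displays: © Example Co.™ (placeholder company name) -->
<p>&copy; Example Co.&trade;</p>

<!-- Inside a double-quoted attribute value, write quotes as &quot; -->
<img src="photo.jpg" alt="The &quot;big&quot; picture">
```

The last line shows where &amp;quot; matters most: a raw double quote inside a double-quoted attribute would end the attribute value early.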
Non-Breaking Space (&nbsp;)
Browsers automatically collapse multiple spaces inside your HTML code into a single space. If you want to force multiple spaces, use &nbsp;:
<p>Space&nbsp;&nbsp;&nbsp;between words.</p>
(Tip: Avoid overusing &nbsp; for layout spacing. Use CSS padding and margin instead!)
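The tip above can be sketched as follows. The first paragraph shows a case where a non-breaking space is arguably the right tool (keeping a number and its unit on the same line); the second uses a CSS margin for purely visual spacing. The 2em value and the inline style attribute are arbitrary choices for this illustration; a real page would more likely use a stylesheet.

```html
<!-- &nbsp; keeps "10" and "km" together on one line -->
<p>The trail is 10&nbsp;km long.</p>

<!-- For visual spacing, a CSS margin replaces a run of &nbsp; entities -->
<p>Space <span style="margin-left: 2em;">between</span> words.</p>
```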
3. Character Encoding Standards: UTF-8
Computers store characters as binary numbers. To tell the browser how to translate those numbers back into human-readable characters, you must specify a character encoding, commonly called a Character Set (Charset).
In the past, there were many conflicting charsets, leading to broken character rendering (known as "Mojibake"). Today, the universal standard is UTF-8.
Why UTF-8?
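- Can encode every character in Unicode, which covers virtually all of the world's languages, symbols, and emojis.
- Backwards compatible with ASCII: any valid ASCII text is also valid UTF-8.
- Supported by every modern browser, operating system, and web server, so the same page renders correctly for users worldwide.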
How to set UTF-8:
Place this <meta> tag inside your page's <head> block:
<head>
<meta charset="UTF-8">
</head>
Make sure this is the very first tag inside the <head>, so the browser knows the encoding before it parses any other content. Browsers only look for this declaration within the first 1024 bytes of the document.
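Putting the pieces together, a complete page might look like the sketch below. The title and body text are placeholders, and the example assumes the file itself is saved as UTF-8 in your editor; the <meta> tag only declares the encoding, it does not convert the file.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Encoding declaration comes first, before any text such as the title -->
  <meta charset="UTF-8">
  <title>Café menu</title>
</head>
<body>
  <!-- With UTF-8, these characters can be typed directly -->
  <p>Crème brûlée costs ¥500 ☕</p>
  <!-- Reserved characters still need entities -->
  <p>Use &lt;br&gt; for a line break.</p>
</body>
</html>
```

Note that UTF-8 removes the need for entities like &amp;copy; or &amp;yen; (you can type © and ¥ directly), but the reserved characters <, >, and & should still be escaped.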
Congratulations! You have completed the HTML Basics chapter. You now know how to structure pages, work with text, format lists, embed media, and link pages together.
In the next chapter, we will learn about Forms & Inputs to handle interactive user submissions.