Who is this guide for?

This guide is designed for beginner-level users and takes about 2 minutes to read.

Encoding Explained: UTF-8, ASCII, Base64, and URL Encoding

Understand character encodings, binary-to-text encoding, and URL encoding to prevent data corruption and bugs.

Text and Data Encoding

Encoding confusion causes garbled text, broken URLs, corrupted data, and security vulnerabilities. Understanding the purpose of each encoding type prevents these issues.

Character Encoding: ASCII and UTF-8

ASCII maps 128 characters (English letters, digits, punctuation, and control codes) to the numbers 0-127, using 7 bits per character. UTF-8 extends this to every Unicode code point (roughly 150,000 characters are assigned so far) using 1-4 bytes each. Every ASCII text is also valid UTF-8, byte for byte. The reverse is not true: UTF-8 text containing non-ASCII characters is not valid ASCII. Always use UTF-8 for new projects.
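The relationship between ASCII and UTF-8 can be checked directly. The following is a minimal Python sketch; the sample strings and variable names are illustrative choices, not part of any particular tool.

```python
text_ascii = "Hello"
text_mixed = "caf\u00e9"  # "café", with a precomposed é (U+00E9)

# ASCII-only text produces identical bytes under both encodings.
same_bytes = text_ascii.encode("ascii") == text_ascii.encode("utf-8")
print(same_bytes)  # True

# A non-ASCII character takes more than one byte in UTF-8.
utf8_bytes = text_mixed.encode("utf-8")
print(utf8_bytes)                        # b'caf\xc3\xa9'
print(len(text_mixed), len(utf8_bytes))  # 4 5

# The same text cannot be represented in ASCII at all.
try:
    text_mixed.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```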

UTF-8 vs UTF-16 vs UTF-32

UTF-8 uses variable-width encoding (1-4 bytes per code point): efficient for ASCII-heavy text such as English, less so for CJK text (3 bytes for most characters). UTF-16 uses 2 or 4 bytes: efficient for CJK text, wasteful for ASCII. UTF-32 uses exactly 4 bytes per code point: simplest to index but wasteful of space. Web standard: UTF-8. Windows internals: UTF-16. In-memory processing that needs fixed-width code points: UTF-32.
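To see the size trade-offs concretely, the sketch below measures the same strings in all three encodings. The helper name encoded_sizes is hypothetical, and the little-endian codec variants are used only so that no byte order mark is added to the counts.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte length of text in UTF-8, UTF-16, and UTF-32 (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }

print(encoded_sizes("hello"))   # {'utf-8': 5, 'utf-16': 10, 'utf-32': 20}
print(encoded_sizes("日本語"))   # {'utf-8': 9, 'utf-16': 6, 'utf-32': 12}
print(encoded_sizes("😀"))      # {'utf-8': 4, 'utf-16': 4, 'utf-32': 4}
```

The emoji lies outside the Basic Multilingual Plane, so it needs 4 bytes in every encoding: UTF-16 represents it as a surrogate pair.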

Base64 Encoding

Base64 converts binary data to ASCII text using 64 characters (A-Z, a-z, 0-9, +, /), plus = as padding. It's used to embed binary data in text-only contexts: email attachments (MIME), data URIs in HTML/CSS, and JWT payloads (which use the Base64url variant). Base64 increases data size by approximately 33%, because every 3 input bytes become 4 output characters. The Base64url variant replaces + with - and / with _ for URL safety, and often omits the padding.
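A short Python sketch using the standard base64 module illustrates the alphabet difference, the size overhead, and padding. The input bytes are arbitrary values picked here so that the output contains + and /.

```python
import base64

# Three bytes chosen so every 6-bit group maps to '+' (62) or '/' (63).
raw = bytes([0xFB, 0xFF, 0xFE])

standard = base64.b64encode(raw)
url_safe = base64.urlsafe_b64encode(raw)
print(standard)  # b'+//+'
print(url_safe)  # b'-__-'

# Every 3 input bytes become 4 output characters (about 33% larger).
print(len(base64.b64encode(bytes(300))))  # 400

# Inputs that are not a multiple of 3 bytes are padded with '='.
print(base64.b64encode(b"a"))   # b'YQ=='
print(base64.b64encode(b"ab"))  # b'YWI='

# Decoding restores the original bytes.
print(base64.b64decode(standard) == raw)  # True
```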

URL Encoding (Percent Encoding)

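Special characters in URLs are encoded as %XX, where XX is the hexadecimal value of a byte: space becomes %20, & becomes %26. Non-ASCII characters are first encoded as UTF-8, then each byte is percent-encoded, so é becomes %C3%A9. This prevents special characters from being interpreted as URL syntax. Over-encoding unreserved characters (letters, digits, -, ., _, ~) is usually harmless but makes URLs ugly; encoding a reserved character that is acting as a delimiter, such as / or ?, changes the URL's meaning. Under-encoding causes parsing errors and potential security issues.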
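Percent encoding can be tried out with Python's standard urllib.parse module. This is a minimal sketch with made-up sample strings. Note that quote leaves / unencoded by default, and that urlencode follows the HTML form convention of writing spaces as +.

```python
from urllib.parse import quote, unquote, urlencode

# Space and & are replaced by their hex byte values.
print(quote("a b&c"))  # a%20b%26c

# Non-ASCII text is percent-encoded byte by byte from its UTF-8 form.
print(quote("caf\u00e9"))  # caf%C3%A9

# By default '/' is treated as safe; pass safe="" to encode it as well.
print(quote("/path/to file"))           # /path/to%20file
print(quote("/path/to file", safe=""))  # %2Fpath%2Fto%20file

# Query strings: urlencode handles key/value pairs (spaces become '+').
print(urlencode({"q": "tea & coffee", "page": 2}))  # q=tea+%26+coffee&page=2

# Decoding reverses the process.
print(unquote("a%20b%26c"))  # a b&c
```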

Common Encoding Bugs

Mojibake (garbled text) means the encoding was misidentified: UTF-8 bytes interpreted as Latin-1, or vice versa. Double encoding (%2520 instead of %20) means the data was URL-encoded twice. Base64 padding errors (invalid length, missing = signs) usually indicate that the encoded data was truncated in transit, or that the padding was stripped, as Base64url encoders commonly do.
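Each of these bugs can be reproduced in a few lines, which helps when recognizing them in the wild. The sketch below is illustrative only; restore_padding is a hypothetical helper written for this example, not a library function, and it only helps when padding was stripped rather than when data was actually lost.

```python
import base64
import binascii
from urllib.parse import quote

# Mojibake: UTF-8 bytes decoded with the wrong codec (Latin-1).
garbled = "caf\u00e9".encode("utf-8").decode("latin-1")
print(garbled)  # cafÃ©
# If the wrong step is known, reversing it recovers the text.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café

# Double encoding: the '%' from the first pass is encoded again as %25.
once = quote("a b")
twice = quote(once)
print(once, twice)  # a%20b a%2520b

# Base64 with a missing '=' fails to decode.
encoded = base64.b64encode(b"hello")  # b'aGVsbG8='
try:
    base64.b64decode(encoded[:-1])
    truncated_fails = False
except binascii.Error:
    truncated_fails = True
print(truncated_fails)  # True

def restore_padding(data: bytes) -> bytes:
    """Hypothetical helper: re-append '=' so the length is a multiple of 4."""
    return data + b"=" * (-len(data) % 4)

print(base64.b64decode(restore_padding(b"aGVsbG8")))  # b'hello'
```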
