Base64 Encoding Explained: How It Works and When to Use It
If you have ever opened an API response, an email source, or an HTML file and seen a long block of seemingly random letters, numbers, plus signs, and slashes ending in one or two equals signs, you have already met Base64. It is one of the most widely used encoding schemes on the web, and yet it is also one of the most misunderstood. This guide explains exactly what Base64 is, how the algorithm works on a byte level, why the output is always larger than the input, and when you should and should not reach for it.
What is Base64?
Base64 is a binary-to-text encoding scheme. Its job is to take arbitrary binary data — an image, a file, an encryption key, raw bytes of any kind — and represent that data using only a small, safe set of 64 printable ASCII characters. The name comes directly from that alphabet: there are 64 symbols, so each Base64 character carries exactly 6 bits of information (because 2 to the power of 6 is 64).
The crucial point to understand up front is that Base64 is not encryption and not compression. It does not hide your data and it does not make it smaller. It simply re-packages bytes so they can travel safely through systems that were designed to handle text, such as URLs, JSON fields, email headers, and HTML attributes.
Why does Base64 exist?
Many older and even modern protocols are "text-safe" but not "binary-safe." Email, for instance, was historically built to transmit 7-bit ASCII text. If you tried to paste raw binary bytes directly into an email body, certain byte values would be interpreted as control characters, line breaks, or message terminators, corrupting the data. Base64 solves this by guaranteeing that every output character is a harmless, printable symbol that no transport layer will mangle.
You will find Base64 quietly doing this job in many places:
- Embedding small images directly in HTML or CSS using data: URLs
- Encoding email attachments via the MIME standard
- Carrying binary payloads inside JSON, which only supports text values
- Storing keys, certificates, and tokens in configuration files
- Encoding the header and payload segments of a JSON Web Token (JWT)
The 64-character alphabet
Standard Base64 uses the following 64 characters, plus = as a padding character:
- Uppercase letters A–Z (values 0–25)
- Lowercase letters a–z (values 26–51)
- Digits 0–9 (values 52–61)
- The symbols + and / (values 62 and 63)
Because + and / have special meaning inside URLs, there is a popular variant
called Base64URL that replaces them with - and _ and usually
drops the padding. This is the variant used inside JWTs.
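As a small illustration of the alphabet and the URL-safe variant, the sketch below uses Python's standard `base64` module. The three input bytes are chosen only because they happen to encode to the symbols `+` and `/`, and the helper name `b64url_decode_unpadded` is hypothetical, not part of any library.

```python
import base64
import string

# Standard alphabet in value order: A-Z (0-25), a-z (26-51), 0-9 (52-61), + (62), / (63)
STANDARD_ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)

# Three bytes picked so that the 6-bit groups are 62, 62, 63, 63
raw = bytes([0xFB, 0xEF, 0xFF])

standard = base64.b64encode(raw).decode("ascii")          # uses + and /
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")  # uses - and _


def b64url_decode_unpadded(text: str) -> bytes:
    """Decode Base64URL text whose trailing '=' padding was dropped."""
    padding = "=" * (-len(text) % 4)  # restore length to a multiple of four
    return base64.urlsafe_b64decode(text + padding)


print(standard, url_safe)
print(b64url_decode_unpadded("QQ"))
```

Python's URL-safe functions keep the padding by default, so code that wants the unpadded form has to strip and restore the `=` characters itself, as the helper does.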
How Base64 encoding works, step by step
The algorithm regroups bits. Normal bytes are 8 bits each, but Base64 characters represent 6 bits each. So Base64 takes three bytes (24 bits) of input at a time and splits those 24 bits into four groups of 6 bits, then maps each 6-bit group to a character from the alphabet.
Let us encode the three-letter word Cat.
Step 1: Convert each character to its byte value
    C = 67  = 01000011
    a = 97  = 01100001
    t = 116 = 01110100
Step 2: Join the bits into one 24-bit stream
    01000011 01100001 01110100 → 010000110110000101110100
Step 3: Re-split into four 6-bit groups
    010000 110110 000101 110100
    16     54     5      52
Step 4: Map each value to the alphabet
    16 → Q
    54 → 2
    5  → F
    52 → 0
    "Cat" → "Q2F0"
And that is the whole trick. Three input bytes always become four output characters. You can verify this
instantly with the Base64 Encoder / Decoder — paste Cat
and you will get Q2F0 back.
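The four steps above can be sketched in a few lines of Python. This is a minimal illustration that handles exactly one 3-byte block and ignores padding; the function name `encode_block` is an assumption made for this example.

```python
import string

# Standard Base64 alphabet, indexed by 6-bit value
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_block(block: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(block) != 3:
        raise ValueError("this sketch handles exactly 3 bytes")
    # Steps 1 and 2: join the three byte values into one 24-bit integer
    bits = (block[0] << 16) | (block[1] << 8) | block[2]
    # Step 3: split into four 6-bit groups, most significant group first
    groups = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Step 4: map each 6-bit value to its character in the alphabet
    return "".join(ALPHABET[g] for g in groups)


print(encode_block(b"Cat"))  # Q2F0
```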
What is the padding (=) for?
The neat 3-bytes-to-4-characters mapping only works when the input length is a multiple of three. When
it is not, Base64 pads the final group with = characters so the output length is always a
multiple of four. The rule is simple:
- If the last block has 1 leftover byte, the output ends with ==
- If the last block has 2 leftover bytes, the output ends with =
- If there are no leftover bytes, there is no padding

    "A"   → "QQ=="
    "AB"  → "QUI="
    "ABC" → "QUJD"
Why is Base64 about 33% larger?
This is the single most important practical fact about Base64. Every 3 bytes of input become 4 bytes of output. That is a 4:3 ratio, which means the encoded result is roughly 33% larger than the original (before counting any padding or line breaks).
| Input size | Base64 output size | Overhead |
|---|---|---|
| 3 bytes | 4 bytes | +33% |
| 1 KB | ~1.33 KB | +33% |
| 1 MB image | ~1.33 MB | +33% |
This is why embedding large images as Base64 data URLs is usually a bad idea: you inflate the page weight by a third and lose the ability to cache the image separately. For tiny icons it can be a net win because you save an HTTP request; for anything large, link to the real file instead.
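The 4:3 ratio can be checked with a short calculation. The sketch below assumes padded output with no line breaks; the function names are illustrative, and the sizes are arbitrary test values.

```python
import base64


def encoded_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input, without line breaks."""
    # One 4-character group for every started 3-byte block
    return 4 * ((n_bytes + 2) // 3)


def overhead_percent(n_bytes: int) -> float:
    """Size increase of the encoded form relative to the input, in percent."""
    return 100.0 * (encoded_length(n_bytes) - n_bytes) / n_bytes


for size in (3, 1024, 1_000_000):
    actual = len(base64.b64encode(bytes(size)))  # encode that many zero bytes
    print(size, encoded_length(size), actual, f"{overhead_percent(size):.1f}%")
```

For very small inputs the padding dominates (a single byte becomes four characters), so the one-third figure is best read as the value the overhead approaches as the input grows.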
Base64 is not security
Because Base64 output looks scrambled, people sometimes treat it as a way to "hide" passwords or secrets. This is a serious mistake. Base64 is completely reversible by anyone, with no key and no effort — decoding is a one-click operation. Storing a password as Base64 offers exactly zero protection.
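To make the point concrete, here is a minimal sketch showing that anyone holding the encoded text recovers the original with one library call. The password value is a made-up example.

```python
import base64

# A hypothetical secret that someone "hid" by Base64-encoding it
stored = base64.b64encode(b"hunter2").decode("ascii")

# No key or secret is needed to reverse it
recovered = base64.b64decode(stored)

print(stored)
print(recovered)
```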
If you need to protect data so that only authorized parties can read it, you need encryption (such as AES). If you need to verify integrity, you need hashing (such as SHA-256). If you need to store passwords safely, use a salted hash, ideally from a dedicated password-hashing function such as bcrypt or Argon2. We compare all three in our guide on hashing vs encryption.
When should you use Base64?
Reach for Base64 when you need to move binary data through a text-only channel. Good fits include:
- Including a tiny inline image, font, or SVG in CSS to save a request
- Putting binary content inside a JSON field that must remain valid text
- Encoding credentials for HTTP Basic Auth headers
- Transporting certificates and keys in PEM-style files
Avoid Base64 when you actually want smaller data (use compression), when you want secrecy (use encryption), or when you are tempted to inline large media files into HTML.
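Two of the good fits above, a binary value inside JSON and an HTTP Basic Auth header, might look like this in Python. This is a sketch under stated assumptions: the function names and the JSON field names `name` and `content_b64` are invented for illustration.

```python
import base64
import json


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an Authorization header for HTTP Basic Auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def to_json_payload(name: str, content: bytes) -> str:
    """Wrap raw bytes in a JSON document by Base64-encoding them."""
    encoded = base64.b64encode(content).decode("ascii")
    return json.dumps({"name": name, "content_b64": encoded})


def from_json_payload(payload: str) -> bytes:
    """Recover the original bytes from the JSON document."""
    return base64.b64decode(json.loads(payload)["content_b64"])


print(basic_auth_header("Aladdin", "open sesame"))
print(to_json_payload("blob.bin", bytes([0, 255, 16, 128])))
```

Note that the Basic Auth value is only encoded, not protected, so it relies on HTTPS for confidentiality.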
Encoding and decoding in practice
Almost every language has Base64 built in. In JavaScript the classic browser functions are
btoa() to encode and atob() to decode, although for full Unicode and binary
safety you typically combine them with TextEncoder / TextDecoder. In Python you
use the base64 module, and in most shells the base64 command-line tool does the
job.
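In Python, a minimal round trip with the `base64` module could look like the sketch below. Because the module works on bytes, text is converted to UTF-8 first, which plays the same role as `TextEncoder` in the browser. The helper names are illustrative.

```python
import base64


def encode_text(text: str) -> str:
    """Base64-encode a string by way of its UTF-8 bytes."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_text(encoded: str) -> str:
    """Decode Base64 back to the original string, assuming UTF-8 content."""
    return base64.b64decode(encoded).decode("utf-8")


print(encode_text("Cat"))                    # Q2F0
print(decode_text(encode_text("héllo ✓")))   # round trip preserves non-ASCII text
```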
When you just need a quick, reliable result without writing code, a browser-based Base64 tool can handle it instantly and keep your data on your own machine.
Conclusion
Base64 is a simple, elegant solution to a specific problem: representing binary data using safe, printable text. It works by regrouping bits into 6-bit chunks, it always grows the data by about a third, and it is fully reversible — which is exactly why it must never be confused with encryption. Use it to transport binary safely through text channels, keep large media out of data URLs, and reach for encryption or hashing whenever real protection is the goal.
Frequently Asked Questions
Is Base64 encryption?
No. Base64 is a reversible encoding, not encryption. Anyone can decode it instantly without a key, so it provides no security.
Why does Base64 end with one or two equals signs?
The equals signs are padding. They appear when the input length is not a multiple of three, ensuring the encoded output length is always a multiple of four.
How much larger does Base64 make data?
About 33% larger, because every 3 bytes of input become 4 characters of output.
What is the difference between Base64 and Base64URL?
Base64URL replaces the + and / characters with - and _ and usually omits padding, making it safe to place inside URLs. It is the variant used in JWTs.
Can I decode Base64 back to the original file?
Yes. Base64 is lossless and fully reversible, so decoding returns the exact original bytes. You can try it with the Base64 tool.