Base64 Encoding
<p>[TOC]</p>
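<h1>🌓Introduction</h1>
<p>Base64 is a way of representing binary data using 64 printable characters. The data is split into 6-bit units, and each unit maps to one printable character, so 3 bytes (24 bits) correspond to exactly 4 Base64 units.
The printable characters are defined by a lookup table: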
<img src="https://pic1.imgdb.cn/item/636fcea616f2c2beb1b22488.png" alt="" />
The sections below walk through the actual conversion.</p>
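<p>As a small illustration of the lookup table, the sketch below builds the standard 64-character alphabet from Python's <code>string</code> module and maps a few 6-bit values to characters. The names <code>ALPHABET</code> and <code>sextet_to_char</code> are chosen here for illustration and do not come from the original text.</p>
<pre><code class="language-py">import string

# Standard Base64 alphabet: A-Z, a-z, 0-9, then "+" and "/"
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def sextet_to_char(value):
    """Map a 6-bit value (0..63) to its Base64 character."""
    return ALPHABET[value]

print(len(ALPHABET))       # 64
print(sextet_to_char(0))   # A
print(sextet_to_char(26))  # a
print(sextet_to_char(63))  # /</code></pre>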
<h2>🌙Three Bytes</h2>
<p>Take "ABC" as an example. First convert the three characters to binary, then regroup the bits into 6-bit units:
<img src="https://pic1.imgdb.cn/item/636fcecc16f2c2beb1b26404.png" alt="" />
Note that each byte must be left-padded with zeros to a full 8 bits before regrouping. Looking up the regrouped 6-bit values in the table gives:
<img src="https://pic1.imgdb.cn/item/636fceed16f2c2beb1b28b4e.png" alt="" />
<img src="https://pic1.imgdb.cn/item/636fcef816f2c2beb1b298f4.png" alt="" />
Comparing the two shows that the result is correct.</p>
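<p>The three-byte case can be sketched directly from the description above: write each byte as 8 binary digits, cut the 24 bits into four 6-bit units, and look each unit up in the table. This is a minimal sketch that favors readability over speed; <code>encode_three_bytes</code> is a hypothetical helper name.</p>
<pre><code class="language-py">ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_three_bytes(block):
    """Encode exactly 3 bytes into 4 Base64 characters."""
    if len(block) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Write each byte as 8 binary digits, keeping the leading zeros
    bits = "".join(format(byte, "08b") for byte in block)
    # Regroup the 24 bits into four 6-bit units
    units = [bits[i:i + 6] for i in range(0, 24, 6)]
    # Look each 6-bit value up in the table
    return "".join(ALPHABET[int(unit, 2)] for unit in units)

print(encode_three_bytes(b"ABC"))  # QUJD</code></pre>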
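<h2>🌙Two Bytes</h2>
<p>Two bytes give only 16 bits, which is short of a full 24-bit group. Splitting 16 bits into 6-bit units yields two full units with 4 bits left over. Those 4 bits are padded on the right with two zero bits to form a third 6-bit unit, for three units in total. Base64 output always comes in blocks of 4 characters, so the missing fourth character is filled with "=". Taking "AB" as an example: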
<img src="https://pic1.imgdb.cn/item/636fcf3016f2c2beb1b2e4fd.png" alt="" />
<img src="https://pic1.imgdb.cn/item/636fcf3b16f2c2beb1b2f465.png" alt="" />
This result is correct as well.</p>
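<h2>🌙One Byte</h2>
<p>The one-byte case is very similar to the two-byte case. One byte has only 8 bits, which gives one full 6-bit unit with 2 bits left over. Padding those 2 bits with four zero bits produces a second unit, and the output is then filled with "=" up to 4 characters. Taking "A" as an example: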
<img src="https://pic1.imgdb.cn/item/636fcf6216f2c2beb1b32b0d.png" alt="" />
<img src="https://pic1.imgdb.cn/item/636fcf7e16f2c2beb1b35259.png" alt="" /></p>
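<p>The one-byte and two-byte cases differ only in how many zero bits and "=" characters are added, so both can be covered by one small helper. The following is a sketch under that reading of the rules above; <code>encode_tail</code> is a hypothetical name.</p>
<pre><code class="language-py">ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def encode_tail(block):
    """Encode a final block of 1 or 2 bytes, including "=" padding."""
    if len(block) not in (1, 2):
        raise ValueError("expected 1 or 2 bytes")
    bits = "".join(format(byte, "08b") for byte in block)
    # Append zero bits until the length is a multiple of 6
    bits += "0" * (-len(bits) % 6)
    chars = "".join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))
    # Fill the output up to 4 characters with "="
    return chars + "=" * (4 - len(chars))

print(encode_tail(b"AB"))  # QUI=
print(encode_tail(b"A"))   # QQ==</code></pre>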
<h1>🌓Coding</h1>
<p>Here is rough pseudocode for the encoder:</p>
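<pre><code>string /* input to encode */
enc = "" /* encoded output */
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/" /* lookup table */
size = string.len
for(i = 0; i < size; i += 3)
{
    /* unit 0: high 6 bits of byte 0 */
    num[0] = string[i] >> 2
    enc += alphabet[num[0]]
    /* only one byte left: use a zero byte as filler */
    if(i+1 == size) string += "\x00"
    /* unit 1: low 2 bits of byte 0, high 4 bits of byte 1 */
    num[1] = ((string[i] & 3) << 4) | (string[i+1] >> 4)
    enc += alphabet[num[1]]
    if(i+1 == size) break
    /* only two bytes left: use a zero byte as filler */
    if(i+2 == size) string += "\x00"
    /* unit 2: low 4 bits of byte 1, high 2 bits of byte 2 */
    num[2] = ((string[i+1] & 15) << 2) | (string[i+2] >> 6)
    enc += alphabet[num[2]]
    if(i+2 == size) break
    /* unit 3: low 6 bits of byte 2 */
    num[3] = string[i+2] & 63
    enc += alphabet[num[3]]
}
/* pad the output with "=" to a multiple of 4 characters */
if(enc.len % 4)
{
    enc += '=' * (4 - enc.len % 4)
}</code></pre>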
<p>The pseudocode for decoding is:</p>
<pre><code>enc
string = ""
ascii = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
size = enc.len
for(i = 0; i < size; i+=4)
{
num[0] = ascii.index(enc[i])
num[1] = ascii.index(enc[i+1])
string += (num[0] << 2) | (num[1] >> 4)
if(enc[i+2] == '=') break
num[2] = ascii.index(enc[i+2])
string += ((num[1] & 15) << 4) | (num[2] >> 2)
if(enc[i+3] == '=') break
num[3] = ascii.index[enc[i+3])
string += ((num[2] & 3) << 6) | num[3]
}</code></pre>
<h2>🌙Python Implementation</h2>
<pre><code class="language-py"># Encode the text to bytes first so non-ASCII input is handled too
data = input("Input: ").encode("utf-8")
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
enc = ""
size = len(data)
for i in range(0, size, 3):
    # Unit 0: high 6 bits of byte 0
    enc += alphabet[data[i] >> 2]
    # A missing byte 1 is treated as zero
    b1 = data[i+1] if i+1 < size else 0
    # Unit 1: low 2 bits of byte 0, high 4 bits of byte 1
    enc += alphabet[((data[i] & 3) << 4) | (b1 >> 4)]
    if i+1 >= size:
        break
    # A missing byte 2 is treated as zero
    b2 = data[i+2] if i+2 < size else 0
    # Unit 2: low 4 bits of byte 1, high 2 bits of byte 2
    enc += alphabet[((b1 & 15) << 2) | (b2 >> 6)]
    if i+2 >= size:
        break
    # Unit 3: low 6 bits of byte 2
    enc += alphabet[b2 & 63]
# Pad the output with "=" to a multiple of 4 characters
if len(enc) % 4:
    enc += '=' * (4 - len(enc) % 4)
print(enc)</code></pre>
<p><img src="https://pic1.imgdb.cn/item/636fcffc16f2c2beb1b40175.png" alt="" /></p>
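<pre><code class="language-py"># Assumes well-formed Base64 input, padded with "=" to a multiple of 4 characters
enc = input("Input: ")
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
data = bytearray()
size = len(enc)
for i in range(0, size, 4):
    num0 = alphabet.index(enc[i])
    num1 = alphabet.index(enc[i+1])
    # Byte 0: all of unit 0, high 2 bits of unit 1
    data.append((num0 << 2) | (num1 >> 4))
    if enc[i+2] == '=':
        break
    num2 = alphabet.index(enc[i+2])
    # Byte 1: low 4 bits of unit 1, high 4 bits of unit 2
    data.append(((num1 & 15) << 4) | (num2 >> 2))
    if enc[i+3] == '=':
        break
    num3 = alphabet.index(enc[i+3])
    # Byte 2: low 2 bits of unit 2, all of unit 3
    data.append(((num2 & 3) << 6) | num3)
# Collect raw bytes and decode once so multi-byte UTF-8 text survives
print(data.decode("utf-8", errors="replace"))</code></pre>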
<p><img src="https://pic1.imgdb.cn/item/636fd01916f2c2beb1b423b2.png" alt="" /></p>
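<p>One way to cross-check a hand-written encoder is to compare it with Python's standard <code>base64</code> module. The sketch below does this for a few sample inputs. It uses a compact bit-string encoder, <code>b64encode_sketch</code>, which is a hypothetical stand-in written for this comparison rather than the script shown above.</p>
<pre><code class="language-py">import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64encode_sketch(data):
    """Bit-string Base64 encoder used only for this comparison."""
    bits = "".join(format(byte, "08b") for byte in data)
    # Append zero bits until the length is a multiple of 6
    bits += "0" * (-len(bits) % 6)
    chars = "".join(ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6))
    # Fill the output with "=" to a multiple of 4 characters
    return chars + "=" * (-len(chars) % 4)

for sample in (b"", b"A", b"AB", b"ABC", "héllo".encode("utf-8")):
    expected = base64.b64encode(sample).decode("ascii")
    # The standard library result is the reference
    assert b64encode_sketch(sample) == expected
    print(sample, expected)</code></pre>
<p>If any sample disagreed with the standard library, the <code>assert</code> would raise <code>AssertionError</code> before the line is printed.</p>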