I'm writing a simple Python CGI script that fetches a web page and displays the HTML in a web browser (acting as a proxy). Here is the script:
#!/usr/bin/env python3.0
import urllib.request
site = "http://reddit.com/"
site = urllib.request.urlopen(site)
site = site.read()
site = site.decode('utf8')
print("Content-type: text/html\n\n")
print(site)
The script works fine when run from the command line, but when I view it in a web browser it shows a blank page. This is the error I get in Apache's error_log:
Traceback (most recent call last):
  File "/home/public/projects/proxy/script.cgi", line 11, in <module>
    print(site)
  File "/usr/local/lib/python3.0/io.py", line 1491, in write
    b = encoder.encode(s)
  File "/usr/local/lib/python3.0/encodings/ascii.py", line 22, in encode
    return codecs.ascii_encode(input, self.errors)[0]
UnicodeEncodeError: 'ascii' codec can't encode character '\u2019' in position 33777: ordinal not in range(128)
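
The traceback suggests that, under Apache, the text stream behind print() is using the ASCII codec, while in the terminal it presumably uses UTF-8. The following is a minimal sketch that tries to reproduce the same failure without Apache or a network fetch. The helper try_write and the sample string are hypothetical, introduced only for illustration, and the assumption that the CGI process's stdout is ASCII-encoded is inferred from the traceback, not confirmed.

```python
import io


def try_write(text, encoding):
    """Write text through a text stream that uses the given encoding.

    Returns the encoded bytes on success, or the UnicodeEncodeError
    instance if the codec cannot represent the text.
    (Hypothetical helper, for illustration only.)
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return raw.getvalue()


# '\u2019' is the character named in the error message above.
sample = "it\u2019s"

# Assumption: this mirrors what stdout looks like inside the CGI process.
ascii_result = try_write(sample, "ascii")

# The same text written through a UTF-8 stream.
utf8_result = try_write(sample, "utf-8")

print(type(ascii_result).__name__)
print(utf8_result)
```

If this matches what is happening in the CGI script, one possible workaround would be to bypass the text layer and write encoded bytes directly, for example with sys.stdout.buffer.write(site.encode('utf8')), though that has not been verified against this particular Apache setup.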