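What character set is é from? In Windows Notepad, an ANSI text file containing this character saves fine. Inserting certain other characters instead produces an error. é also seems to work fine in PuTTY's ASCII terminal (are CP437 and IBM437 the same?), whereas those other characters do not.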
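For reference, here is a minimal sketch of how the same character maps to different bytes depending on the encoding. It assumes that Notepad's "ANSI" means Windows-1252, which holds on Western-locale Windows but is not confirmed for this machine. The variable names are illustrative only.

```python
import codecs

e_acute = u"\u00e9"  # the character in question, written as an escape

# The same character maps to different bytes in different encodings.
encoded = {
    "cp1252": e_acute.encode("cp1252"),    # b'\xe9' (assumed meaning of "ANSI")
    "latin-1": e_acute.encode("latin-1"),  # b'\xe9'
    "cp437": e_acute.encode("cp437"),      # b'\x82'
    "utf-8": e_acute.encode("utf-8"),      # b'\xc3\xa9'
}

# ASCII only covers code points 0-127, so it cannot represent the character.
try:
    e_acute.encode("ascii")
    in_ascii = True
except UnicodeEncodeError:
    in_ascii = False

# Python registers "ibm437" as an alias of its "cp437" codec.
same_codec = codecs.lookup("ibm437").name == codecs.lookup("cp437").name
```

So é is not ASCII at all. It exists in several 8-bit code pages, which would explain why an "ANSI" file and a CP437 terminal can both handle it.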
I can see that such a character is Unicode, not ASCII. But what is é? It does not produce the error I get in Notepad with Unicode characters, yet Python threw SyntaxError: Non-ASCII character '\xc3' in file on line , but no encoding declared; until I added the "magic comment" suggested in Python NLTK: SyntaxError: Non-ASCII character '\xc3' in file (Sentiment Analysis -NLP).
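I added the "magic comment" and no longer get that error, but os.path.isfile() says the file whose name contains é does not exist. Ironically, the character é appears in the name of one of the authors of the PEP that the error links to: Marc-André Lemburg.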
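One possible cause, offered as an assumption rather than a diagnosis: on Python 2 under Windows, a byte-string path is interpreted in the ANSI code page, so the UTF-8 bytes for é no longer match the name on disk. Here is a minimal sketch that passes a Unicode path instead. The temporary directory and the empty file are hypothetical stand-ins for the real folder.

```python
import os
import shutil
import tempfile

# Hypothetical scratch directory so the sketch does not touch real files.
folder = tempfile.mkdtemp()
try:
    file_name = u"Fil\u00e9name"        # text (Unicode), not UTF-8 bytes
    path = os.path.join(folder, file_name)
    open(path, "w").close()             # create an empty file with that name
    exists = os.path.isfile(path)       # look it up with the same Unicode path
finally:
    shutil.rmtree(folder)               # clean up the scratch directory
```

The point of the sketch is only that the name is handled as text from start to finish. Whether this resolves the lookup on the original machine depends on how the file was actually named on disk.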
EDIT: If I print the file's path, the accented e shows up as ├⌐, but I can copy and paste é into the command prompt.
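The ├⌐ pair is consistent with the two UTF-8 bytes of é being displayed by a console that uses code page 437. Here is a small sketch of that round trip, assuming the console really is CP437, which the output suggests but does not prove:

```python
e_acute = u"\u00e9"

utf8_bytes = e_acute.encode("utf-8")     # b'\xc3\xa9', two bytes
as_console = utf8_bytes.decode("cp437")  # each byte read as one CP437 character

# Byte 0xC3 becomes U+251C and byte 0xA9 becomes U+2310,
# which are the two glyphs seen in the printed path.
box_char, not_sign = as_console
```

If that assumption holds, the printed bytes are correct UTF-8 and only the display is misleading.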
EDIT2: see below
Private    > cat scratch.py   ### LOL cat scratch :3
# coding=utf-8
file_name = r"Filéname"
file_name = unicode(file_name)
Private    > python scratch.py
Traceback (most recent call last):
  File "scratch.py", line 3, in <module>
    file_name = unicode(file_name)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)
Private    >
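The traceback above is consistent with Python 2's unicode() falling back to the ASCII codec when no encoding is given. Here is a sketch that reproduces the same failure with explicit decode calls, so it behaves the same on Python 2 and 3. The byte literal stands in for what a UTF-8 source file stores for "Filéname".

```python
raw = b"Fil\xc3\xa9name"  # the bytes a UTF-8 source file stores for the literal

try:
    raw.decode("ascii")          # what unicode(raw) does implicitly on Python 2
    failed_at = None
except UnicodeDecodeError as exc:
    failed_at = exc.start        # index of the first undecodable byte

text = raw.decode("utf-8")       # naming the real encoding succeeds
```

The failing index matches the "position 3" in the error message: the first byte of é comes right after "Fil".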
EDIT3:
Private    > PS1="Private    > " ; echo code below ; cat scratch.py ; echo =======  ; echo output below ; python scratch.py
code below
# -*- coding: utf-8 -*-
file_name = r"Filéname"
file_name = unicode(file_name, encoding="utf-8")
# I have code here to determine a path depending on the hostname of the
# machine, the folder paths contain no Unicode characters, for my debug
# version of the script, I will hardcode the redacted hostname.
hostname = "One"
if hostname == "One":
    folder = "C:/path/folder_one"
elif hostname == "Two":
    folder = "C:/path/folder_two"
else:
    folder = "C:/path/folder_three"
path = "%s/%s" % (folder, file_name)
path = unicode(path, encoding="utf-8")
print path
=======
output below
Traceback (most recent call last):
  File "scratch.py", line 18, in <module>
    path = unicode(path, encoding="utf-8")
TypeError: decoding Unicode is not supported
Private    >
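The TypeError arises because file_name is already a Unicode object after the first unicode(..., encoding="utf-8") call. On Python 2, formatting it into the path therefore yields Unicode too, and a Unicode object cannot be decoded a second time. Here is a minimal sketch of decoding exactly once. to_text is a hypothetical helper, not part of any library, and the code is written to run on Python 2 and 3.

```python
def to_text(value, encoding="utf-8"):
    """Return text, decoding only when given bytes (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


folder = u"C:/path/folder_one"
file_name = to_text(b"Fil\xc3\xa9name")  # decode once, at the boundary
path = u"%s/%s" % (folder, file_name)    # the result is already text
path = to_text(path)                     # a second call is a harmless no-op
```

In the script above, the simpler fix would be to drop the second unicode() call, since path needs no further decoding.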