Question

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 0: invalid start byte

Posted by 6毛群--yuki on 2023-01-19 17:38

python

I am using Python-2.6 CGI scripts, but I found this error in the server log when calling json.dumps():

Traceback (most recent call last):
  File "/etc/mongodb/server/cgi-bin/getstats.py", line 135, in <module>
    print json.dumps(__getdata())
  File "/usr/lib/python2.7/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/lib/python2.7/json/encoder.py", line 201, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/lib/python2.7/json/encoder.py", line 264, in iterencode
    return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xa5 in position 0: invalid start byte

Here, the __getdata() function returns a dictionary {}.

Before posting this question, I had already looked at this related question.

Update

The following lines are what break the JSON encoder:

now = datetime.datetime.now()
now = datetime.datetime.strftime(now, '%Y-%m-%dT%H:%M:%S.%fZ')
print json.dumps({'current_time': now})  # this is the culprit

I found a temporary workaround:

print json.dumps( {'old_time': now.encode('ISO-8859-1').strip() })

But I am not sure this is the correct way to do it.

9 Answers

Inspired by aaronpenne and Soumyaansh:
```
# Read raw bytes, then decode; undecodable bytes become U+FFFD
with open("file.txt", "rb") as f:
    text = f.read().decode("utf-8", errors="replace")
```
Answered 2023-01-19 17:40

疯子晨晨农_481
I fixed it simply by specifying a different codec in the read_csv() call:
```
encoding = 'unicode_escape'
```
Answered 2023-01-19 17:40

Your string contains a non-ASCII character.

Decoding with utf-8 can fail if other encodings were used somewhere in your code. For example:

>>> 'my weird character \x96'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x96 in position 19: invalid start byte

In this case the encoding is windows-1252, so you have to do:

>>> 'my weird character \x96'.decode('windows-1252')
u'my weird character \u2013'

Now that you have Unicode, you can safely encode it as utf-8.
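
For readers on Python 3, the same round trip can be sketched roughly as follows. This is only an illustration and assumes the bytes really were written as windows-1252; the variable names are made up for the example.

```python
# Python 3 sketch: decode bytes with the encoding they were written in,
# then re-encode the resulting text as UTF-8.
raw = b"my weird character \x96"  # 0x96 is an en dash in windows-1252

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    # utf-8 rejects the byte, just like in the traceback above
    print("utf-8 failed:", exc.reason)

text = raw.decode("windows-1252")  # bytes -> str
utf8_bytes = text.encode("utf-8")  # str -> UTF-8 encoded bytes
print(repr(text))
print(utf8_bytes)
```

If the source encoding is a guess rather than a known fact, the decoded text should be checked by eye before it is trusted.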

Answered 2023-01-19 17:40

六尾11

As of 2018-05, decode handles this directly, at least in Python 3.

I started using the snippet below after getting invalid start byte and invalid continuation byte errors. Adding errors='ignore' fixed it for me.
```
with open(out_file, 'rb') as f:
    for line in f:
        print(line.decode(errors='ignore'))
```
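
To make the trade-off a bit more concrete, here is a small Python 3 sketch comparing the error handlers on one made-up byte string (the sample data and the helper `try_strict` are assumptions for illustration). Note that 'ignore' silently drops the bad byte, while 'replace' leaves a visible marker.

```python
# Python 3 sketch comparing the built-in error handlers of bytes.decode().
raw = b"caf\xa5 ok"  # 0xa5 cannot start a UTF-8 sequence


def try_strict(data):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return None


strict_result = try_strict(raw)                   # strict decoding fails here
replaced = raw.decode("utf-8", errors="replace")  # bad byte becomes U+FFFD
ignored = raw.decode("utf-8", errors="ignore")    # bad byte is dropped

print(strict_result, repr(replaced), repr(ignored))
```

Either handler loses information, so it may be worth trying the real source encoding first.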
Answered 2023-01-19 17:40

手机用户2502902237

When reading the CSV, I added an encoding argument:

import pandas as pd
dataset = pd.read_csv('sample_data.csv',header=0,encoding = 'unicode_escape')
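
One thing that may be worth knowing before relying on this: as far as the standard codec goes, 'unicode_escape' decodes high bytes the way Latin-1 does and also interprets backslash escape sequences in the data. The Python 3 sketch below uses made-up sample bytes and only the standard library, not pandas.

```python
# Python 3 sketch of what the 'unicode_escape' codec does when decoding.
high_byte = b"\xa5100"
as_unicode_escape = high_byte.decode("unicode_escape")
as_latin1 = high_byte.decode("latin-1")
# For bytes >= 0x80 the two codecs give the same text.
print(as_unicode_escape == as_latin1)

# Backslash sequences inside the data are interpreted as escapes.
path_like = b"C:\\new"  # the six bytes: C : \ n e w
escaped = path_like.decode("unicode_escape")
print(repr(escaped))    # the backslash-n pair has become a newline
```

So if the file can contain literal backslashes, encoding='latin-1' or the file's actual encoding is probably a safer choice.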

Answered 2023-01-19 17:40

EEeeen_

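The error occurs because the dictionary contains some non-ASCII characters that cannot be encoded/decoded. A simple way to avoid it is to encode such strings with encode(), as follows (where a is the string containing non-ASCII characters):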
```
a.encode('utf-8').strip()
```
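
In Python 3 terms, json.dumps() works on text, so one way to approach the same idea is to decode any byte values before serializing. The sketch below is only an assumption-laden illustration: the helper `to_text`, the sample dictionary, and the choice of iso-8859-1 are all hypothetical.

```python
import json


def to_text(value, encoding="iso-8859-1"):
    """Hypothetical helper: decode bytes with an assumed encoding,
    and pass every other value through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


data = {"price": b"\xa5100", "count": 3}  # made-up sample data
clean = {key: to_text(val) for key, val in data.items()}
encoded = json.dumps(clean, sort_keys=True)
print(encoded)
```

The encoding passed to `to_text` has to match how the bytes were actually produced, otherwise the JSON will contain the wrong characters.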
Answered 2023-01-19 17:40

诸葛烈火_220
Try the following snippet:
```
with open(path, 'rb') as f:
  text = f.read()
```
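
Reading in 'rb' mode only postpones the decoding step. A possible follow-up, sketched in Python 3 below, is to try a short list of candidate encodings in order. The helper name `decode_with_fallback` and the candidate list are assumptions for illustration, not a general-purpose detector.

```python
def decode_with_fallback(raw, encodings=("utf-8", "windows-1252")):
    """Hypothetical helper: return (text, encoding_name) for the first
    candidate encoding that decodes the bytes without error."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a code point, so this never raises.
    return raw.decode("latin-1"), "latin-1"


# Usage with made-up byte strings standing in for f.read()
print(decode_with_fallback(b"caf\xc3\xa9"))
print(decode_with_fallback(b"\xa5100"))
```

A successful decode does not prove the guess was right, so the result should still be checked against known content where possible.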
Answered 2023-01-19 17:40

栋逼逼丶
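Set the default encoding at the top of your code (Python 2 only; sys.setdefaultencoding was removed in Python 3):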
```
import sys
reload(sys)
sys.setdefaultencoding("ISO-8859-1")
```
Answered 2023-01-19 17:40

拍友2602890695

The following lines break the JSON encoder:

now = datetime.datetime.now()
now = datetime.datetime.strftime(now, '%Y-%m-%dT%H:%M:%S.%fZ')
print json.dumps({'current_time': now})  # this is the culprit

I have a temporary workaround:

print json.dumps( {'old_time': now.encode('ISO-8859-1').strip() })

Marking this as correct as a temporary fix (not sure).

Answered 2023-01-19 17:41

手机用户2502928693
