About the Python UnicodeEncodeError: cp950 \u2020 error - 晨柚's blog | 痞客邦 (PIXNET)

Jun 06 Wed 2018 13:09

A problem I ran into while scraping PDFs:
UnicodeEncodeError: 'cp950' codec can't encode character
'\u2020' in position 0: illegal multibyte sequence

The error came up when I tried to write the text scraped from the PDF into a plain text file.

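Many special symbols, such as the dagger (\u2020), cannot be represented in cp950 and have to be written as UTF-8 instead.
The cause is that on (Traditional Chinese) Windows, Python uses cp950 by default to encode/decode text files when no encoding is given, so the encoding has to be set to utf8 explicitly (a reader named Lin pointed this out to me, many thanks to him).
Opening the file as shown here lets the script run without the error: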

# obj is whatever object your scraper returned for the PDF
with open('your_file_location/1.txt', 'a', encoding='utf8') as f:
    results = obj.get_text()
    f.write(results + '\n')  # f is the opened text file object
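
To see why the explicit encoding matters, here is a minimal sketch that can be run on any platform. The helper names (`can_encode`, `append_text_utf8`, `read_text_utf8`) and the temporary file are my own assumptions for illustration and are not part of the original script. The sketch checks whether the dagger can be encoded with each codec, then appends it to a file opened with `encoding="utf-8"` and reads it back.

```python
import os
import tempfile

DAGGER = "\u2020"  # the dagger symbol named in the error message


def can_encode(text, codec):
    """Return True if `text` can be encoded with `codec` (hypothetical helper)."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


def append_text_utf8(path, text):
    """Append `text` plus a newline to `path`, always as UTF-8 (hypothetical helper)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text + "\n")


def read_text_utf8(path):
    """Read the whole file back, decoding it as UTF-8 (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# cp950 has no mapping for the dagger, which is what the traceback reports;
# UTF-8 can encode any Unicode character.
print("cp950 can encode dagger:", can_encode(DAGGER, "cp950"))
print("utf-8 can encode dagger:", can_encode(DAGGER, "utf-8"))

# Write to a throwaway file so the demo does not touch real data.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "1.txt")
    append_text_utf8(demo_path, DAGGER + " scraped text")
    print(repr(read_text_utf8(demo_path)))
```

Passing the encoding on every `open()` call keeps the script independent of the machine's locale, so the same code behaves the same way on Windows and elsewhere.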

晨柚
