The problem code
# coding: utf-8

lines = []
''' The character encoding of sample.txt is UTF-8 '''
with open('sample.txt', 'rt') as f:
    for text in f.readlines():
        text = text.rstrip()
        lines.append(text.split(' ', 1))

import csv
with open('output.csv', 'w', newline='') as f:
    cw = csv.writer(f, delimiter=',', quotechar='"')
    for line in lines:
        cw.writerow(line)
Result
Traceback (most recent call last):
  File "main.py", line 6, in <module>
    for text in f.readlines():
UnicodeDecodeError: 'cp932' codec can't decode byte 0xef in position 31: illegal multibyte sequence
Solution
# coding: utf-8

lines = []
# Open sample.txt in binary mode
with open('sample.txt', 'rb') as f:
    for text in f.readlines():
        # The data read is bytes, of course, so decode it (decode() defaults to UTF-8; spelled out here)
        text = text.decode('utf-8').rstrip()
        lines.append(text.split(' ', 1))

import csv
with open('output.csv', 'w', newline='') as f:
    cw = csv.writer(f, delimiter=',', quotechar='"')
    for line in lines:
        cw.writerow(line)
What is this?
I really wonder what is going on here...
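A likely explanation, as far as I can tell: `# coding: utf-8` only declares the encoding of the source file itself, while open() in text mode falls back to the locale's preferred encoding when no encoding argument is given, which is cp932 in the traceback above. If that is right, passing encoding='utf-8' should work without a detour through binary mode. The sketch below is self-contained so it can be run anywhere: the convert helper, the temporary directory, and the sample lines are made up for illustration, and writing the CSV as UTF-8 is my assumption, not something the original code does.

```python
import csv
import os
import tempfile


def convert(src_path, dst_path):
    """Split each line of a UTF-8 text file on its first space and write the parts as CSV."""
    rows = []
    # An explicit encoding avoids depending on the platform default
    # (cp932 in the traceback above).
    with open(src_path, 'rt', encoding='utf-8') as f:
        for text in f:
            rows.append(text.rstrip().split(' ', 1))
    # Assumption: the CSV is also written as UTF-8; the original code
    # leaves the output encoding at the platform default.
    with open(dst_path, 'w', newline='', encoding='utf-8') as f:
        cw = csv.writer(f, delimiter=',', quotechar='"')
        cw.writerows(rows)
    return rows


# Hypothetical sample data standing in for the real sample.txt
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'sample.txt')
dst = os.path.join(workdir, 'output.csv')
with open(src, 'w', encoding='utf-8') as f:
    f.write('001 日本語のテキスト\n002 ＡＢＣ\n')

rows = convert(src, dst)
print(rows)
```

If the CSV is meant to be opened by a tool that expects the platform encoding, the encoding argument on the output side may need to be changed or dropped.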