I'm new here. I use utf-8-sig to read a CSV file with Chinese characters, but the result is still garbled text. chardet detects the encoding as utf-8-sig, and I have confirmed that the file is UTF-8 with a BOM. Here is my code:
import pandas as pd

csv_data = pd.read_csv(file_path, encoding='utf-8-sig')
print(csv_data.head())
It prints:
�����W�����} ����έp�_�l����έp�������
0 0857�T�֫� www.0857.games 1 2023/12/12 2023/12/18
1 0x www.0xdappplus.com 2 2023/4/25 2023/4/30
The data in my CSV file is:
網站名稱,網址,件數,統計起始日期,統計結束日期
0857娛樂城,www.0857.games,1,2023/12/12,2023/12/18
0x,www.0xdappplus.com,2,2023/4/25,2023/4/30

I have already tried chardet and other Chinese-related encodings such as big5 and gbk.
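
For reference, the checks described above (confirming the BOM and trying several encodings) can be sketched with the standard library alone. This is only a rough sketch, not my real script: the helper names has_utf8_bom and try_encodings are made up for illustration, and the temporary sample file is a stand-in for the real CSV, using the first two lines of the data shown above.

```python
import codecs
import os
import tempfile


def has_utf8_bom(path):
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def try_encodings(path, encodings=("utf-8-sig", "big5", "gbk")):
    """Decode the first line with each candidate encoding.

    A value of None means that encoding raised UnicodeDecodeError.
    A non-None value only means decoding did not fail, not that it is right.
    """
    with open(path, "rb") as f:
        first_line = f.readline()
    results = {}
    for enc in encodings:
        try:
            results[enc] = first_line.decode(enc).rstrip("\r\n")
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Hypothetical stand-in for the real file: UTF-8 with a BOM.
sample = (
    "網站名稱,網址,件數,統計起始日期,統計結束日期\n"
    "0857娛樂城,www.0857.games,1,2023/12/12,2023/12/18\n"
)
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(codecs.BOM_UTF8 + sample.encode("utf-8"))
    sample_path = tmp.name

bom_found = has_utf8_bom(sample_path)
decoded = try_encodings(sample_path)
os.remove(sample_path)

print(bom_found)
# ascii() prints \u escapes, so the result does not depend on the console encoding.
print(ascii(decoded["utf-8-sig"]))
```

Printing through ascii() sidesteps the console's own encoding, so if the escapes match the expected characters, the file was probably decoded correctly and the garbling may come from how the terminal displays the output rather than from reading the file.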