I'm trying to read .csv file that contains utf-8 data in some of its columns. The method of reading is by using pandas dataframe. The code is as following:
df = pd.read_csv('Cancer_training.csv', encoding='utf-8')
Then I got the following examples of errors with different files:
(1) 'utf-8' codec can't decode byte 0xcf in position 14:invalid continuation byte
(2) 'utf-8' codec can't decode byte 0xc9 in position 3:invalid continuation byte
Could you please share your ideas and experience with such problem? Thank you.
[python: 3.4.1.final.0, pandas: 0.14.1]
sample of the raw data, I cannot put full record because of the legal restrictions of the medical data: