I'm following this OpenAI tutorial about fine-tuning.
I already generated the dataset with the openai tool. The problem is that the output's encoding (the inference result) mixes UTF-8 with non-UTF-8 characters.
The generated dataset looks like this:
{"prompt":"Usuario: Quién eres\\nAsistente:","completion":" Soy un Asistente\n"}{"prompt":"Usuario: Qué puedes hacer\\nAsistente:","completion":" Ayudarte con cualquier gestión o ofrecerte información sobre tu cuenta\n"}
For instance, if I ask "¿Cómo estás?" and there is a trained completion for that sentence, "Estoy bien, ¿y tú?", the inference often returns exactly the same text (which is good), but sometimes it adds badly encoded words: "Estoy bien, ¿y tú? CuÃ©ntame algo de ti", with "Ã©" instead of "é".
Sometimes it returns exactly the sentence it was trained on, with no encoding issues. I don't know whether the inference is taking the badly encoded characters from my model or from somewhere else.
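As far as I can tell, the bad characters match the classic UTF-8/Latin-1 mix-up. For example, in Python:

```python
# "é" encoded as UTF-8 gives the bytes C3 A9; decoding those two bytes as
# Latin-1 produces exactly the "Ã©" that shows up in the bad responses.
print("é".encode("utf-8").decode("latin-1"))  # -> Ã©
```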
What should I do? Should I encode the dataset in UTF-8? Or should I leave the dataset in UTF-8 and decode the badly encoded characters in the response?
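To illustrate the second option, this is the kind of post-processing I had in mind (just a sketch, assuming the bad characters really are UTF-8 bytes that were decoded as Latin-1; fix_mojibake is a name I made up):

```python
def fix_mojibake(text: str) -> str:
    """Reverse a Latin-1 misread of UTF-8, e.g. turn "Ã©" back into "é"."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # A string that is already correct (or that mixes good and bad
        # characters) fails the round trip, so leave it untouched.
        return text

print(fix_mojibake("CuÃ©ntame"))  # -> Cuéntame
```

But this feels like treating the symptom rather than the cause, which is why I'm asking whether the dataset encoding is the real problem.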
The OpenAI docs for fine-tuning don't include anything about encoding.