How do I remove invalid characters from UTF-8 encoded file?

Explanation:

I've come across an edge case when writing my web app. I accept UTF-8 files to be uploaded, and I've got a check in place to confirm it is UTF-8 encoded (or at least the best check possible, apparently there is no silver bullet, I'm aware there are many other questions on Stack Overflow for that specific issue).

As a test, I took an ANSI encoded file and converted it to UTF-8 by both (in separate tests) converting it UTF-8 in Notepad++, and also by just decoding as UTF-8 (even though it is ANSI) on the fly in C# using Encoding.UTF.GetBytes(inputStream).

Where The Problem Arises:

Later on, I place the raw data of the file as one of the elements in an XML file. This is where the problem arises. It appears that a character has persisted from the ANSI file which (I assume) is not valid in UTF-8. When I try load the XML using the following command...

XDocument xmlSample = XDocument.Load(outputPath);

I get this exception...

{"Invalid character in the given encoding. Line 10, position 14."}

Which looks like this in Visual Studio...

And like this in Notepad++...

Below is the character copy and pasted.

From NPP: ¡ From Visual Studio String Viewer: �

Question:

How can I remove invalid characters from UTF-8 encoded file, or at least discover them in a sane way so I can reject the file?

How do I remove invalid characters from UTF-8 encoded file?

Trending Articles

Bath man appears in court charged with attempted murder of a man...

MACLEAN, Allan

Black Angus Grilled Artichokes

Practice Sheet of Right form of verbs for HSC Students

Police blotter for Jan. 12

99 God Status for Whatsapp, Facebook

Rajasthan Board 12th Science Result 2018 name wise- RBSE 12th commerce result...

Notorious Naushad of Ippa gang nabbed

Child Kidnapping: Amy McNeil was kidnapped on her way to school by 5 adults;...

Sonible Smartlimit v1.1.5-R2R

NCERT Solutions for Class 9th Sanskrit Chapter 3 पाथेयम्

मतलबी दोस्त स्टेट्स | Matlabi Dost Status in Hindi – Selfish Friends Status

Arrow Flash 2 – Sinhala Dubbed – Episode 23 – 20th March 2016

[GET] AI Traffic Goldmine

[E² Plugin] HDF-Radio

Universal Multi-Patch v1.3 By RADIXX11

IWAN – Thanks and Praise ( Throw Back Thursday )

RONALD P SONDERGAARD Arrested by Miami-Dade County Corrections on Mar 03, 2017

मुख मैथुन से उठाएं सेक्स का भरपूर मज़ा, जानें क्या है इसका सही तरीकामुख मैथुन...

HSSC Excise & Taxation Inspector Result 2017 Scorecard/ Category Wise Merit List