Channel: Active questions tagged utf-8 - Stack Overflow

StringEscapeUtils not handling utf-8


I have a string like this:

String incoming = "<html> <head></head> <body>  <p><span style=\"font-family: Arial;\">Ευχαριστώ (eff-kha-ri-STOE) Tι κανείς (tee-KAH-nis)? Mε συγχωρείτε.</span></p> </body></html>";

and I'm escaping it using StringEscapeUtils:

import org.apache.commons.text.StringEscapeUtils;

String escaped = StringEscapeUtils.escapeJava(incoming);

The result is:

<html> <head></head> <body>  <p><span style=\"font-family: Arial;\">\u0395\u03C5\u03C7\u03B1\u03C1\u03B9\u03C3\u03C4\u03CE (eff-kha-ri-STOE) T\u03B9 \u03BA\u03B1\u03BD\u03B5\u03AF\u03C2 (tee-KAH-nis)? M\u03B5 \u03C3\u03C5\u03B3\u03C7\u03C9\u03C1\u03B5\u03AF\u03C4\u03B5.</span></p> </body></html>

I've tried converting it to UTF-8 by getting the bytes, but that doesn't work. Is there any way I could get it fixed?

Here's what I tried:

String s = new String(escaped.getBytes("UTF-8"), "UTF-8");

I've also tried a different library to escape the text, but it still doesn't work.
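
For anyone reproducing this, here is a minimal, standard-library-only sketch of what seems to be going on. It assumes the escaped text contains only ASCII characters, which is what the output above shows. Under that assumption, encoding to UTF-8 bytes and decoding again returns the same string, so the round trip cannot bring the Greek letters back; the escape sequences have to be decoded explicitly. The class and method names (`UnicodeEscapeDemo`, `decodeUnicodeEscapes`, `utf8RoundTrip`) are hypothetical and only for illustration. The decoder is deliberately simplistic: it handles only the backslash-u form and ignores other escapes such as escaped quotes or escaped backslashes.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnicodeEscapeDemo {
    // Matches one escape: a literal backslash, the letter u, and four hex digits.
    private static final Pattern UNICODE_ESCAPE =
            Pattern.compile("\\\\u([0-9A-Fa-f]{4})");

    // Hypothetical helper: replaces each matched escape with the char it encodes.
    // Simplification: other escapes (quotes, escaped backslashes) are left untouched.
    static String decodeUnicodeEscapes(String escaped) {
        Matcher m = UNICODE_ESCAPE.matcher(escaped);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(escaped, last, m.start());                // text before the escape
            out.append((char) Integer.parseInt(m.group(1), 16)); // the decoded UTF-16 unit
            last = m.end();
        }
        out.append(escaped, last, escaped.length());             // trailing text
        return out.toString();
    }

    // The attempt from the question: encode to UTF-8 bytes, then decode them again.
    static String utf8RoundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Each doubled backslash in this literal is a single backslash at runtime,
        // so the string holds the same ASCII text that escapeJava produced.
        String escaped = "\\u0395\\u03C5\\u03C7\\u03B1\\u03C1\\u03B9\\u03C3\\u03C4\\u03CE (eff-kha-ri-STOE)";

        // The round trip leaves the ASCII text unchanged.
        System.out.println(utf8RoundTrip(escaped).equals(escaped));

        // Explicit decoding restores the original Greek word.
        System.out.println(decodeUnicodeEscapes(escaped));
    }
}
```

If the Commons Text dependency is already on the classpath, `StringEscapeUtils.unescapeJava` is the library's inverse of `escapeJava` and also covers the other escape forms, so it is probably a better fit than a hand-rolled decoder like this one.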

