Channel: Active questions tagged utf-8 - Stack Overflow

Parsing UTF-8 that includes non-ASCII characters

I have C++ code that parses UTF-8 and produces wchar_t output. It also parses non-ASCII characters. I need to remove wchar_t from my code and replace it with char, so I need to change the output. The problem is that it no longer parses non-ASCII characters correctly, and the result is gibberish.

I use this macro:

    #define Decode(bFirst, bSecond, bResult, fNotValid) \
    { \
        /* first hex digit: high nibble */ \
        if ((bFirst >= '0') && (bFirst <= '9')) \
            bResult = 16 * (bFirst - '0'); \
        else if ((bFirst >= 'A') && (bFirst <= 'F')) \
            bResult = 16 * (bFirst - 55); \
        else if ((bFirst >= 'a') && (bFirst <= 'f')) \
            bResult = 16 * (bFirst - 87); \
        else \
            fNotValid = TRUE; \
    } \
    if (!fNotValid) { \
        /* second hex digit: low nibble */ \
        if ((bSecond >= '0') && (bSecond <= '9')) \
            bResult += (bSecond - '0'); \
        else if ((bSecond >= 'A') && (bSecond <= 'F')) \
            bResult += (bSecond - 55); \
        else if ((bSecond >= 'a') && (bSecond <= 'f')) \
            bResult += (bSecond - 87); \
        else \
            fNotValid = TRUE; \
    }
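For comparison, the same hex-pair decoding can be written as an ordinary function, so that the error flag is passed explicitly instead of depending on a variable name at the call site. This is only a sketch: `hexValue` and `decodeHexPair` are names invented here and are not part of the original code.

```cpp
// Hypothetical helper: numeric value of one hex digit, or -1 if c is not one.
inline int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;  // same as c - 55 in ASCII
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;  // same as c - 87 in ASCII
    return -1;
}

// Hypothetical function form of the Decode macro: combines two hex digits
// into one byte. Returns false and leaves result untouched if either
// character is not a hex digit.
inline bool decodeHexPair(char first, char second, unsigned char& result) {
    const int hi = hexValue(first);
    const int lo = hexValue(second);
    if (hi < 0 || lo < 0) return false;
    result = static_cast<unsigned char>(16 * hi + lo);
    return true;
}
```

With this, `decodeHexPair(pBuff[index], pBuff[index + 1], byte1)` would take the place of the `Decode(...)` call, and its return value would take the place of the `NotValid` flag.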

The part of the code that does the parsing is:

    char* pBuff; // this is what I want to parse, for example %D7%A4
    bool NotValid = false;
    unsigned long index = 1; // skip the leading '%'
    unsigned char byte1;
    unsigned char byte2;
    Decode(pBuff[index], pBuff[index + 1], byte1, NotValid);
    index += 3; // skip the two hex digits and the next '%'
    Decode(pBuff[index], pBuff[index + 1], byte2, NotValid);
    WORD wInputWideByte = 0;
    // high byte: top three payload bits of the UTF-8 lead byte
    (reinterpret_cast<unsigned char*>(&wInputWideByte))[1] = (byte1 & 0x1C) >> 2;
    // low byte: remaining two bits of the lead byte, then six bits of the continuation byte
    (reinterpret_cast<unsigned char*>(&wInputWideByte))[0] |= (byte1 & 0x03) << 6;
    (reinterpret_cast<unsigned char*>(&wInputWideByte))[0] |= (byte2 & 0x3F);
    wchar_t* output;
    unsigned long outputIndex = 0;
    output[outputIndex] = wInputWideByte;
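If the output has to be `char` rather than `wchar_t`, one option is not to combine the bytes at all. The bytes decoded from `%D7%A4` are already the UTF-8 encoding of the character, so they can be appended to a `char` buffer unchanged. Storing the combined 16-bit value in a single `char` drops its high byte, which is one way to end up with gibberish. A minimal sketch, assuming the output should remain UTF-8; `percentDecode` is a hypothetical name, and `std::string` stands in for the raw buffer:

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: replaces each %XX escape with the byte it encodes and
// copies every other character through unchanged. Multi-byte UTF-8 sequences
// are kept as separate bytes, so the result is still valid UTF-8 whenever
// the encoded input was.
std::string percentDecode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool isEscape =
            in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (isEscape) {
            // Parse the two hex digits as one byte value (0..255).
            const int value = std::stoi(in.substr(i + 1, 2), nullptr, 16);
            out.push_back(static_cast<char>(value));
            i += 2;  // skip the two digits just consumed
        } else {
            out.push_back(in[i]);  // ordinary character or malformed escape
        }
    }
    return out;
}
```

For the input `%D7%A4` this returns a two-byte string holding 0xD7 and 0xA4, which is the UTF-8 encoding of U+05E4. Malformed escapes such as `%zz` are copied through as-is here; rejecting them instead would be an equally reasonable choice.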
