When debugging an operation on UTF-8 strings, I sometimes want to see the string representation of a given ReadOnlySpan<byte>,
so I created a static function to help with that. However, one of the approaches didn't work as expected, and I wonder why the resulting string is incomprehensible.

    //#define FORCE_NOT_UTF8
    using MemoryMarshal = System.Runtime.InteropServices.MemoryMarshal;
    using Unsafe = System.Runtime.CompilerServices.Unsafe;
    using Encoding = System.Text.Encoding;

    static string ForgeString(ReadOnlySpan<byte> utf8Runes)
    {
        Span<char> buffer = utf8Runes.Length > 1024
            ? new char[utf8Runes.Length]
            : stackalloc char[1024];
    #if FORCE_NOT_UTF8
        Encoding.UTF8.GetChars(utf8Runes, buffer);
    #else
        if (Encoding.Default.BodyName != Encoding.UTF8.BodyName)
        {
            Encoding.UTF8.GetChars(utf8Runes, buffer);
        }
        else if (buffer.Length is <= 1024)
        {
            MemoryMarshal.Cast<byte, char>(utf8Runes).CopyTo(buffer);
        }
        else
        {
            ref readonly var elmnt0 = ref utf8Runes[0];
            ref var ptrSrc = ref Unsafe.AsRef(in elmnt0);
            ref var ptrDst = ref buffer[0];
            for (int i = 0; ptrSrc is not default(byte) && i < utf8Runes.Length; i++)
            {
                ptrDst = (char) ptrSrc;
                ptrSrc = ref Unsafe.Add(ref ptrSrc, 1);
                ptrDst = ref Unsafe.Add(ref ptrDst, 1);
            }
        }
    #endif
        Index end = buffer.IndexOf(default(char)) is int index and not -1 ? new(index) : Index.End;
        return new(buffer[..end]);
    }

    string result1 = default!;
    string result2 = default!;
    result1 = ForgeString("foobar"u8);
    result2 = ForgeString("james james james (...repeating 166 times)"u8);
    Console.WriteLine(result1);
    Console.WriteLine(result2);
    //in order to get string result3 its necessary to recompile with compiler symbol FORCE_NOT_UTF8

The for loop prints 'james' a bunch of times, as expected, but with the MemoryMarshal cast, 'foobar' produces '潦扯牡'.
What's happening behind Cast<TFrom, TTo> to create this unexpected sequence?
I thought the idea was that it literally applies a (TTo) cast to each element of the given span.
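
To narrow the problem down, here is a minimal sketch that puts the two approaches side by side for the same input. It assumes C# 11 or later (for the `u8` literal) and a little-endian machine; the local names (`reinterpreted`, `widened`, `decoded`) are made up for this illustration and do not appear in the code above. The code points in the garbled output (U+6F66, U+626F, U+7261) look like the bytes of 'foobar' taken two at a time, which this snippet makes visible:

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

ReadOnlySpan<byte> utf8 = "foobar"u8;

// Reinterpretation: the same memory viewed as 16-bit chars,
// so 6 bytes become 3 chars instead of 6.
ReadOnlySpan<char> reinterpreted = MemoryMarshal.Cast<byte, char>(utf8);

// Element-wise conversion: each byte is widened to its own char.
// This only matches real decoding for ASCII input.
Span<char> widened = stackalloc char[utf8.Length];
for (int i = 0; i < utf8.Length; i++)
{
    widened[i] = (char)utf8[i];
}

// Actual UTF-8 decoding, valid for any well-formed input.
string decoded = Encoding.UTF8.GetString(utf8);

Console.WriteLine(reinterpreted.Length);   // number of chars after the cast
Console.WriteLine(widened.Length);         // number of chars after widening
Console.WriteLine(decoded);

// Assumption: little-endian layout, where 'f' (0x66) and 'o' (0x6F)
// pair up as the single code unit 0x6F66.
Console.WriteLine(((int)reinterpreted[0]).ToString("X4"));
```

If the lengths differ (3 versus 6 on my understanding), that would suggest Cast<TFrom, TTo> regroups the underlying bytes rather than converting each element.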