Channel: Active questions tagged utf-8 - Stack Overflow

How does Python determine encodings under different circumstances? [duplicate]

I have a Python script that runs on a Windows machine and looks like this:

import sys

print(f"Current encoding: {sys.stdout.encoding}")
print(some_randomized_chars)  # a string of random characters built elsewhere in the script

I run it in different ways using PyCharm, and I am confused by how it behaves with respect to encodings. When I run it "normally", i.e., with PyCharm's "Run" or by going to PyCharm's terminal (PowerShell) and running

python producer.py

I get

Current encoding: utf-8

and everything gets printed to the terminal, including some "gibberish" characters on occasion, which is fine. However, when I redirect the output to a file with

python producer.py > test.txt

it creates a file whose first line is

Current encoding: cp1252

and sometimes, after some successful writes, I get this error in the terminal:

  File "AppData\Local\Programs\Python\Python311\Lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\x9e' in position 6: character maps to <undefined>
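For reference, the error can be reproduced without any redirection. This is a minimal sketch, assuming the failure comes only from the codec: it pushes a string containing '\x9e' (U+009E, a C1 control character) through an in-memory text stream, much as a redirected stdout would. The helper name write_with and the sample string are hypothetical and not part of producer.py.

```python
import io


def write_with(encoding, text, errors="strict"):
    """Write text through a TextIOWrapper using the given codec and return the bytes."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="")
    stream.write(text)
    stream.flush()
    return raw.getvalue()


# Hypothetical sample; U+009E has no mapping in Python's cp1252 codec.
sample = "ok \x9e"

# UTF-8 can represent every code point, so this succeeds.
utf8_bytes = write_with("utf-8", sample)

# The strict cp1252 codec raises the same 'charmap' error as in the traceback.
try:
    write_with("cp1252", sample)
    cp1252_failed = False
except UnicodeEncodeError:
    cp1252_failed = True

# With errors="replace", unencodable characters become "?" instead of raising.
replaced = write_with("cp1252", sample, errors="replace")
```

Because the characters are randomized, this would also be consistent with the crash happening only on some runs: it occurs only when one of the few code points that cp1252 cannot encode comes up.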

Lastly, I would expect the created test.txt to be encoded as cp1252 (ANSI Latin), but when I open it in a text editor, the editor reports it as UTF-16 LE.
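One way to check what is actually on disk, rather than trusting the editor's guess, is to look at the first bytes of the file for a byte order mark. The sketch below is only illustrative: sniff_bom and sniff_file are hypothetical helper names, and a file without a BOM simply yields None. If test.txt does start with FF FE, one possible explanation (an assumption about this setup, not something the output above confirms) is that the shell re-encoded the text, since the > operator in Windows PowerShell 5.1 writes UTF-16 LE by default.

```python
import codecs

# Longer BOMs come first: the UTF-32 LE BOM starts with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path):
    """Read the first four bytes of a file and report its BOM, if any."""
    with open(path, "rb") as fh:
        return sniff_bom(fh.read(4))
```

Calling sniff_file("test.txt") after the redirected run would then show whether the file carries a UTF-16 LE marker.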

So my confusion comes down to these questions:
- Why does Python print "utf-8" as the current encoding when running normally?
- Why does it then switch to "cp1252" when running with redirection, and why does it sometimes crash?
- If the encoding in the latter case is ANSI, why does the file itself show as UTF-16 LE?
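To compare the two situations side by side, a small diagnostic like the following could be added to the script. It is a sketch, assuming the difference depends on whether stdout is attached to a console and on the locale's preferred encoding; describe_stream is a hypothetical helper name. The report goes to stderr so that it stays visible when stdout is redirected.

```python
import locale
import sys


def describe_stream(stream=None):
    """Collect the facts that influence how a text stream encodes its output."""
    if stream is None:
        stream = sys.stdout
    return {
        "encoding": stream.encoding,          # codec used when writing text
        "errors": stream.errors,              # error handler, e.g. "strict"
        "isatty": stream.isatty(),            # True when attached to a console
        "preferred": locale.getpreferredencoding(False),  # locale default
    }


if __name__ == "__main__":
    # stderr is not affected by "> test.txt", so the report is still shown.
    print(describe_stream(), file=sys.stderr)
```

Running it once normally and once with > test.txt should show whether "isatty" changes along with "encoding".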

Thank you
