I am trying to build a template for Perl scripts so that they do at least most of the basic things right with UTF-8 and work equally well on Linux and Windows machines.
One thing in particular escaped me for a while: passing UTF-8 strings as arguments to system commands. It seems to me that there is no way to avoid having the arguments double-UTF-8-encoded before they reach the shell. That is, as I understand it, there is a layer that ignores the fact that the command and its arguments are already properly UTF-8 encoded, takes them for Latin-1 or something of the sort, and encodes them as UTF-8 again. I could not find a way to cleanly avoid this extra layer of encoding.
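One workaround I have considered (a minimal sketch, Linux-oriented; the file name is made up) is to do the byte encoding myself with Encode and use the list form of system(), so that no shell and no implicit layer gets a chance to re-encode anything:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

# Encode the argument to raw UTF-8 bytes explicitly, then use the
# list form of system() so the bytes reach the command untouched.
my $name = "тест";
my @cmd  = ('touch', encode('UTF-8', "system-$name"));
system(@cmd) == 0 or die "touch failed: $?";
```

This avoids the double encoding on Linux, but I do not know whether it helps on Windows, where the console code page complicates things further.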
Take this script:
#!/usr/bin/perl
use v5.14;
use utf8;
use feature 'unicode_strings';
use feature 'fc';
use open ':std', ':encoding(UTF-8)';
use strict;
use warnings;
use warnings FATAL => 'utf8';

use constant IS_WINDOWS => $^O eq 'MSWin32';

# Set proper locale
$ENV{'LC_ALL'} = 'C.UTF-8';

# Set UTF-8 code page on Windows
if (IS_WINDOWS) {
    system("chcp 65001 > nul 2>&1");
}

# Use Win32::Unicode::Process on Windows
if (IS_WINDOWS) {
    eval {
        require Win32::Unicode::Process;
        Win32::Unicode::Process->import;
    };
    if ($@) {
        die "Could not load Win32::Unicode::Process: $@";
    }
}

# Show the empty directory
print "---\n" . `ls -1 system*` . "---\n";

my $utf = "test-тест-מבחן-परीक्षण-😊-𝓽𝓮𝓼𝓽";

# Works fine on Linux but not on Windows
print "System (touch) exit code: " . system("touch system-$utf > touch-system.txt 2>&1") . "\n";
print "System (echo) exit code: " . system("echo system-$utf > echo-system.txt 2>&1") . "\n";

if (IS_WINDOWS) {
    # Works fine on Windows
    print "SystemW (touch) exit code: " . systemW("touch systemW-$utf > touch-systemW.txt 2>&1") . "\n";
    print "SystemW (echo) exit code: " . systemW("echo systemW-$utf > echo-systemW.txt 2>&1") . "\n";
}

# Show the directory with the new files
print "---\n" . `ls -1 system*` . "---\n";

exit;
On Linux, everything is fine: the file created with touch through system() has a UTF-8 encoded filename, and the content of the file created with echo is correctly UTF-8 encoded.
Yet, I found no way to get the same code to behave correctly on Windows. There, the output of the script is this:
---
---
System (touch) exit code: 0
System (echo) exit code: 0
SystemW (touch) exit code: 
SystemW (echo) exit code: 
---
system-test-теÑÑ‚-מבחן-परीकà¥à¤·à¤£-😊-ð“½ð“®ð“¼ð“½
systemW-test-тест-מבחן-परीक्षण-😊-𝓽𝓮𝓼𝓽
---
As the script shows, the only way I could make it work was to replace system() with Win32::Unicode::Process::systemW(). The file systemW-test-тест-מבחן-परीक्षण-😊-𝓽𝓮𝓼𝓽 is correctly named, and the content of echo-systemW.txt is correctly encoded in UTF-8.
My questions are these:

1. Is there a way to avoid using systemW() and keep the code identical for Linux and Windows, by somehow removing the layer that double-encodes the system command? In other words, is systemW() the only good way to go?

2. If this is the right way, I am not sure how to obtain similarly correct behaviour for backticks. They have the same problem as system(), but I have no idea how to capture the output of a command with systemW(), aside from piping it into a temporary file and reading that at the end (possible, of course, but maybe not great).
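For the backticks half of the question, the closest thing I know of (on Linux, at least; I do not know how it interacts with systemW on Windows) is a piped open with an explicit UTF-8 decode layer, which at least avoids the temporary file. A minimal sketch, using echo as a stand-in command:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use Encode qw(encode);

binmode STDOUT, ':encoding(UTF-8)';

# Capture a command's output through a pipe with an explicit UTF-8
# decode layer, instead of backticks. The list form bypasses the shell,
# and the argument is pre-encoded to bytes by hand.
open(my $fh, '-|:encoding(UTF-8)', 'echo', encode('UTF-8', 'тест'))
    or die "cannot start command: $!";
my @lines = <$fh>;
close($fh);

print $lines[0];
```

Whether this pattern can be combined with the Win32::Unicode::Process approach on Windows is exactly what I am unsure about.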