When run over SSH, R appears to treat non-ASCII characters differently depending on the OS of the SSH client.
For example, if I use a computer running macOS (14.6.1) to start an R session on an Ubuntu machine (22.04.5), and run:
units::set_units(12.7, "\U00B5m")
I get:
12.7 [µm]
But the same expression, run on the same server but from a Windows client (10.0.19045.4170), produces:
Error: In '<U+00B5>m', '<U+00B5>m' is not recognized by udunits.
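For reference, the locale each R session runs under can be inspected with something like the following (the comments are only examples of what might appear; I haven't reproduced the actual output here):
Sys.getlocale("LC_CTYPE")                    # e.g. "en_US.UTF-8" vs. "C"/"POSIX"
l10n_info()                                  # $`UTF-8` reports whether the current locale is UTF-8
Sys.getenv(c("LANG", "LC_ALL", "LC_CTYPE"))  # locale environment seen by the server-side R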
I thought that this could have to do with how the command line on each OS sends the character representations over SSH. However, if I save the following script on the server (written using vim over SSH from the macOS machine):
#!/bin/Rscript
print(nchar("µm"))
And execute it over SSH from the macOS client (e.g., ssh <user>@<host> "./print_micron.R"), I get:
[1] 2
i.e., "µ"
is a single two-byte character. But if I execute it from the Windows client, I get:
[1] 3
i.e., "µ"
becomes two separate characters, one for each byte.
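For what it's worth, the two results match the difference between counting characters and counting raw bytes; a minimal illustration, using the \u00b5 escape in place of the literal µ:
x <- "\u00b5m"            # micro sign followed by "m"
charToRaw(x)              # c2 b5 6d: µ is the two bytes c2 b5 in UTF-8
nchar(x, type = "chars")  # 2: µ and m, when R knows the string is UTF-8
nchar(x, type = "bytes")  # 3: the total byte count, regardless of locale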
This challenges my intuition of how executing commands over SSH works, as I would expect R's behavior to be determined entirely by the server. Why would the client OS affect how non-ASCII characters are represented in R?