Channel: Active questions tagged utf-8 - Stack Overflow

Why does R treat non-ASCII characters differently depending on the SSH client's OS?

When run over SSH, R appears to treat non-ASCII characters differently depending on the OS of the SSH client.

For example, if I use a computer running macOS (14.6.1) to start an R session on an Ubuntu machine (22.04.5), and run:

units::set_units(12.7, "\U00B5m")

I get:

12.7 [µm]

But the same expression, run on the same server but from a Windows client (10.0.19045.4170), produces:

Error: In '<U+00B5>m', '<U+00B5>m' is not recognized by udunits.

I thought this could have to do with how each client OS's terminal sends the character's bytes over SSH. However, if I save the following script on the server (written using vim over SSH from the macOS machine):

#!/bin/Rscript
print(nchar("µm"))

And execute it over SSH from the macOS client (e.g., ssh <user>@<host> "./print_micron.R"), I get:

[1] 2

i.e., "µ" is a single two-byte character. But if I execute it from the Windows client, I get:

[1] 3

i.e., "µ" becomes two separate characters, one for each byte.
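The byte arithmetic can be checked outside R. A minimal sketch in Python (an illustration of the encoding, not part of the question's R code):

```python
# "µ" (U+00B5) occupies two bytes in UTF-8, so "µm" is 2 characters
# but 3 bytes. A byte-wise interpretation therefore reports length 3.
s = "µm"
encoded = s.encode("utf-8")
print(len(s))         # 2 characters
print(len(encoded))   # 3 bytes
print(encoded.hex())  # c2b56d -> 0xC2 0xB5 for "µ", 0x6D for "m"
```

R's `nchar()` reporting 3 is consistent with it counting each of those bytes as a separate character.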

This challenges my intuition of how executing commands over SSH works, as I would expect R's behavior to be determined entirely by the server. Why would the client OS affect how R interprets non-ASCII characters?
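A plausible cause (my assumption, not confirmed in the question) is that the macOS OpenSSH client forwards its UTF-8 locale variables to the server (macOS ships an `ssh_config` with `SendEnv LANG LC_*`), while the session started from the Windows client ends up in the C/POSIX locale, where R falls back to byte-wise character handling. One way to check is to compare what `locale` reports in the session each client starts:

```shell
# Run inside the SSH session started from each client.
locale
# If the macOS-initiated session reports something like LC_CTYPE="en_US.UTF-8"
# while the Windows-initiated one reports LC_CTYPE="POSIX" (or "C"),
# the clients' forwarded environments would explain the difference.
```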

