I just tried to clean up an old German text containing the character 'ſ'
(U+017F). I wanted to replace it with 's'
, but when I used :%s/ſ/s/g
not only that character got replaced but also all occurrences of 's'
followed by an arbitrary character, as if I had used the command :%s/s./s/g
.
As an example, the text:
Die Gleichheit **) fordert das Nachdenken heraus durch Fragen, die ſich daran knüpfen und nicht ganz leicht zu beantworten ſind.
will be replaced by my command with:
Die Gleichheit **) fordert dasNachdenken herausdurch Fragen, die sich daran knüpfen und nicht ganz leicht zu beantworten sind.
I assume it might have something to do with the fact that 'ſ'
is represented in UTF-8 as a sequence of two bytes (0xC5 0xBF
). Isn't that a bug? If not, is there a way to just replace 'ſ'
and not also 's'
?
I am using fileencoding=utf-8
and:
> vim --versionVIM - Vi IMproved 9.1 (2024 Jan 02)Included patches: 1-151> echo $LANGde_DE.UTF-8
Here is a screenshot with :set hlsearch
:
UPDATE: I got hold of an installation of vim version 8.0 patches up to 586 on Windows 10, exhibiting the same behavior with both my and the \%u
versions of the command.