I've run into a problem using MySQL on Docker. When I insert non-ASCII characters into the database from the initialization SQL script, they are displayed correctly in the MySQL console, but the bytes actually stored are wrong.
I built a MySQL container with a minimal SQL script to reproduce the problem.
Here's the structure of my directory:
.
├── docker-compose.yml
└── my-sql
    ├── Dockerfile
    └── ddl
        └── mySQL.sql
docker-compose.yml
version: '3.8'
services:
  mysql:
    build:
      context: ./my-sql/
      dockerfile: Dockerfile
    container_name: mysql
    expose:
      - 3306
    ports:
      - 3306:3306
    environment:
      MYSQL_ROOT_PASSWORD: test
      MYSQL_USER: test
      MYSQL_PASSWORD: test
      MYSQL_DATABASE: test
    volumes:
      - "mysql:/var/lib/mysql"
volumes:
  mysql:
Dockerfile
FROM mysql:8.2.0
USER 999:999
COPY ./ddl /docker-entrypoint-initdb.d/
mySQL.sql
CREATE TABLE IF NOT EXISTS `test`(
  `test` VARCHAR(100)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `test` (`test`) VALUES ("🤮");
INSERT INTO `test` (`test`) VALUES ("平");
ALTER DATABASE test CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; # Does not work
When I connect with the MySQL console and select the test column of the test table, I get this:
mysql> SELECT `test` FROM `test`;
+------+
| test |
+------+
| 🤮 |
| 平 |
+------+
mysql> SELECT HEX(`test`) FROM `test`;
+------------------+
| HEX(`test`)      |
+------------------+
| C3B0C5B8C2A4C2AE |
| C3A5C2B9C2B3     |
+------------------+
I looked up how these characters are encoded in various character sets, and none of the encodings I found match these byte sequences. As far as I know, a UTF-8 character is at most 4 bytes long, yet the emoji is stored as 8 bytes. I also noticed that the "|" alignment in MySQL's output is off, and the offset grows with the length of the character's hexadecimal encoding.
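To make the mismatch concrete, here is a small Python sketch that compares the stored bytes with the expected UTF-8 encoding. It also tests one hypothesis that I have not confirmed: that the UTF-8 bytes were read as a single-byte code page and then encoded to UTF-8 a second time. The helper name `undo_double_encoding` and the choice of `cp1252` are my own guesses, not something MySQL reports.

```python
# Hex strings copied from the SELECT HEX(`test`) output in this post.
STORED = {
    "🤮": bytes.fromhex("C3B0C5B8C2A4C2AE"),
    "平": bytes.fromhex("C3A5C2B9C2B3"),
}


def undo_double_encoding(raw, codepage="cp1252"):
    """Decode raw as UTF-8, then map each character back to a single byte.

    The code page is an assumption; cp1252 is only a guess on my part.
    """
    return raw.decode("utf-8").encode(codepage)


for char, raw in STORED.items():
    expected = char.encode("utf-8")
    recovered = undo_double_encoding(raw)
    print(
        char,
        "| stored:", raw.hex().upper(), f"({len(raw)} bytes)",
        "| expected UTF-8:", expected.hex().upper(), f"({len(expected)} bytes)",
        "| round trip matches:", recovered == expected,
    )
```

If the round trip matches for both rows, that would suggest the data is being converted from a single-byte character set somewhere between the script and the table, rather than the file itself being wrong.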
I looked at the raw bytes of the .sql script in a hex view in VS Code and, at least, the emoji is correctly encoded as UTF-8 (F0 9F A4 AE).
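The same check can be done without an editor. This is a minimal sketch; `non_ascii_hex` is a hypothetical helper of mine, and the sample line stands in for the real file so the snippet runs on its own.

```python
def non_ascii_hex(data):
    """Return each run of non-ASCII bytes in data as a spaced hex string."""
    runs, current = [], []
    for b in data:
        if b >= 0x80:
            current.append(f"{b:02X}")
        elif current:
            runs.append(" ".join(current))
            current = []
    if current:
        runs.append(" ".join(current))
    return runs


# Sample equivalent to one INSERT line of mySQL.sql, encoded as UTF-8.
# To inspect the real script, use: open("my-sql/ddl/mySQL.sql", "rb").read()
sample = 'INSERT INTO `test` (`test`) VALUES ("🤮");'.encode("utf-8")
print(non_ascii_hex(sample))  # ['F0 9F A4 AE']
```

Run against the actual script, this should list one byte run per non-ASCII character, which makes it easy to compare with the HEX() output from MySQL.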
I also tried another MySQL version (8.0.36), but the result is the same.
Thanks in advance