Channel: Active questions tagged utf-8 - Stack Overflow

Fix UTF-8 Character Display in PHP *After* Upgrade from MySQL 5.7 to MySQL 8


We upgraded our RDS database on AWS from MySQL 5.7 to MySQL 8.

The server character set and the server connection collation are both UTF-8 Unicode: utf8mb4 and utf8mb4_unicode_ci, respectively.

But the actual database and the table/columns have the collation of latin1_swedish_ci.

The header on the PHP pages is set to UTF-8, and everything displayed correctly. Despite the mismatch, everything worked great for many years. And despite what you're probably thinking, I'd prefer to keep things working just as they have been. Like old plumbing, it's better not to touch it and set off a cascade of new problems.

Anyway, special characters and Emojis are stored in the database as latin1 representations such as ’ (fancy apostrophe) and 😎 (sunglasses emoji).
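To illustrate where those sequences come from, here is a minimal sketch in Python (used only because it makes the bytes easy to inspect; it is not part of the PHP/MySQL setup described here). It assumes that MySQL's latin1 behaves like Windows-1252 (cp1252) for these bytes, and the helper name `to_mojibake` is made up for the example.

```python
# Sketch: reproduce the garbled sequences by reading UTF-8 bytes as cp1252.
# Assumption: MySQL's "latin1" matches cp1252 for the bytes involved here.

def to_mojibake(text: str) -> str:
    """Encode text as UTF-8, then misread those bytes as cp1252."""
    return text.encode("utf-8").decode("cp1252")

apostrophe = "\u2019"      # right single quotation mark (fancy apostrophe)
sunglasses = "\U0001F60E"  # smiling face with sunglasses

for original in (apostrophe, sunglasses):
    raw = original.encode("utf-8")
    # Show the UTF-8 bytes and what they look like when read as cp1252.
    print(raw.hex(" "), "->", to_mojibake(original))
```

The apostrophe is three UTF-8 bytes and the emoji is four, which is why they show up as three and four latin1 characters.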

PHP still renders these characters correctly when the page is output as UTF-8, UNLESS the data is pulled from the database. Content from the database shows up as the raw latin1 characters, like the ’ or 😎 noted above. Attempts to convert the latin1 data to UTF-8 in PHP with mbstring or iconv, or to set the character set with a MySQLi command, did not work.

My questions:

  1. Is there a setting in phpMyAdmin that I can adjust so everything works like it did before? The issue seems to stem from the way the data is handed off from MySQL to PHP.

  2. If the answer to my first question is 'no', what do you think about this strategy:

First, change the collation for the database and database columns to utf8mb4_0900_ai_ci in phpMyAdmin. That collation is suggested here: https://dba.stackexchange.com/questions/76788/create-a-mysql-database-with-charset-utf-8

Then, run the following query to convert HTML entities and emojis from latin1 to UTF-8:

UPDATE posts SET post_content = CONVERT(CAST(CONVERT(post_content USING latin1) AS BINARY) USING utf8mb4);

When I run the MySQL query above on test data, it seems to work well. I'm concerned about updating millions of rows but there's a ray of hope with this approach.
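As a way to sanity-check what that query does to a value, the same round trip (text to latin1 bytes, bytes reinterpreted as UTF-8) can be sketched in Python. This is only an illustration under the assumption that MySQL's latin1 matches cp1252 for the affected bytes. The function name `repair_mojibake` and the fallback that leaves unconvertible strings untouched are additions for this sketch; the SQL statement itself has no such guard.

```python
# Sketch: mirror CONVERT(CAST(CONVERT(col USING latin1) AS BINARY) USING utf8mb4).
# Assumption: MySQL's "latin1" matches cp1252 for the bytes involved here.

def repair_mojibake(text: str) -> str:
    """Turn text back into cp1252 bytes and decode those bytes as UTF-8.

    If either step fails, the string was not UTF-8-read-as-cp1252,
    so it is returned unchanged (a guard added for this sketch).
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text

garbled = "It\u00e2\u20ac\u2122s \u00f0\u0178\u02dc\u017d"  # mojibake form
clean = "It\u2019s \U0001F60E"                              # intended text

print(repair_mojibake(garbled) == clean)  # garbled text is repaired
print(repair_mojibake(clean) == clean)    # already-correct text is left alone
```

Running a check like this over a sample of rows before the UPDATE may help spot values that would not survive the conversion.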

I am seeking a solution that fixes the old data (emojis and punctuation) and will continue to work in the future. Most solutions on SO only discuss ALTER commands to update the tables, versus correcting the existing data and then updating the database structure.

