Character sets
Ideally, the character set is configured at the server level, as described in the Character Set Configuration section of the MySQL Server manual. Alternatively, each MySQL API offers a method to set the character set at runtime.
The character set and character escaping
The character set should be understood and defined, as it has an effect on every action and carries security implications. For example, the escaping mechanism (e.g., mysqli_real_escape_string() for mysqli, mysql_real_escape_string() for mysql, and PDO::quote() for PDO_MySQL) will adhere to this setting. It is important to realize that these functions will not use a character set that is defined with a query, so, for example, the following statements will not have an effect on them:
Example #1 Problems with setting the character set with SQL
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
// Will not affect $mysqli->real_escape_string();
$mysqli->query("SET NAMES utf8");
// Will not affect $mysqli->real_escape_string();
$mysqli->query("SET CHARACTER SET utf8");
// But, this will affect $mysqli->real_escape_string();
$mysqli->set_charset('utf8');
?>
Below are examples that demonstrate how to properly alter the character set at runtime using each API.
Example #2 Setting the character set example: mysqli
<?php
$mysqli = new mysqli("localhost", "my_user", "my_password", "world");
if (!$mysqli->set_charset('utf8')) {
    printf("Error loading character set utf8: %s\n", $mysqli->error);
} else {
    printf("Current character set: %s\n", $mysqli->character_set_name());
}
print_r( $mysqli->get_charset() );
?>
Example #3 Setting the character set example: pdo_mysql
Note: The charset parameter in the DSN only works as of PHP 5.3.6.
<?php
$pdo = new PDO("mysql:host=localhost;dbname=world;charset=utf8", 'my_user', 'my_pass');
?>
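To confirm that the charset given in the DSN actually took effect, the session variables can be read back from the server. The following is a minimal sketch, assuming PHP 7 or later and a reachable MySQL server; build_mysql_dsn() and connection_charsets() are hypothetical helper names introduced for illustration and are not part of PDO.

```php
<?php
// Hypothetical helper (not part of PDO): assembles a pdo_mysql DSN
// that carries the charset parameter.
function build_mysql_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

// Hypothetical helper: returns the session character set variables
// as reported by the server for the given connection.
function connection_charsets(PDO $pdo): array
{
    $sql = 'SELECT @@character_set_client AS cs_client, '
         . '@@character_set_connection AS cs_connection, '
         . '@@character_set_results AS cs_results';

    $stmt = $pdo->query($sql);
    if ($stmt === false) {
        // Only reached when PDO is not configured to throw exceptions.
        throw new RuntimeException('Unable to read the session character sets.');
    }

    return $stmt->fetch(PDO::FETCH_ASSOC);
}

// Usage, assuming the server and credentials of the pdo_mysql example above:
// $pdo = new PDO(build_mysql_dsn('localhost', 'world'), 'my_user', 'my_pass');
// print_r(connection_charsets($pdo));
```

All three variables should report the character set named in the DSN; if they do not, the charset parameter was most likely ignored or misspelled.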
Example #4 Setting the character set example: mysql
<?php
$conn = mysql_connect("localhost", "my_user", "my_pass");
$db = mysql_select_db("world");
if (!mysql_set_charset('utf8', $conn)) {
    echo "Error: Unable to set the character set.\n";
    exit;
}
echo 'Your current character set is: ' . mysql_client_encoding($conn);
?>
Comments
Please note that MySQL's utf8 character set uses at most 3 bytes per character and is unable to encode *all* Unicode characters.
If you need to encode characters beyond the BMP (Basic Multilingual Plane), such as emoji, you will need a different character set such as utf8mb4, or any other encoding that supports the higher planes. With the 3-byte utf8 character set, MySQL will not store characters that need 4 bytes in UTF-8; depending on the SQL mode, the statement fails or the value is truncated.
See https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-utf8mb4.html for more information on the matter
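As a rough illustration of the 3-byte limit, the sketch below checks in plain PHP whether a string contains characters outside the BMP. It assumes PHP 7 or later and UTF-8 input; needs_utf8mb4() is a hypothetical helper name, not a function of any MySQL API.

```php
<?php
// Hypothetical helper (not part of any MySQL API): reports whether $text
// contains a code point above U+FFFF, i.e. one that takes 4 bytes in UTF-8.
function needs_utf8mb4(string $text): bool
{
    // The /u modifier makes PCRE treat the pattern and subject as UTF-8.
    // preg_match() returns false for invalid UTF-8, which maps to false here.
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

$samples = [
    'latin' => "caf\u{E9}",   // contains U+00E9, 2 bytes in UTF-8
    'euro'  => "\u{20AC}",    // U+20AC, 3 bytes, still inside the BMP
    'emoji' => "\u{1F600}",   // U+1F600, 4 bytes, outside the BMP
];

foreach ($samples as $label => $text) {
    printf(
        "%-5s bytes=%d needs_utf8mb4=%s\n",
        $label,
        strlen($text),
        needs_utf8mb4($text) ? 'yes' : 'no'
    );
}
```

Only the emoji sample is flagged, since it is the only one that requires a 4-byte sequence. A check like this can help decide whether existing data would be affected before choosing between utf8 and utf8mb4.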
After setting the charset, you should set the collation too, as it determines how strings are compared and how results are sorted. In MySQL versions before 8.0, the default collation for utf8mb4 is 'utf8mb4_general_ci', which uses a simplified set of sorting rules. For the rules defined by the Unicode Collation Algorithm, use 'utf8mb4_unicode_ci' or a newer variant such as 'utf8mb4_unicode_520_ci'.
For example:
\mysqli_set_charset($hdl, 'utf8mb4');
\mysqli_query($hdl, 'SET collation_connection = utf8mb4_unicode_520_ci');