Open main menu

Changes

→‎Changing Character Sets and Collations: update config example; remove duplicate
===Changing Character Sets and Collations===
Character sets and collations can be set from the server level right down to the column level, as well as for client-server communication. For example, the <code>[https://mariadb.com/kb/en/server-system-variables/#collation_connection collation_connection]</code> '''server system variable'''. There is another KB page for [https://mariadb.com/kb/en/setting-character-sets-and-collations/ Setting Character Sets and Collations]. Keep in mind that there a many possible collations per character set, but only one default per character set. So, if you specify a character set, but are silent on the collation, you will receive the default. Character sets and collations always cascade down, so a column without a specified collation will look for the table default, the table for the database, and the database for the server. It's therefore possible to have extremely fine-grained control over all the character sets and collations used in your data. Collation names always start with the character set they correspond to (, e.g. '''latin1 ''' charset and latin1_bin '''latin1'''_bin collation(the bin suffix stands for 'binary').
 
If your collation and character sets are all <code>latin1_swedish</code>, but you want them to be <code>UTF-8</code>, then you can set it in the configuration file, and restart the database.
<source lang="mysql">
show GLOBAL VARIABLES LIKE 'character_set_%';
show VARIABLES LIKE 'collation%';
</source><pre># Variable_name, Value'collation_connection', 'utf8_general_ci''collation_database', 'latin1_swedish_ci''collation_server', 'latin1_swedish_ci'</pre> This is the mess that I wound up with somehow:
<source lang="mysql">
SET @DATABASE_NAME = 'wiki_freephile';
</source>
To set the mysql server default character set to utf8 binary <ref>https://mariadb.com/kb/en/library/setting-character-sets-and-collations/ https://scottlinux.com/2017/03/04/mysql-mariadb-set-character-set-and-collation-to-utf8/</ref>, add the following to <code>/etc/mysql/my.cnf</code> (Debian/Ubuntu) or <code>/etc/my.cnf</code> (Fedora/Centos/RHEL) under the '''<nowiki>[mysqld]</nowiki>''' section:
<pre>
# character set and collationcharacter-set-server = utf8binarycollation-server = utf8_general_ciBINARY# ignore client side character set value sent during handshakeskip-character-set-client-handshake= 1
</pre>
Then restart mysql/mariadb:
== Understand Encoding and Collation ==
http://kunststube.net/encoding/
== 2016 problems with MediaWiki ==
Trying to Special:Import an XML dump from Special:Export of Commons, I get an "illegal mix of collations" error.
{| class="wikitable"
| Schema: || mediawiki
|-
| Default collation: || latin1_swedish_ci
|-
| Default characterset: || latin1
|}
I also have <code>$wgDBmysql5 = true;</code> in LocalSettings.php <ref>https://www.mediawiki.org/wiki/Manual:$wgDBmysql5</ref>
== Default MediaWiki SQL schema setup ==
When MediaWiki is first installed, [https://phabricator.wikimedia.org/source/mediawiki/browse/master/maintenance/tables.sql this is the SQL that is run].
</source>
== DB engine Conversion ==
The table type has nothing to do with Collation, but while updating my database, I wanted to switch the engine from MyISAM to InnoDB which is now the default.
So then to change the charset/collation:
 * UTF8 is a very bad choice for that. * If you ever need to put the DB data into something Unicode, then be sure to use the utf8mb4 (Multi-byte 4) variety.
The *recommended* charset/collation is 'binary BINARY'.
</source>
== Table or column collation ==
You can go further than setting the database, or table defaults; you can look at collation and character sets for individual columns. Individual columns probably don't need to be (and shouldn't be) converted to a new character set or collation unless you are certain you know what you're doing. MySQL has this [https://dev.mysql.com/doc/refman/5.5/en/charset-conversion.html reference on charset conversion]
{| class="wikitable"
|-
| TABLE_SCHEMA || count || COLLATION_NAME
|-
| wiki_demo || 0 || NULL
|-
| wiki_demo || 36 || latin1_bin
|-
| wiki_demo || 37 || latin1_swedish_ci
|-
| wiki_demo || 3 || utf8_general_ci
|}
[[Category:QualityBox]]
[[Category:Wiki]]
<references />