MySQL collation

Rumman Ansari   Software Engineer   2024-08-18 08:36:28   107  Share
Subject Syllabus DetailsSubject Details
☰ TContent
☰Fullscreen

In MySQL, collation refers to a set of rules that determine how strings are compared and sorted in the database. It defines how characters are treated when performing operations like ORDER BY, GROUP BY, or when comparing two strings.

Key Concepts:

  1. Character Set:

    • A character set is a set of symbols and encodings. For example, UTF-8 is a popular character set that includes most characters from all the languages in the world.
  2. Collation:

    • Collation is a set of rules for how to compare characters in a character set. Each character set can have one or more collations. The collation determines things like whether 'A' is considered equal to 'a', or how to sort characters in different languages.

Example:

  • utf8_general_ci: This is a collation for the UTF-8 character set. The "general" part means it’s a general-purpose collation, and "ci" stands for case-insensitive, meaning 'A' is considered equal to 'a'.
  • utf8_bin: This is another collation for UTF-8, but it’s case-sensitive and compares strings based on the binary value of each character.

Why Collation Matters:

  1. Sorting and Searching:

    • The collation determines how strings are sorted and compared. For example, in a case-insensitive collation, "apple" and "Apple" would be considered equal, while in a case-sensitive collation, they would not.
  2. Multi-Language Support:

    • Different languages have different rules for sorting and comparing characters. By selecting an appropriate collation, you can ensure that sorting and searching in your database respect the linguistic rules of the language you’re working with.
  3. Indexing:

    • Collation can affect the way indexes are built and how efficiently they work, especially in case-insensitive vs. case-sensitive contexts.

Setting Collation:

At the Database Level:

<span class="pln">
CREATE DATABASE mydb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci</span><span class="pun">;</span><span class="pln">
</span>

At the Table Level:

<span class="pln">
CREATE TABLE mytable </span><span class="pun">(</span><span class="pln">
    col1 VARCHAR</span><span class="pun">(</span><span class="lit">100</span><span class="pun">)</span><span class="pln">
</span><span class="pun">)</span><span class="pln"> CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci</span><span class="pun">;</span><span class="pln">
</span>

At the Column Level:

<span class="pln">
CREATE TABLE mytable </span><span class="pun">(</span><span class="pln">
    col1 VARCHAR</span><span class="pun">(</span><span class="lit">100</span><span class="pun">)</span><span class="pln"> CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci
</span><span class="pun">);</span><span class="pln">
</span>

Common Collations:

  • utf8_general_ci: General-purpose, case-insensitive collation for UTF-8.
  • utf8_bin: Binary collation for UTF-8, case-sensitive.
  • utf8_unicode_ci: Case-insensitive collation that sorts according to the Unicode standard, supporting multiple languages more accurately.

Changing Collation:

You can change the collation of an existing table or column:

  • Changing Table Collation:

  • <span class="pln">
    ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci</span><span class="pun">;</span><span class="pln">
    </span>
  • Changing Column Collation:
  • <span class="pln">
    ALTER TABLE mytable MODIFY col1 VARCHAR</span><span class="pun">(</span><span class="lit">100</span><span class="pun">)</span><span class="pln"> CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci</span><span class="pun">;</span><span class="pln">
    </span>

Conclusion:

Collation is essential for ensuring that string comparisons, sorting, and searching behave as expected in your database, especially when dealing with different languages and character sets.


No Questions Data Available.
No Program Data.

Stay Ahead of the Curve! Check out these trending topics and sharpen your skills.