B-Tree Index in MySql
Posted by Vikas Gupta on December 17, 2012
Indexes in MySql works like an index in a book. While, indexes in a book tell you about the pages on which a term occurs, indexes in MySql tell you the rows that contain the matching data. An index contains values from one or more columns in a table. If you index more than one column, the column order is very important, because MySQL can only search efficiently on a leftmost prefix of the index. Creating an index on two columns is not the same as creating two separate single-column indexes. In this post, I will discuss about B-Tree index and it’s working in MySql.
This is the default index for most storage engines in MySql. The general idea of a B-Tree is that all the values are stored in order, and each leaf page is the same distance from the root.
A B-Tree index speeds up data access because the storage engine doesn’t have to scan the whole table to find the desired data. Instead, it starts at the root node. The slots in the root node hold pointers to child nodes, and the storage engine follows these pointers. It finds the right pointer by looking at the values in the node pages, which define the upper and lower bounds of the values in the child nodes. Eventually, the storage engine either determines that the desired value doesn’t exist or successfully reaches a leaf page. Leaf pages are special, because they have pointers to the indexed data instead of pointers to other pages.
Let’s take an example and understand the queries which can benefit from an B-Tree Index. Suppose you have the following table:
CREATE TABLE People (
last_name varchar(50) not null,
first_name varchar(50) not null,
dob date not null,
gender enum(‘m’, ‘f’)not null,
key(last_name, first_name, dob)
For the above table, the index will contain the values from the last_name, first_name, and dob columns for every row in the table. The index will sort the values according to the order of the columns given above. So, if there are people with same name but different birth dates, they will be sorted by birth date.
Now, that we have an understanding of how B-tree works, let’s discuss the types of queries which will benefit from B-tree indexes.
- Match the full value: A match on the full key value specifies values for all columns in the index. For example, this index can help you find a person named Vikas Gupta who was born on 13/11/1979.
- Match a leftmost prefix: This index can help you find all people with the last name. This uses only the first column in the index.
- Match a column prefix: Can help you find all people whose last name begin with A. This uses only the first column in the index.
- Match a range of values: Can find everyone whose last name is Gupta and Garg.
- Match one part exactly and match a range on another part: Find someone who last name is Gupta and first name starts with K.
However, there are some limitations of B-Tree indexes. A B-Tree won’t help, if:
- Lookup does not start from the leftmost side of the indexed columns. For example, the above index won’t help if you find people name Vikas or people born on a certain date, because those column are not leftmost in the index. Likewise, the above index cannot be used to find people whose last name end with something.
- any of the column defined in the index is skipped. That is, you cannot used last name and date of birth to find someone, as in this case, first name is missed.
- any condition is used after a range condition. For example, in a query with where condition is lastname = ‘gupta’ and firstname like ‘V%’ and dob = ’13-11-1979′, the index access will use only the first two columns in the index, because LIKE is a range condition.
In this post, I discussed about working on B-Tree in detail. In the coming posts, I will explain other indexes supported by MySql.