Friday, August 14, 2009

Common indexing mistakes

Here's a quick list of common mistakes that I've seen regarding indexing in MySQL.

1. Not using an index. Enable your slow query log (and consider setting long_query_time and log_queries_not_using_indexes too) and watch for queries that aren't using an index.

2. Using CREATE INDEX. While CREATE INDEX is not inherently bad, it only allows you to do one thing: add a single index to a table. It is mapped to an ALTER TABLE, so you might as well just use ALTER TABLE and then you have the benefit of being able to do multiple things in the same statement (e.g., add an index and remove an index or add 2 indexes).

3. Misusing a composite index. Take this example:
CREATE TABLE customer (
custid INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
fname VARCHAR(30),
lname VARCHAR(30),
INDEX (lname, fname)
);

The composite index on (lname, fname) can be used for searches that use lname and fname OR for searches only on lname. For example these queries can use that index:
SELECT * FROM customer WHERE fname = 'Sarah' AND lname = 'Sproehnle';
SELECT * FROM customer WHERE lname = 'Sproehnle'; -- uses the first part of the index
SELECT * FROM customer WHERE lname LIKE 'Spr%'; -- uses the first part of the index
SELECT fname FROM customer WHERE lname = 'Sproehnle'; -- uses it as a covering index
SELECT * FROM customer WHERE lname = 'Sproehnle' ORDER BY fname; -- uses the index and avoids filesort

However, the following query will not use the index:
SELECT * FROM customer WHERE fname = 'Sarah'

Another mistake is to add an additional index on lname. In most cases you wouldn't want that since the composite index can already be used for looks on lname.

4. Using an expression on a column. I often see developers write queries that use a function or other expression in their WHERE clause. If the expression is directly on the column, an index can't be used. For example, let's say I want customers who joined this year, I could do this:
SELECT * FROM customer WHERE year(customer_join_date) = 2009; -- Cannot use an index on customer_join_date

If I rewrote the query I could take advantage of an index on customer_join_date:
SELECT * FROM customer WHERE customer_join_date BETWEEN '2009-01-01' AND '2009-12-31';

MySQL does not have function-based indexes. There are ways to mimic these, but I'll leave that for another blog post.

5. Appending the primary key to an index on an InnoDB table. This applies to InnoDB, not MyISAM. InnoDB uses the primary key as a row pointer in your secondary indexes. So effectively an index on column lname is actually implemented like an index on (lname, custid)

4 comments:

  1. "MySQL does not have function-based indexes."

    I think worse than this, MySQL does not have that many functions which are optimizer and partition pruning aware.

    In other databases you should be able to say YEAR(customer_join_date) and the optimizer rewrites that to BETWEEN '2009-01-01' AND '2009-12-31'.

    Great to see you blogging ;) I'd never thought about the CREATE INDEX statement like that, but I will follow your advice.

    ReplyDelete
  2. Hi!

    nice post, what Morgan said: great to see you blogging. You're on my blog links :)

    Comment mode now:

    "The composite index on (lname, fname) can be used for searches that use lname and fname OR for searches only on lname."

    Not necessarily. If the table happened to be backed by the MEMORY engine, the index would have been a HASH index (unless explicitly specified as BTREE index, HASH is default for the MEMORY engine). HASH indexes cannot be utilized unless values are available for all of the indexed columns: only then can a hash value be computed to do the actual look up.

    Your statement does hold for all flavours of ordered indexes (BTREE, T-TREE)

    ReplyDelete
  3. Roland, Good point. I should have clarified that I was talking about "tree" indexes (b-tree, t-tree, binary tree, etc). Hash indexes (available in the Cluster and Memory engines) have their own intricacies that I may deal with in another blog post. Thanks!

    ReplyDelete
  4. Can you please anyone help me for resolving function based index using Mysql

    ReplyDelete