Saturday, September 5, 2009

MySQL's query cache

MySQL's query cache is somewhat unique and can be very beneficial to certain applications, such as web apps that run the same queries over and over. If you're unfamiliar with the query cache, here are some key points:

  • The query cache holds SELECT statements and their result sets. By "SELECT statements" I mean the entire query is considered 1 entry in the cache. Subselects or queries that are part of a UNION are not cached individually.

  • This cache is optional. In fact it is off by default in most MySQL binaries. Although query_cache_type might be set to "ON", if query_cache_size=0 then the cache is effectively off.

  • This is not the only cache MySQL has. The storage engines often have lower level caches (e.g.,InnoDB caches data and index pages in its cache configured with setting innodb_buffer_pool_size and MyISAM uses key_buffer_size for caching index pages)

  • When enabled, the query cache is before the parser. This means that in order for your query to be a cache "hit", it must match a cached query exactly (e.g., it is case-sensitive and even white space matters).

  • The cache will never serve stale data (unless you are running multiple mysqlds pointing to the same data directory - in that case do not use the query cache). So queries need to be invalidated when data changes. In order to make this fast and easy, it's done on a table-by-table basis; when a table changes, all queries that used that table are removed.

  • Since the query cache is before the parser, you might wonder how authorization is done. When the query was originally parsed and inserted in the cache, MySQL figured out which table(s) were used by the query (which is useful for invalidating queries, see above). By knowing the tables involved, a user's privileges can be checked. However, if a user has any column-level privileges on those tables, the query cache is not useful. I tend to avoid column-level privileges (since 5.0 you can use views instead).

  • Do not set the query cache too large (I'd say no more than 256MB) because it becomes less efficient.

  • Monitor its effectiveness: Use SHOW GLOBAL STATUS to look at qcache_hits vs. com_select (query cache misses). Also, if you have Qcache_lowmen_prunes, you may want to increase the query_cache_size (though heed the earlier advice about not setting it too large). Lots of Qcache_inserts may be an indicator that queries are being invalidated and re-cached frequently.

  • MySQL Cluster can use the query cache, however, each MySQL server has its own query cache. Queries will be invalidated if data in the cluster changes. To incur less overhead, you can configure ndb_cache_check_time which tell NDB how often to check the cache for invalidation.