Home > Articles

Query Optimization

This chapter is from the book

This chapter is from the book

Optimization for Administrators

The previous sections describe optimizations that can be performed by unprivileged MySQL users. Administrators who have control of the MySQL server or the machine on which it runs can perform additional optimizations. For example, some server parameters pertain to query processing and may be tuned, and certain hardware configuration factors have a direct effect on query processing speed. In many cases, these optimizations improve the performance of the server as a whole, and thus have a beneficial effect for all MySQL users.

In general, you should keep the following principles in mind when performing administrative optimizations:

  • Accessing data in memory is faster than accessing data from disk.

  • Keeping data in memory as long as possible reduces disk activity.

  • Retaining information from an index is more important than retaining contents of data records.

Specific ways you can apply these principles are discussed next.

Increase the size of the server's caches. The server has many parameters (system variables) that you can change to affect its operation. Several of these directly affect the speed of query processing. The most important parameters you can change are the sizes of the table cache and the caches used by the storage engines to buffer information for indexing operations. If you have memory available, allocating it to the server's caches allows information to be held in memory longer and reduces disk activity. This is good, because it's much faster to access information from memory than to read it from disk.

  • When the server opens table files, it tries to keep them open so as to minimize the number of file-opening operations. To do this, it maintains information about open files in the table cache. The table_cache system variable controls the size of this cache. If the server accesses lots of tables, the table cache fills up and the server closes tables that haven't been used for a while to make room for opening new tables. You can assess how effective the table cache is by checking the Opened_tables status indicator:

  • SHOW STATUS LIKE 'Opened_tables';

    Opened_tables indicates the number of times a table had to be opened because it wasn't already open. (This value is also displayed as the Opens value in the output of the mysqladmin status command.) If the number remains stable or increases slowly, it's probably set to about the right value. If the number grows quickly, it means the cache is too small and that tables often have to be closed to make room to open other tables. If you have file descriptors available, increasing the table cache size will reduce the number of table opening operations.

  • The key buffer is used by the MyISAM storage engine to hold index blocks for index-related operations. Its size is controlled by the key_buffer_size system variable. Larger values allow MySQL to hold more index blocks in memory at once, which increases the likelihood of finding key values in memory without having to read a new block from disk. The default size of the key buffer is 8MB. If you have lots of memory, that's a very conservative value and you should be able to increase it substantially and see a considerable improvement in performance for index-based retrievals and for index creation and modification operations.

  • In MySQL 4.1 and up, you can create additional key caches for MyISAM tables and assign specific tables to them. This can help query processing for those tables, as explained in "Using MyISAM Key Caches."

  • The InnoDB and BDB engines have their own caches for buffering data and index values. The sizes are controlled by the innodb_buffer_pool_size and bdb_cache_size variables. The InnoDB engines also maintains a log buffer. The innodb_log_buffer_size variable controls its size.

  • Another special cache is the query cache, described later in "Using the Query Cache."

Instructions for setting system variables may be found in Chapter 11, "General MySQL Administration." When you change parameter values, follow these guidelines:

  • Change one parameter at a time. Otherwise, you're varying multiple independent variables and it becomes more difficult to assess the effect of each change.

  • Increase system variable values incrementally. If you increase a variable by a huge amount on the theory that more is always better, you may run your system out of resources, causing it to thrash or slow to a crawl because you've set the value too high.

  • Rather than experimenting with parameter tuning on your production MySQL server, it might be prudent to set up a separate test server.

  • To get an idea of the kinds of parameter variables that are likely to be appropriate for your system, take a look at the my-small.cnf, my-medium.cnf, my-large.cnf, and my-huge.cnf option files included with MySQL distributions. (On Unix, you can find them under the support-files directory in source distributions and under the share directory in binary distributions. On Windows, they are located in the base installation directory, and the filename suffix might be .ini.) These files will give you some idea of which parameters are best to change for servers that receive different levels of use, and also some representative values to use for those parameters.

Other strategies you can adopt to help the server operate more efficiently include the following:

Disable storage engines that you don't need. The server won't allocate any memory for disabled engines, allowing you to devote it to other uses. Most storage engines can be excluded from the server binary at configuration time if you build MySQL from source. For those engines that are included in the server, many can be disabled at runtime with the appropriate startup options. See "Selecting Storage Engines," in Chapter 11, "General MySQL Administration" for details.

Keep grant table permissions simple. Although the server caches grant table contents in memory, if you have any rows in the tables_priv or columns_priv tables, the server must check table- and column-level privileges for every query. If those tables are empty, the server can optimize its privilege checking to skip those levels.

If you build MySQL from source, configure it to use static libraries rather than shared libraries. Dynamic binaries that use shared libraries save on disk space, but static binaries are faster. However, some systems require dynamic linking if you use the user-defined function (UDF) mechanism. On such systems, static binaries will not work.

Using MyISAM Key Caches

When MySQL executes a statement that uses indexes from MyISAM tables, it uses a key cache to hold index values. The cache allows disk I/O to be reduced: If key values needed from a table are found in the cache, they need not be read from disk again. Unfortunately, the key cache is a finite resource and it is shared among all MyISAM tables by default. If key values are not found in the cache and the cache is full, contention results: Some values currently in the cache must be discarded to make room for new values. The next time the discarded values are needed, they must be read from disk again.

If you have an especially heavily used MyISAM table, it would be nice to ensure that its keys remain in memory, but contention in the cache works against this. Contention can arise either when keys need to be read from the same table, or from other tables. You might avoid same-table contention by making the key cache large enough to hold all of a given table's indexes completely, but keys from other tables still could contend for space in the cache.

MySQL 4.1 and up offers a solution to this problem because it supports setting up multiple key caches and allows a table's indexes to be assigned to and preloaded into a given cache. This can be useful if you have a table that sees especially heavy use and you have sufficient memory to load its indexes into the cache. This capability enables you to avoid both same-table and other-table contention: Create a cache that is large enough to hold a table's indexes completely and devote the cache exclusively to the use of that table. No disk I/O is necessary after the keys have been loaded into the cache. Also, key values will never need to be discarded from the cache and key lookups for the table can be done in memory.

The following example shows how to set up a key cache for the member table in the sampdb database, using a cache with a name of member_cache and a size of 1MB. You must have the SUPER privilege to carry out these instructions.

  1. Set up a separate key cache large enough to hold the indexes from the table:

  2. mysql> SET GLOBAL member_cache.key_buffer_size = 1024*1024;
  3. Assign the table to the key cache:

  4. mysql> CACHE INDEX member IN member_cache;
    +---------------+--------------------+----------+----------+
    | Table         | Op                 | Msg_type | Msg_text |
    +---------------+--------------------+----------+----------+
    | sampdb.member | assign_to_keycache | status   | OK       |
    +---------------+--------------------+----------+----------+
  5. Preload the table's indexes into its key cache:

  6. mysql> LOAD INDEX INTO CACHE member;
    +---------------+--------------+----------+----------+
    | Table         | Op           | Msg_type | Msg_text |
    +---------------+--------------+----------+----------+
    | sampdb.member | preload_keys | status   | OK       |
    +---------------+--------------+----------+----------+

If you want to load other tables into the same cache or create other key caches for other tables, that can be done as well. For more information about key caches, consult Chapter 11.

Using the Query Cache

The MySQL server can use a query cache to speed up processing of SELECT statements that are executed repeatedly. The resulting performance improvement often is dramatic. The query cache works as follows:

  • The first time a given SELECT statement is executed, the server remembers the text of the query and the results that it returns.

  • The next time the server sees that statement, it doesn't bother to execute it again. Instead, the server pulls the result directly from the query cache and returns it to the client.

  • Query caching is based on the literal text of query strings as they are received by the server. Queries are considered the same if the text of the queries is exactly the same. Queries are considered different if they differ in lettercase or come from clients that are using different character sets or communication protocols. They also are considered different if they are otherwise identical but do not actually refer to the same tables (for example, if they refer to identically named tables in different databases).

  • When a table is updated, any cached queries that refer to it become invalid and are discarded. This prevents the server from returning out-of-date results.

Support for the query cache is built in by default. If you don't want to use the cache, and want to avoid incurring even the minimal overhead that it involves, you can build the server without it by running the configure script with the --without-query-cache option.

To determine whether a server supports the query cache, check the value of the have_query_cache system variable:

mysql> SHOW VARIABLES LIKE 'have_query_cache';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| have_query_cache | YES   |
+------------------+-------+

For servers that do include query cache support, cache operation is based on the values of three system variables:

  • query_cache_type determines the operating mode of the query cache. The following table shows the possible mode values:

  • Mode

    Meaning

    0

    Don't cache query results or retrieve cached results.

    1

    Cache queries except those that begin with SELECT SQL_NO_CACHE.

    2

    Cache on demand only those queries that begin with SELECT SQL_CACHE.


  • query_cache_size determines the amount of memory to allocate for the cache, in bytes.

  • query_cache_limit sets the maximum result set size that will be cached; query results larger than this value are never cached.

For example, to enable the query cache and allocate 16MB of memory for it, use the following settings in an option file:

[mysqld]
query_cache_type=1
query_cache_size=16M

The amount of memory indicated by query_cache_size is allocated even if query_cache_type is zero. To avoid wasting memory, don't set the size greater than zero unless you plan to enable the cache. Also, a size of zero effectively disables the cache even if query_cache_type is non-zero.

Individual clients begin with query caching behavior in the state indicated by the server's default caching mode. A client can change the default caching mode for its queries by using this statement:

SET query_cache_type = val;

val can be 0, 1, or 2, which have the same meanings as when setting the query_cache_type variable at server startup. In a SET statement, the symbolic values OFF, ON, and DEMAND are synonyms for 0, 1, and 2.

A client also can control caching of individual queries by adding a modifier following the SELECT keyword. SELECT SQL_CACHE causes the query result to be cached if the cache mode is ON or DEMAND. SELECT SQL_NO_CACHE causes the result not to be cached.

Suppression of caching can be useful for queries that retrieve information from a constantly changing table. In that case, the cache is unlikely to be of much use. Suppose that you're logging Web server requests to a table in MySQL, and also that you periodically run a set of summary queries on the table. For a reasonably busy Web server, new rows will be inserted into the table frequently and thus any query results cached for the table become invalidated quickly. The implication is that although you might issue the summary queries repeatedly, it's unlikely that the query cache will be of any value for them. Under such circumstances, it makes sense to issue the queries using the SQL_NO_CACHE modifier to tell the server not to bother caching their results.

Hardware Issues

The earlier part of this chapter discusses techniques that help improve your server's performance regardless of your hardware configuration. You can of course get better hardware to make your server run faster. But not all hardware-related changes are equally valuable. When assessing what kinds of hardware improvements you might make, the most important principles are the same as those that apply to server parameter tuning: Put as much information in fast storage as possible, and keep it there as long as possible.

Several aspects of your hardware configuration can be modified to improve server performance:

Install more memory into your machine. This enables you to configure larger values for the server's cache and buffer sizes, which allows it to keep data in memory longer and with less need to fetch information from disk.

Reconfigure your system to remove all disk swap devices if you have enough RAM to do all swapping into a memory filesystem. Otherwise, some systems will continue to swap to disk even if you have sufficient RAM for swapping.

Add faster disks to improve I/O latency. Seek time is typically the primary determinant of performance here. It's slow to move the heads laterally; after the heads have been positioned, reading blocks off the track is fast by comparison. However, if you have a choice between adding more memory and getting faster disks, add more memory. Memory is always faster than your disks, and adding memory allows you to use larger caches, which reduces disk activity.

Take advantage of parallelism by redistributing disk activity across physical devices. If you can split reading or writing across multiple physical devices, it will be quicker than reading and writing everything from the same device. For example, if you store databases on one device and logs on another, writing to both devices at once it will be faster than if databases and logs share the same device. Note that using different partitions on the same physical device doesn't count as parallelism. That won't help because they'll still contend for the same physical resource (disk heads). The procedure for moving logs and databases is described in Chapter 10, "The MySQL Data Directory."

Before you relocate data to a different device, make sure that you understand your system's load characteristics. If there's some other major activity already taking place on a particular physical device, putting a database there may actually make performance worse. For example, you may not realize any overall benefit if you process a lot of Web traffic and move a database onto the device where your Web server document tree is located.

Use of RAID devices can give you some advantages of parallelism as well.

Use multi-processor hardware. For a multi-threaded application like the MySQL server, multi-processor hardware can execute multiple threads at the same time.

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.

Other Collection and Use of Information


Application and System Logs

Pearson automatically collects log data to help ensure the delivery, availability and security of this site. Log data may include technical information about how a user or visitor connected to this site, such as browser type, type of computer/device, operating system, internet service provider and IP address. We use this information for support purposes and to monitor the health of the site, identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents and appropriately scale computing resources.

Web Analytics

Pearson may use third party web trend analytical services, including Google Analytics, to collect visitor information, such as IP addresses, browser types, referring pages, pages visited and time spent on a particular site. While these analytical services collect and report information on an anonymous basis, they may use cookies to gather web trend information. The information gathered may enable Pearson (but not the third party web trend services) to link information with application and system log data. Pearson uses this information for system administration and to identify problems, improve service, detect unauthorized access and fraudulent activity, prevent and respond to security incidents, appropriately scale computing resources and otherwise support and deliver this site and its services.

Cookies and Related Technologies

This site uses cookies and similar technologies to personalize content, measure traffic patterns, control security, track use and access of information on this site, and provide interest-based messages and advertising. Users can manage and block the use of cookies through their browser. Disabling or blocking certain cookies may limit the functionality of this site.

Do Not Track

This site currently does not respond to Do Not Track signals.

Security


Pearson uses appropriate physical, administrative and technical security measures to protect personal information from unauthorized access, use and disclosure.

Children


This site is not directed to children under the age of 13.

Marketing


Pearson may send or direct marketing communications to users, provided that

  • Pearson will not use personal information collected or processed as a K-12 school service provider for the purpose of directed or targeted advertising.
  • Such marketing is consistent with applicable law and Pearson's legal obligations.
  • Pearson will not knowingly direct or send marketing communications to an individual who has expressed a preference not to receive marketing.
  • Where required by applicable law, express or implied consent to marketing exists and has not been withdrawn.

Pearson may provide personal information to a third party service provider on a restricted basis to provide marketing solely on behalf of Pearson or an affiliate or customer for whom Pearson is a service provider. Marketing preferences may be changed at any time.

Correcting/Updating Personal Information


If a user's personally identifiable information changes (such as your postal address or email address), we provide a way to correct or update that user's personal data provided to us. This can be done on the Account page. If a user no longer desires our service and desires to delete his or her account, please contact us at customer-service@informit.com and we will process the deletion of a user's account.

Choice/Opt-out


Users can always make an informed choice as to whether they should proceed with certain services offered by InformIT. If you choose to remove yourself from our mailing list(s) simply visit the following page and uncheck any communication you no longer want to receive: www.informit.com/u.aspx.

Sale of Personal Information


Pearson does not rent or sell personal information in exchange for any payment of money.

While Pearson does not sell personal information, as defined in Nevada law, Nevada residents may email a request for no sale of their personal information to NevadaDesignatedRequest@pearson.com.

Supplemental Privacy Statement for California Residents


California residents should read our Supplemental privacy statement for California residents in conjunction with this Privacy Notice. The Supplemental privacy statement for California residents explains Pearson's commitment to comply with California law and applies to personal information of California residents collected in connection with this site and the Services.

Sharing and Disclosure


Pearson may disclose personal information, as follows:

  • As required by law.
  • With the consent of the individual (or their parent, if the individual is a minor)
  • In response to a subpoena, court order or legal process, to the extent permitted or required by law
  • To protect the security and safety of individuals, data, assets and systems, consistent with applicable law
  • In connection the sale, joint venture or other transfer of some or all of its company or assets, subject to the provisions of this Privacy Notice
  • To investigate or address actual or suspected fraud or other illegal activities
  • To exercise its legal rights, including enforcement of the Terms of Use for this site or another contract
  • To affiliated Pearson companies and other companies and organizations who perform work for Pearson and are obligated to protect the privacy of personal information consistent with this Privacy Notice
  • To a school, organization, company or government agency, where Pearson collects or processes the personal information in a school setting or on behalf of such organization, company or government agency.

Links


This web site contains links to other sites. Please be aware that we are not responsible for the privacy practices of such other sites. We encourage our users to be aware when they leave our site and to read the privacy statements of each and every web site that collects Personal Information. This privacy statement applies solely to information collected by this web site.

Requests and Contact


Please contact us about this Privacy Notice or if you have any requests or questions relating to the privacy of your personal information.

Changes to this Privacy Notice


We may revise this Privacy Notice through an updated posting. We will identify the effective date of the revision in the posting. Often, updates are made to provide greater clarity or to comply with changes in regulatory requirements. If the updates involve material changes to the collection, protection, use or disclosure of Personal Information, Pearson will provide notice of the change through a conspicuous notice on this site or other appropriate way. Continued use of the site after the effective date of a posted revision evidences acceptance. Please contact us if you have questions or concerns about the Privacy Notice or any objection to any revisions.

Last Update: November 17, 2020