This video from Google is awesome! It gives a good summary of MySQL performance tuning in 45 minutes. For those who want to get the quick digest from this video. I have put down some notes below. Apart from this, I also put down the background information for some topics to make it more completed in my blog. Enjoy!
Â
Transcript:
Benchmarking Core Concept
- Set a target with baseline as starting point. Change 1 thing at a time to see how it affects the performance. At the end, record the configuration and result for tracking later.
- Disable query cache via giving it size=0 to avoid the cache affecting the result. However, it is OS cache can aslo skew the result. To minimize its impact, you can pump up # of runs.
Profiling Concept
- Explain plan (Access type? path?)
- Use the Slow Query Log (threshold to log, no index log) and mysqldumpslow
- Concentrate on the stuff that gives you bigger impact
- Use mytop to catch locking and long-running queries
Sources of Problems
- Poor indexing choices
- Inefficient or bloated schema design (don’t denormalize at the beginning)
- Bad coding practices (MySql has some subqueries issue, in general using join is better)
- Server variable not tuned properly (time consuming test)
- Hardware and/or Network Bottlenecks (IO & CPU bound, Network latency)
Index Guidelines
- Poor or missing index is fastest way to kill a system
- Covering index (MySql can get all the information from the index record which is slimmer than data record. Use this information to complete whatever the queries are. The slimmer the index, the more you can fit into single index block, the fewer reach you going to do, the faster it is).
- Ensure good selectivity on index fields. Cardinality = 1 (unique). Lower selective column can attach to other columns to become multi-column indexes.
- On multi-column indexes, pay attention to order of fields within the index definition.
 - clustered index organization
 - PK (product_id, tag_id)
 - Innodb vs myism - Database grows, make sure distribution is good.
- Remove redundancy indexes (mysql doesn’t check). Update key value and insert will touch indexes (overhead).
Database index is similar to the book index that helps you quickly locate the information you want. Without index, you need to do table scan to find out a set of records that match - O(n). However, to maintain a separate list of indexes’ values and keep them updated as your data change (insert/ update/ delete), you introduce overhead as well. So, we don’t index every column in a table. Some of the indexes have more than 1 columns (multi-column). Such indexes can improve the query speed if you often query all columns together in the WHERE clause or if a single column doesn’t have sufficient variety (not selective as >25% of the rows have the same value in this column - table scan is better for that). For example, you create multicolumn index for last_name & first_name that gives you fewer rows than just look at last name. If you create 2 indexes instead of multicolumn index, it will not be helpful as MySQL won’t use them at the same time. In fact, MySQL will ever use one index per table per query except for UNION. To choose the index, MySQL will pick the one that gives fewer rows back based on some statistics.
Schema Guidelines
- Use the smallest data types needed -> narrower index records -> more records per block -> fewer reads -> faster
- Choose a small clusting key since it is appended to each secondary index record
- Don’t use surrogate keys when a naturally occuring primary key exists
- Consider horizontally splitting many-columned table where some columns are rarely accessed or some columns has large size but others are more frequently accessed. Split and join by 1×1 is more efficient.
- Consider vertically splitting
Coding Guideline
- Use stored procedure for a big performance boost
- Innodb doesn’t optimize select count(*) query b/c multiversioning (Use counter table - increment it when insert)
- Isolate indexed fields on one side of equation. Use application to insert a day instead of using current day.
- LIKE ‘%.com’ cannot use index (use reverse email address and put % on the right).
- Set-based programming vs procedural programming
- Don’t try to outhink the optimizator.
Get rid of correlated subquery
select p.name,
 (select MAX(price) FROM OrderItems WHERE product_id = p.product_id) AS max_sold_price
FROM Products p;vsselect p.name, MAX(oi.price) AS max_sold_price
FROM Products p
INNER JOIN OrderItem oi ON p.product_id = oi.product_id
GROUP BY p.name
Derived Table is better than Correlated Subquery
Server Parameters
- per global vs per thread (sort buffer size)
- Query Cache (off by default): READ intensive (turn it on) but not for write intensive
- key_buffer_Size != innodb_buffer_pool_size (50-60% of machine memory should allocate)
- Memory is cheapest, fastest, easiest way to better performance





































(4.75 out of 5)
(4 out of 5)
No Comment Received
Sorry the comment area are closed for non registered users