Saturday, July 5

Confusing The Hell Out Of The MySQL Optimiser

Since I'm more academic and theorist than implementer, I made the decision to use as many good principles as possible whilst designing the database underlying my new project.

"Good" is subjective of course, but in this case, amongst other things, it meant placing the database in as high normal form as possible (although since that's all terribly confusing I'm not entirely sure how high I currently am; somewhere between 5th and 6th I reckon), to not access any tables directly and instead use views-on-views and to only allow updates via stored procedure calls.

Most of this, although time consuming, is easy. Normalising data is trivial if you really know the data you're trying to represent, and writing update procs is as simple as writing the update statement itself. However it was the use of views which proved to be the biggest pain for me so far.

Now I must admit that I never took this approach in any of the work I did using MSSQL. For one thing I wasn't using a database of my own design: it wasn't completely normalised and so views weren't used. Secondly, it wasn't insisted upon in my place of work and so I never got into the habit. Still, I suspect that MSSQL would have no problem in the use of nested views.

MySQL, the database system I've used for the back end of this project seems to have big problems dealing with views, most of them to do with optimisation. Take my most recent problem for example. I have a complex view (where "complex" means it's been defined on further views) which for my test data returns a single row in 16ms. If I try to join this on itself using a single join criteria, I'll have to wait over an hour for any results to come back.

My first thought was that the join was occurring inefficiently - that table scans were happening recursively or something. Indeed the EXPLAIN for the query (used to display the execution plan for a query) indicated that keys were sometimes not being used in any of the joins (remember, the view itself had many joins within it).

This was a bit of a red herring though. For one thing, the EXPLAIN itself was taking an hour to return. For another, my test data was trivial with no table containing more than 10 rows. Even the slowest look up shouldn't take an hour to complete. Further research revealed that the optimiser was clever enough to skip using keys if table data is small, hence the results of my EXPLAIN.

So I turned to the server itself. According to its connection status screen, my query was stuck in the STATISTICS state for the bulk of the time which it was stalled. A quick search revealed this to be the stage where the optimiser tries to determine the best execution plan to follow. In short, MySQL spent an hour thinking of the best way of running my query, and then spent 16ms (or whatever) actually running it.

I'm not sure of the details, but it seems that some queries can confuse the optimiser so much that it ends up considering every possible order of any joins in them. I reckon that there were over 10 joins in all within my views, resulting in over 10! or 10000000000 possible execution plans. In a normal optimisation process, many of these would be rejected as being obviously slow, but just not in this case - most probably due to the number of views involved.

You can force MySQL to stop looking after a bit by setting optimizer_search_depth to a value lower than its (incredibly high in my opinion) default of 62. Setting it to 4 allowed my query to complete in 20ms or so. The down side is that the truly optimal path could be missed, something that could be a major issue once my data grows sufficiently. For now I'll leave it on its "auto" setting of 0.

So an unlucky bug then? Perhaps. It's worrying though, since I don't believe that what I'm doing is that advanced or undocumented - someone should have had this problem before, and it should have been fixed then. More scary is the advice I was originally given when asking for possible solutions to my problem; that was to get rid of my views and access the base tables directly. Such advice totally misses the point, and I'm hoping bad design won't be a prerequisite for using MySQL.

No comments:

Post a Comment