Selecting appropriate hardware and operating systems is a controversial decision in almost any situation. Discussions can become very heated and subjective. In this chapter, I will attempt to provide an objective overview of the major operating systems, highlighting advantages and disadvantages of each as it pertains to running MySQL server. I do have to make a disclaimer that I am biased in the direction of Linux and open source solutions.
Here is a list of the primary criteria for selecting a high-performance platform for MySQL:
- The size of the user base
- The amount of usage under high load on mission-critical servers
- The maturity of the C/C++ compiler
- The number of MySQL AB developers regularly using the platform
- The degree of standard compliance in the system libraries
- The maturity of the thread library available on the system
Let’s discuss each of these criteria in a bit more detail and explain its importance.
Size of the User Base
Each end user will cause the server to execute its code in patterns that will be unique to the user’s application. As more users try the code, increasingly different ways to execute the code appear. This results in a greater opportunity to uncover obscure bugs and performance bottlenecks. This leads to more bug reports, which in turn leads to getting those bugs fixed, and eventually leads to a high standard of performance and reliability.
Amount of Usage Under High Load on Mission-Critical Servers
Even with a high number of users, certain code in specific combinations is unlikely to be executed unless the system is under very high load. Therefore, certain bugs will never be discovered unless the system is put under high- concurrency execution of a wide range of queries—something that would be virtually impossible to simulate in a test environment. Again, as bugs are reported, they get fixed.
Maturity of the C/C++ Compiler
Parts of MySQL Server are written in C and other parts in C++. Ideally, programmers like to assume that their compiler is perfect and will solve most of life’s problems for them. The reality of working on a project that involves a few hundred thousand lines of code—many of them written by Monty, who does not quite think like compiler test case writers—is that eventually an ugly internal bug on the MySQL code base will generate a broken executable. Additionally, some compilers may produce correctly functioning but suboptimal code, which reduces the quality of the application running on that platform.
Number of MySQL AB Developers Regularly Using the Platform
As most of you already know, the development process includes frequently repeated code plus compile plus test cycles. Therefore, the platform that a project is actually developed on (as opposed to being ported to just before release) will inherently be more stable than its “port-to” counterparts at release time because it will have undergone more testing.
Degree of Standard Compliance in the System Libraries
MySQL assumes that system routines on Unix platforms follow the POSIX standard for the most part. Unfortunately, as experience and the abundance of
#ifdef HAVE_BROKEN_... directives in the source code show, this is by far not the case. Fortunately, the magic #ifdef solves the problem once it is discovered. But no one knows how many problems we are yet to discover on the multitude of systems MySQL AB supports. The large user base here is the key to discovering such problems.
Maturity of the Thread Library Available on the System
MySQL Server heavily uses POSIX threads. Therefore, the stability and performance of the thread library is a critical factor in determining how stable and how fast MySQL will execute on any given platform.
For practical purposes, you can consider a large user base and its willingness to put MySQL in a mission-critical environment on a certain platform as a seal of community approval. Judging by this standard, the most “voted” platforms are x86 Linux, Windows, FreeBSD, and SPARC Solaris. Therefore, next we discuss those systems one at a time, highlighting the strengths and the weaknesses of each, and then have a brief overview of other systems.
MySQL’s success on Linux is not a coincidence; many factors contribute to that success. First, Linux is a popular platform; this alone is a driving factor because it establishes a large user base, thus creating a large test environment for MySQL. Additionally, two aspects of the Linux kernel make it a fertile ground for MySQL: solid thread support and aggressive file caching. Also, MySQL developers tend to run Linux on their desktop machines, which double as personal development servers.
Because of the popularity of Linux, the GCC compiler on this platform has received a baptism of fire through brutal testing by a horde of merciless users who write all kinds of variants of C/C++ code. This makes it likely that the GCC compiler will compile MySQL sources without compiler-specific problems.
Some issues have emerged in the thread library (LinuxThreads) that negatively affect MySQL performance and stability under high load, and these issues have generated quite a bit of inaccurate publicity with claims that “MySQL does not scale.” Fortunately, LinuxThreads is an open source library, and with the help of the developer community, MySQL AB has been able to provide a patch to address these issues.
Although in my opinion Linux is the best platform for running MySQL in most situations, Linux is far from perfect. The current state of virtual memory implementation leaves much to be desired, and the lack of unity among Linux developers has been a great impediment to progress. There exists a large variety of kernel flavors: the “virgin” kernel, kernels patched by Linux distribution vendors (e.g., RedHat and SuSE), and numerous special patches maintained by various groups and individuals that have not been included for one reason or another into the main kernel tree.
None of these flavors is perfect; all have bugs or performance quirks of one kind or another. Some will be stable when running MySQL, and others will not. To make matters worse, some usage patterns of MySQL may expose a bug, while others will not. As of this writing, the MySQL development team concurs that the most stable kernel for MySQL Server is the one from SuSE 7.3.
MySQL is not as stable or fast on non-x86 Linux flavors. Two factors play an important role. The non-x86 Linux user install base is not nearly as large as the x86 one, which in practice means that a lot of architecture-specific kernel bugs that would have been quickly discovered on x86 will not be discovered on a non-x86 architecture by the time you decide to put your system into production. Similarly, the MySQL user base on the non-x86 platform is also small. This means that platform-specific MySQL bugs are not as likely to be discovered and fixed.
Having said that, I believe that running MySQL on a non-x86 Linux machine is still worth a try. Both MySQL and Linux have been written with great consideration for portability, and the code for both is of very high quality. MySQL may run better on Linux than on the native operating system provided by the hardware vendor.
Microsoft Windows and MySQL have a very interesting relationship. MySQL was originally written for Unix, but one of the goals for MySQL is to make it a superior cross-platform database. Windows is a popular platform on both the desktop and server, but it took some pleading from the Windows users to get MySQL ported to Windows; it was done more out of necessity than desire.
Naturally, this kind of relationship is not conducive to success on a platform. However, the sheer size of the Windows user base has had a powerful effect on the progress of MySQL on Windows. The bugs kept getting fixed, and eventually the Windows port became quite robust.
On a benchmark conducted by eWeek magazine in February 2002, MySQL version 4.0.1-alpha running on Windows outperformed DB2, SQL Server, and Sybase; and tied with Oracle. MySQL and Oracle were the only databases that could run the unmodified eWeek test for eight hours. Despite the success that MySQL has enjoyed on this platform, Windows has a number of factors that negatively affect the performance and stability. Although the Windows user base is large, many Windows installations of MySQL are run on a desktop with only one or two concurrent connections and hardly ever running two queries at the same time. There are not as many high-concurrency Windows users as there are on Linux.
The typical Windows installation does not have the bug-tracking tools normally available on a Unix platform (such as a debugger or system call tracer) that will help gather important diagnostic information. Such tools usually have to be purchased and installed separately. Because of that, the quality of a typical Windows bug report is much lower than the typical Unix report.
These comments should not lead you to conclude that MySQL does not run well on Windows. In a family of millionaires, even the poor cousin is quite well off— he just does not have as much as the rest of the folks. If you have to run MySQL on Windows, you will experience a measure of success. However, if you have a choice in the matter, a different operating system is likely to bring you better results.
SPARC Solaris is the most prominent “big-iron” platform for MySQL. Part of the secret of MySQL’s success on Solaris is that Monty had been developing on this platform for quite a while until he decided to switch to Linux for his primary development server. Another factor is that Solaris has a MySQL-friendly thread library. Sun Microsystems’s overall commitment to supporting industry standards also plays an important role.
Part of the standard prerelease quality assurance process involves running MySQL on Solaris under Purify (a commercial runtime analysis tool). Purify can detect common programming errors that might be hard to detect through regular testing.
Solaris MySQL users tend to have a high level of satisfaction with their experience; MySQL generally runs without problems. When problems do arise, standard tools are available to diagnose them. Database and system administrators working with Solaris tend to be skilled, well-rounded, and creative professionals. They tend to produce excellent bug reports that make it easy to track down the problem without ever having to log into the system in question.
The main drawback of running MySQL on SPARC Solaris is the low “bang-for- the-buck” value. Sun hardware is much more expensive than x86 systems of equivalent capacity. In two years of working for MySQL support, I have never logged in to a Sun machine that could outperform my desktop (a dual Pentium 500 with 256MB of RAM running Linux) on MySQL tests. This is not to say that such machines do not exist—this is simply to point out that they are so expensive that none of the MySQL customers I’ve worked with could afford one.
Many MySQL users are having success with FreeBSD in mission-critical, high- concurrency applications. The most prominent example is Yahoo! Finance. Much of what FreeBSD contributes to MySQL success is a solid, rigorously tested kernel code; a stable and efficient virtual memory and file system; and a reasonably sized install base for the operating system itself, which in turn results in a decent install base for MySQL.
If only FreeBSD could borrow the thread capabilities and the user base from Linux, it would probably become the most recommended platform for MySQL. Unfortunately, though, as of this writing that has not happened. FreeBSD threads are still on the user level, which reduces any threaded application, including MySQL, to running on only one processor. Additionally, the thread library does have a number of performance and stability quirks that can end up having a “fly in the ointment” effect on the user experience with MySQL.
Although MySQL will compile and run on a multitude of other systems (AIX, HP-UX, Irix, SCO Unix, Tru64, OpenBSD, NetBSD, BSDi, Mac OS X), because of the relatively small-size install base, it is hard to comment on its stability and performance on those platforms. MySQL AB’s proactive efforts at porting to those platforms are limited to compiling binaries for some of them at release time. The rest of the effort is reactive. None of the developers have used those systems for their development server, so development is done only when somebody reports a problem.
Despite the lack of proactive development effort, the solid design of the MySQL code base bears fruit. It is not uncommon for MySQL to run with a great degree of success on a poorly tested platform. If you already have an idle machine with one of the above-mentioned operating systems, or if you are considering a solution that absolutely has to run on such a system, it is definitely worth a try to see how well MySQL will perform. However, if you are building a new system and deciding on the ideal MySQL platform, these systems would not be the best option.
Operating System Tuning Tips
Next, we discuss the general principles of tuning an operating system on a MySQL server. Because specifics vary widely from system to system, and because systems keep changing at a rate that is impossible to fully track even on a Web site, I will not list specific instructions for each tuning operation; rather, I refer you to the operating systems manual for such details.
To maximize MySQL performance and stability:
- Apply the latest vendor patches to the kernel and to the system libraries. In case of Linux, make sure you are running a kernel with good reputation (e.g., the one from SuSE 7.3).
- Some operating systems have a tunable file cache. If this is the case with your system, make sure that enough memory is allocated for it. You may want to run benchmarks on your data as you play with the file cache settings.
- MySQL depends extensively on the underlying file system performance. Experiments with file system settings as well as trying out different file systems may produce significant differences in the overall performance of MySQL.
- If the data does not fit entirely into RAM, your application to a certain degree will be disk-bound. If this is the case, you may try to tune various parameters in the operating system that affect your disk performance.
- Make sure that enough file descriptors are allocated for the use of the MySQL process.
- On some systems (e.g., Linux), a thread is accounted for as a separate process. On such systems, you need to ensure that the MySQL process is allowed to create the necessary number of child processes.
- Avoid putting your data directory on an NFS (Network File System) volume and put it on a local disk instead. If you absolutely have to, use InnoDB tables instead of MyISAM—the InnoDB table handler caches both keys and data in RAM, whereas MyISAM caches only keys relying on the file system disk cache to cache the data. File system caching is not very effective with NFS because the disk cache is flushed on every system write.
- If you are not sure about a certain setting, do not be afraid to benchmark before you spend too much time researching. Your research will be more meaningful after you have some benchmark results. Remember that modern-day hardware and software is so complex that it is easy to make a mistake in trying to theoretically predict how it is going to behave under certain circumstances. It is much easier and much more productive to simply simulate those circumstances and see what happens. Luckily, computer science is different from chemistry in that you will not blow up your lab when something goes wrong with your experiment.
Hardware Tips for MySQL Database performance
In this section, we address general principles of hardware selection and configuration as it pertains to MySQL; we cannot discuss the specifics because they are changing at a pace that is impossible to follow. We advise you to study the specifics and apply these principles when configuring a system for MySQL Server:
- If the data fits entirely (or at least mostly) into RAM, the CPU cache size becomes an important factor in MySQL performance. On one test, the cache increase from 512KB to 2MB without increasing the CPU speed improved MySQL performance by 60 percent.
- Do whatever it takes to fit your data into RAM. In terms of hardware, this means buying more RAM. Buy as much RAM as you can reasonably afford.
- Buy the fastest RAM possible.
- Before you consider buying a faster disk, see if you can buy more RAM with that money. If your application becomes disk-bound, a faster disk will give a speed-up of the factor of 2-3 or so. Keeping the data in RAM can speed up the application 100 or more times.
- Will adding more processors help? It depends on the system. On Linux and Solaris, it will. On FreeBSD, it will not. On Windows, it should. On other systems, it might if you luck out with the vendor thread library—if it is tuned for frequent short critical regions, adding processors will help; otherwise, it will make things worse. (No joking here—I have seen cases where getting rid of an extra processor improved MySQL performance.)
- How many processors can MySQL take advantage of? MySQL scales well on x86 Linux, SPARC Solaris, and HP-UX 11 with up to four processors. It does not scale well on x86 Linux with eight processors: the performance is virtually the same as on a four-processor machine. I am not aware of any benchmarks on non-x86 systems with more than four processors. It is recommended that you consider using a cluster of low-cost machines with replication when performance requirements exceed the capability of one server.