Clustering is part of the grid computing methodology, in which several low-cost commodity hardware components are networked together to achieve increased computing capacity. Scalability on demand is achieved by adding nodes and distributing the workload across the available machines.
Scalability and application performance can be improved in three ways:
- By working harder
- By working smarter
- By getting help
Working harder means adding more CPUs and more memory so that the processing power increases to handle a larger workload. This is the usual approach, and it often helps because additional CPUs relieve the workload problems. However, this approach is not very economical, because the cost of computing power does not scale linearly: adding computing power to a single SMP box increases cost and complexity disproportionately, at a faster-than-linear rate. In addition, performance (and scalability) is often constrained by bottlenecks in the infrastructure layer, such as the available bandwidth and speed of the wires connecting storage to servers.
Working smarter is accomplished by employing intelligent and efficient algorithms at either the application layer or the storage layer. By introducing “smartness” in the storage layer, you can greatly reduce the total amount of work needed to achieve the desired results. Working smarter at the application layer often requires rewriting the application or changing the way it works (or sometimes changing the application design itself), which is rarely feasible for a running application because it requires unacceptable downtime. This option is often close to impossible for third-party and packaged applications, because getting everyone on board can become a tedious and time-consuming task.
Working smarter at the storage layer is accomplished by introducing intelligent storage servers that offload some of the processing from the database hosts. This requires specially designed storage servers, such as Oracle Exadata Storage Servers (used in Oracle Database Machine), that process part of the critical workload close to storage. Processing the workload close to where the data is stored greatly enhances application performance, because it reduces both the number of round trips between storage and hosts and the amount of data transferred to the database cluster infrastructure.
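The effect of filtering close to storage can be illustrated with a small toy model. The sketch below is not how Exadata is implemented; the `StorageNode` class and both query functions are hypothetical names invented for illustration, and "rows transferred" simply counts list elements that cross from a storage node to the host.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StorageNode:
    """Hypothetical storage server holding a slice of the data."""
    rows: List[int]

    def read_all(self) -> List[int]:
        # A "dumb" storage server ships every row to the host.
        return list(self.rows)

    def read_filtered(self, predicate: Callable[[int], bool]) -> List[int]:
        # A "smart" storage server applies the filter before shipping rows.
        return [r for r in self.rows if predicate(r)]


def query_without_offload(nodes, predicate) -> Tuple[List[int], int]:
    """Host pulls all rows and filters them itself."""
    result, transferred = [], 0
    for node in nodes:
        rows = node.read_all()
        transferred += len(rows)          # every row crosses the wire
        result.extend(r for r in rows if predicate(r))
    return result, transferred


def query_with_offload(nodes, predicate) -> Tuple[List[int], int]:
    """Storage nodes filter locally; host receives only matching rows."""
    result, transferred = [], 0
    for node in nodes:
        rows = node.read_filtered(predicate)
        transferred += len(rows)          # only matches cross the wire
        result.extend(rows)
    return result, transferred


if __name__ == "__main__":
    nodes = [StorageNode(list(range(0, 1000))),
             StorageNode(list(range(1000, 2000)))]
    wanted = lambda r: r % 100 == 0
    print(query_without_offload(nodes, wanted)[1])  # rows moved, no offload
    print(query_with_offload(nodes, wanted)[1])     # rows moved, with offload
```

Both functions return the same result set; only the volume of data moved to the host differs, which is the point the paragraph above makes.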
Getting help can be as simple as using other machines’ computing power to do the work. In other words, getting help involves clustering the hardware, using the spare processing capacity of idle nodes, and combining the results at the end. More importantly, this approach does not require any changes to the application, because the clustering is transparent to it. Another advantage is on-demand scalability: you can get help whenever it is required, and you do not need to invest in massive hardware up front.
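The "split the work, process it on several nodes, combine the results" pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a cluster implementation: worker threads stand in for cluster nodes, and `split_workload`, `process_chunk`, and `run_on_cluster` are hypothetical names chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split_workload(items: List[int], node_count: int) -> List[List[int]]:
    """Deal the items round-robin into one chunk per node."""
    return [items[i::node_count] for i in range(node_count)]


def process_chunk(chunk: List[int]) -> int:
    """Work done independently on one node (here: a sum of squares)."""
    return sum(x * x for x in chunk)


def run_on_cluster(items: List[int], node_count: int) -> int:
    """Scatter chunks to the nodes, then combine the partial results."""
    chunks = split_workload(items, node_count)
    # Each worker thread plays the role of one node in the cluster.
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)  # combine step at the end


if __name__ == "__main__":
    data = list(range(10))
    # The answer is the same regardless of how many nodes help.
    print(run_on_cluster(data, node_count=1))
    print(run_on_cluster(data, node_count=4))
```

Changing `node_count` changes only how the work is spread out, not the code that defines the work itself, which mirrors the transparency and on-demand scaling described above.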