Scalability Design Patterns
I found Kanwardeep Singh Ahluwalia’s paper Scalability Design Patterns via High Scalability (and it is another one I have been hauling around ever since). High Scalability is one of those sites that testers should be following; all sorts of ideas pop up there.
First, two quick bullets.
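- Scalability is a desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner, or to be readily enlarged.
- Amdahl’s Law – The speedup you can get from parallelizing a task is limited by the portion of the task that has to stay serial.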
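Since Amdahl’s Law shows up again in the parallelism patterns, here is a minimal sketch of the usual formula, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of workers. The function name and parameter names are my own, not something taken from the paper.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup under Amdahl's Law.

    parallel_fraction: share of the work (0.0 to 1.0) that can run in parallel.
    workers: number of parallel workers (threads, processes, nodes).
    Both names are illustrative, not from the paper.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part never shrinks; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # With 90% of the work parallelizable, watch the gains flatten out.
    for n in (1, 2, 4, 8, 16, 1024):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

The takeaway for testers: if 10% of the work is serial, no amount of extra workers will push the speedup to 10x, so a plateau in a scalability test is not necessarily a bug.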
And now the patterns themselves, which he mentions are best suited to transactional systems (YMMV). Note that there is a bit of implicit ordering among the parallelism ones; the graphic in the paper shows it clearly. Also, some of these don’t really seem to be patterns. Meta-patterns, maybe? Pattern containers?
- Optimize Algorithm – Identify tasks which can be completed in a shorter period to save processing time
- Add Hardware – Adding hardware to the existing physical node will result in what is commonly known as “vertical scalability”, whereas adding hardware as a separate new node will result in enhanced “horizontal scalability”
- Intra-process Parallelism – Multi-threading will allow the process to make use of multiple cores, CPUs and/or hyperthreading
- Inter-process Parallelism – The system needs to replicate its process by spawning multiple instances of it. These instances need to coordinate with each other to handle the load in a distributed manner, typically with the help of a load balancer, which assigns tasks to each process. [The] optimum number can be determined by increasing the number of processes gradually and then observing the gain in scalability.
- Hybrid Parallelism – Spawn both multiple threads and multiple processes
- Optimize Decentralization – All such bottlenecks should be avoided by following a decentralized approach, wherein processing is not dependent on a particular resource; instead, multiple resources are provided to make each parallel path independent enough not to be a burden on, or dependent on, the other paths
- Control Shared Resources – Shared resources should be categorized into “Access Only” and “Modifiable” resources. The most common solution to prevent the corruption of shared resources is to capture a lock on the shared resource, modify it and release the lock.
- Automate Scalability – The system needs to have a monitoring entity that measures the current throughput and has the ability to increase or decrease the number of threads or processes in the system.
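
To make the Control Shared Resources pattern a little more concrete, here is a minimal sketch of the capture-modify-release idea using Python threads. Everything named here (`SharedCounter`, `run_workers`, `SUPPORTED_REGIONS`) is hypothetical and only for illustration; the paper does not prescribe an implementation.

```python
import threading

# "Access Only" resource: immutable, so threads can read it without a lock.
# (Hypothetical example data.)
SUPPORTED_REGIONS = ("us", "eu", "apac")


class SharedCounter:
    """A "Modifiable" shared resource guarded by a lock."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Capture the lock, modify the resource, release the lock.
        # The `with` block releases the lock even if an exception is raised.
        with self._lock:
            self._value += 1

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def run_workers(counter: SharedCounter, n_threads: int, increments_each: int) -> int:
    """Start n_threads threads that each increment the counter, then return the total."""

    def work() -> None:
        for _ in range(increments_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers(SharedCounter(), n_threads=4, increments_each=1000))
```

From a testing angle, the lock is also the thing to watch: every thread queues up behind it, so a heavily contended lock is exactly the kind of serial section that Amdahl’s Law warns about.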