You used to live by yourself, but then you got some roommates to help with the rent. Now, when you need to use the bathroom at 6 a.m. to get ready for work, they’re in there and you’re left waiting in the hallway wearing just your towel (not a pretty sight at that early hour, I might add). This is the problem of shared resources, which you will encounter with both roommates and machines.
When machines share resources, sometimes there aren’t enough to go around. The ensuing competition is the basic setup for many a troubleshooting problem. As an example within this category, let’s look at one of the most common shared resource situations that produce intermittent failures: electrical circuits.
To prevent electrical wiring from carrying a dangerous amount of current, most distribution networks have current-limiting devices (circuit breakers or fuses) in place. For the same reason, many machines have fuses within them to prevent them from drawing a damaging amount of current.
A single machine on a dedicated circuit has the ability to trip a breaker, but that’s where its impact ends as far as the electrical system is concerned. However, when you start hooking up multiple machines to a common power supply, and thereby create a shared resource situation, the fun begins:
Here we’ve got a refrigerator, toaster, and oven on a shared circuit. Let’s say they have the following power consumption characteristics:
|Appliance|Maximum Power Consumption (in amps)|
|---|---|
|Refrigerator|8|
|Toaster|9|
|Oven|12|
Most electrical devices will have an average and a maximum power consumption, depending on the kind of work being performed. These statistics can be misleading though, because your usage of a device might not be “average.” For these simple kitchen appliances, it’s easy to imagine a variety of circumstances that would vary the power consumption by large margins. For example, if the refrigerator were placed in an unventilated, un-air-conditioned room in the middle of a sweltering summer, its power draw would be much higher than in the dead of winter. The oven and toaster will also have large swings in their power consumption based on what kind of work they are doing: the energy required to keep bread warm at 100°F/38°C is much less than the energy required to cook a pizza at 500°F/260°C.
You can see from the table that, when operating alone, each of these devices would be fine on a 15-amp circuit: their maximum current numbers are safely below 15 amps. However, when they are all turned on simultaneously, overloading the circuit becomes a real possibility.
Shared resource situations like this can produce intermittent failures. There will be times when these 3 kitchen appliances can be operated simultaneously without incident, and other times when using them together will trip the circuit breaker. If you aren’t aware of the amount of power being drawn and the maximum capacity of the electrical circuit being used, the whole situation will seem like voodoo! Of course, the failure condition isn’t really intermittent once you know what’s going on: if the amount of power being consumed is over 15 amps, the breaker will trip 100% of the time. That’s reliable and predictable, the exact opposite of an intermittent problem.
Dedicated To The Job
You probably saw it coming, but a strategy we can use to alleviate an overconsumption situation is to deploy dedicated resources. Let’s stick with our kitchen appliance example: there’s no combination of two or more appliances (attached to a single 15-amp circuit) that won’t result in an overload if they reach their maximum current draw simultaneously.
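|Appliances at Maximum Power Consumption (in amps)|Total (in amps)|Potential Overage (in amps)|
|---|---|---|
|Refrigerator (8) + Toaster (9)|17|2|
|Refrigerator (8) + Oven (12)|20|5|
|Toaster (9) + Oven (12)|21|6|
|Refrigerator (8) + Toaster (9) + Oven (12)|29|14|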
In the table above you can see the combined effect of these appliances drawing their maximums in various groupings. Depending on which appliances are turned on, the overages range from 2-14 amps beyond what our 15-amp circuit can provide. Someone’s dinner is going to get ruined!
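If you'd rather let a machine do the adding, here is a minimal Python sketch of the same arithmetic. It assumes maximum draws of 8, 9, and 12 amps for the refrigerator, toaster, and oven on a 15-amp circuit; the names `MAX_DRAW_AMPS`, `CIRCUIT_LIMIT_AMPS`, and `overloads` are made up for illustration.

```python
from itertools import combinations

# Maximum current draw per appliance, in amps (assumed figures for this example).
MAX_DRAW_AMPS = {"refrigerator": 8, "toaster": 9, "oven": 12}
CIRCUIT_LIMIT_AMPS = 15


def overloads(max_draw, limit):
    """List (group, total, overage) for every group of 2+ appliances over the limit."""
    results = []
    for size in range(2, len(max_draw) + 1):
        for group in combinations(max_draw, size):
            total = sum(max_draw[name] for name in group)
            if total > limit:
                results.append((group, total, total - limit))
    return results


OVERLOADS = overloads(MAX_DRAW_AMPS, CIRCUIT_LIMIT_AMPS)
for group, total, overage in OVERLOADS:
    print(" + ".join(group), "=", total, "amps, over by", overage)
```

Every grouping of two or more appliances shows up in the output, which is just another way of saying that none of them is safe at maximum draw.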
If you want to completely eliminate the possibility of intermittent power failures in this clip art kitchen, you will need to add a dedicated electrical circuit for each appliance:
Now, each appliance is isolated from the others with its own dedicated power supply. Even if each one simultaneously draws its maximum rated power, there will be no resource conflict. As a bonus, if one of the appliances malfunctions and temporarily tries to draw as much power as it can, beyond its rated maximum, the other two will continue to function.
Getting Your Fair Share, And Priority When Needed
Another option for preventing resource conflicts is to install a governor (also called a limiter) that enforces a certain level of consumption and makes sure that every machine accessing a shared pool of resources gets its “fair share.” Once an allocation enforcement mechanism is in place, you can develop more sophisticated schemes for cases that require an uneven partitioning of resources.
What has developed in the field of networking is a classic example of this strategy. Think about a busy cafe that offers free Internet access: during peak times there might be a lot of devices (smartphones, laptops, tablets, etc.) sharing this finite resource. Other times, when it’s just you and the tattooed barista, you’ll have no competition for access to the network. The dilemma is that a typical Internet connection has a fixed amount of bandwidth, but the number of people (and how they use it) will vary widely over the course of a day. I think most cafe owners would prefer that their network automatically adapt to both busy and quiet periods, without having to look over people’s shoulders and yell at them to stop watching videos on YouTube.
A group of technologies called Quality of Service (QoS) has emerged that addresses the type of problem faced by a busy Internet cafe. The basic idea is that every device accessing the network gets its “fair share” of the bandwidth. On top of that, some QoS schemes employ a system of prioritization to ensure that people using the network for certain purposes (like making phone calls) get priority access.
For “fair share” use, the basic idea is to take the finite resource and divide it evenly by the number of consumers. If you had 20 people with laptops sharing a 1000 kilobits/second Internet connection, that would be:
1000 kilobits/second ÷ 20 users = 50 kilobits/second per user
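The same division, as a tiny Python sketch. The function name `fair_share_kbps` is hypothetical, and a real limiter would also have to react as users come and go; this only shows the arithmetic.

```python
def fair_share_kbps(total_kbps, users):
    """Split a fixed amount of bandwidth evenly among the current users."""
    if users < 1:
        raise ValueError("need at least one user to share the connection")
    return total_kbps / users


print(fair_share_kbps(1000, 20))  # the cafe at peak: 20 laptops
print(fair_share_kbps(1000, 1))   # just you and the barista
```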
You can also favor certain types of usage with a prioritization system. Continuing with our cafe example, let’s say we’ve determined that people use the cafe network primarily for watching videos, backing up files, surfing the web, checking email, and making phone calls. We absolutely never want anyone to have their phone call dropped. Also, we’d like people to be able to quickly check their email when they’re in a rush. Web surfing and watching videos are important, but not as important as phone calls or email. Then there are backups, where the files being transferred are large and can take days to move over the network. We’ll let people do that, but it should never interfere with any of the other uses mentioned previously.
Now that we’ve determined what’s important, we can make the following traffic priority list:
- Voice calls
- Checking email
- Web surfing
- Watching videos
- Everything else: backing up files, file sharing, etc.
Within a traffic category, we’ll give everyone an equal allocation. Whatever is left gets distributed over the next category in the same way, and so on until all the bandwidth is used up. We’ll also add the stipulation that each category is limited to 50% of the remaining bandwidth: this is to preclude a particular traffic category from completely preventing all other uses.
Let’s say we have 3 people on phone calls (requiring 120 kbps per call), 2 checking email, 10 web surfers, 4 watching videos, and 5 backup programs running. From our rules, the bandwidth would be allocated as follows:
|Traffic Category||Allocation (in kbps)||Remaining Bandwidth (in kbps)||Users||Share Per User (in kbps)|
Starting with our total bandwidth of 1000 kilobits/second, we deduct 3 calls × 120 kbps per call (360 kbps) for the phone calls. Of the remainder (640 kbps), we’ve decreed that up to 50% (320 kbps) can be used for checking email. From there, half of the leftover (50% × 320 kbps = 160 kbps) goes to web surfing and so on through the remaining categories, until all the bandwidth is allocated. This scheme definitely slows things down, but it ensures that everyone can get their work done (if you can call sitting in a cafe, sipping joe, “work”). Brownouts associated with one user hogging all the network resources are also prevented. Furthermore, it’s automatic and flexible: the network will continue to function in a variety of conditions (busy and quiet) without any intervention or policing on the part of the cafe owner.
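The allocation rules can be sketched in a few lines of Python. This is a rough illustration, not how any particular QoS product works. The names `allocate` and `CAFE_TRAFFIC` are hypothetical. The sketch also makes one assumption the rules leave open: the 50% cap is applied to every category, including the last one, so whatever the final category can't claim is reported as leftover.

```python
def allocate(total_kbps, categories, cap=0.5):
    """Walk the categories in priority order and hand out bandwidth.

    Each category is (name, users, fixed_kbps_per_user); fixed is None when
    the category simply takes whatever the cap allows.
    Returns (plan, leftover), where plan rows are
    (name, allocation_kbps, users, share_per_user_kbps).
    """
    remaining = total_kbps
    plan = []
    for name, users, fixed in categories:
        ceiling = remaining * cap  # no category may take more than this
        if users == 0:
            allocation = 0
        elif fixed is not None:
            allocation = min(users * fixed, ceiling)
        else:
            allocation = ceiling
        share = allocation / users if users else 0
        plan.append((name, allocation, users, share))
        remaining -= allocation
    return plan, remaining


# Priority order and user counts from the cafe example.
CAFE_TRAFFIC = [
    ("voice calls", 3, 120),   # each call needs 120 kbps
    ("email", 2, None),
    ("web surfing", 10, None),
    ("video", 4, None),
    ("everything else", 5, None),
]

PLAN, LEFTOVER = allocate(1000, CAFE_TRAFFIC)
for name, allocation, users, share in PLAN:
    print(name, allocation, "kbps total,", share, "kbps per user")
print("unallocated:", LEFTOVER, "kbps")
```

If you would rather have the last category soak up everything that remains, call `allocate` on all but the last category and hand the leftover to the final one.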
Prioritization and “fair share” allocation are universal tactics that can be applied to any resource that is being exhausted and causing you headaches. Back to our kitchen appliances, let’s say that installing additional electrical circuits was not an option. If these appliances had to live on the same circuit indefinitely, we could concoct several schemes to share the electricity. Here are two possibilities (among many) on how to make these machines play nice together:
- Install a 3-way switch that diverts electricity to only one appliance at a time. This will implicitly prioritize usage: the machine getting electricity will obviously be the one most needed at the moment!
- Install a current-limiting system so that each appliance is restricted to its “fair share” of the available current. If there’s one machine turned on, it can use the full 15 amps. If there are two on at the same time, each one can draw a maximum of 15 amps ÷ 2 = 7.5 amps. If all three are turned on, then each is only allowed a maximum of 5 amps (15 amps ÷ 3). Of course, the downside is that the oven and toaster may not be able to reach their full temperatures this way. Likewise, the refrigerator may not be able to keep its contents cool if underpowered in certain conditions. However, this scheme will prevent the breaker from tripping and allow the appliances to simultaneously function at a basic level without interruption. Tradeoffs…
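To put numbers on the second scheme's tradeoff, here is a small Python sketch. It assumes maximum draws of 8, 9, and 12 amps for the refrigerator, toaster, and oven, and the helper names (`limit_per_appliance`, `shortfall`) are invented for illustration. It reports how far below its maximum each appliance is held when the 15 amps are split evenly.

```python
# Assumed maximum draws, in amps.
MAX_DRAW_AMPS = {"refrigerator": 8, "toaster": 9, "oven": 12}


def limit_per_appliance(circuit_amps, appliances_on):
    """Even split of the circuit's capacity among the appliances switched on."""
    if appliances_on < 1:
        raise ValueError("at least one appliance must be switched on")
    return circuit_amps / appliances_on


def shortfall(circuit_amps, max_draw, switched_on):
    """Amps each appliance is held below its maximum (only those affected)."""
    limit = limit_per_appliance(circuit_amps, len(switched_on))
    return {
        name: max_draw[name] - limit
        for name in switched_on
        if max_draw[name] > limit
    }


print(shortfall(15, MAX_DRAW_AMPS, ["oven"]))
print(shortfall(15, MAX_DRAW_AMPS, ["refrigerator", "toaster", "oven"]))
```

With only the oven on, nothing is throttled. With all three on, every appliance is capped at 5 amps, well under what each can ask for.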
To show you that I will courageously state the obvious, let me say that another way you can respond to resource shortages is by…adding more resources. Poof! Was that your head exploding? Returning one last time to our kitchen appliances example, another possible solution would be to upgrade the capacity of the shared circuit from 15 to 30 amps. With a 30-amp circuit in place, even with each appliance topping out at its maximum consumption, you could operate all 3 simultaneously without a failure:
8 (refrigerator) + 9 (toaster) + 12 (oven) = 29 amps < 30-amp circuit
As a practical matter, the strategy of increasing the size of a resource pool has its own complications. For example, in North America, most household electrical circuits come in a few specific capacities. You can’t just decide you’d like a 17.9-amp breaker because all the mass-produced parts and building codes are based on 15- and 20-amp circuits. Larger capacities exist beyond that, but the outlets and plug types are different, so you’d also need to replace your appliances. In networking, the same thing can happen when you want to go beyond a given capacity boundary. At a certain point, increasing your bandwidth may require you to replace copper cables with fiber optics, consumer-grade switches with professional-quality gear, etc.
Most resource pools are like this: there will be constraints that prevent you from adding an arbitrary amount of capacity that will perfectly suit your needs. Instead, you’ll have to add capacity in amounts that are standard to the manufacturer or industry. Often, stepping up to that next “chunk size” will require costly upgrades that exceed the cost of duplicating what already exists (i.e., adding dedicated resources).
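The “chunk size” idea is easy to sketch in Python. The sizes below (15, 20, and 30 amps) are only the ones mentioned in this article, not a complete list of what's sold, and `next_chunk_size` is a hypothetical helper.

```python
# Circuit sizes mentioned in this article, in amps (not an exhaustive list).
STANDARD_SIZES_AMPS = (15, 20, 30)


def next_chunk_size(required_amps, standard_sizes=STANDARD_SIZES_AMPS):
    """Smallest standard size that covers the requirement, or None if none does."""
    for size in sorted(standard_sizes):
        if size >= required_amps:
            return size
    return None


for need in (17.9, 29, 31):
    size = next_chunk_size(need)
    if size is None:
        print(need, "amps: nothing on the list is big enough")
    else:
        print(need, "amps: buy", size, "amps and leave", round(size - need, 1), "unused")
```

The unused capacity in each case is the part you pay for but didn't ask for, which is why the jump to the next size needs to be priced against simply adding a dedicated circuit.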
Cost will usually dictate which resource strategy you pursue. The typical tradeoffs are as follows:
- Adding dedicated resources: usually the most expensive option, but it’s also the one that guarantees the highest level of stability and reliability. Isolating a machine with its own resources eliminates the possibility of interactions with other machines, so it’s ideal for those “mission critical” systems that always need to be functional. Besides the up-front cost, the other downside is the required expansion of your infrastructure. This in turn increases the overhead of your operations. Whatever you’re adding (another electrical circuit, another Internet connection, etc.) will require routine maintenance and monitoring on an on-going basis.
- Installing governors or limiters: usually the cheapest option, but at the expense of throughput, speed, etc. In our cafe example, a rate-limiting system ensured that everyone had equal access to the network, but at a much slower rate than if everyone had a dedicated Internet connection. For this reason, governors aren’t a catch-all solution: the slower rate imposed by them can take you under the minimum threshold required to make your systems or business work.
- Increasing the pool of shared resources: depending on what you’re adding capacity to, the cost can vary widely with this option. When it’s a good deal, the cost will be incremental and proportional to the amount of capacity being added (e.g., doubling the capacity will double the cost). However, as previously noted, there are plenty of examples where capacity can only be added in portions that may be much larger than what you need. Upgrading to the next “chunk size” can mean changes in your infrastructure that may exceed the cost of simply adding dedicated resources. The upside to shared resources is better economics and reduced overhead versus managing many small dedicated pools. That’s because it’s easier to worry about one thing versus many things!