Re: [SAGE] number of eggs in a basket
At 9:40 PM -0700 2005-01-07, Ruth Milner wrote:
> What I said was that it was "10x more likely that some sort of
> hardware or *system software* problem will take out a service".
I'm not convinced. If you have N+M load-balanced/fail-over
clusters, the probability of the entire service being taken out by a
single hardware or system software failure should approach zero.
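As a rough back-of-the-envelope sketch of that claim (not a model of any
real site): assume each node is down independently with probability
p_node, and the service needs n_needed of its n_needed + m_spare nodes up.
The function name, parameter names, and the 1% figure are all made up for
illustration. The independence assumption is the weak point, since a
shared system software bug can take out many nodes at once.

```python
from math import comb


def outage_probability(n_needed: int, m_spare: int, p_node: float) -> float:
    """Probability that more than m_spare of the n_needed + m_spare nodes
    are down at the same time, assuming independent node failures."""
    total = n_needed + m_spare
    # Binomial tail: sum over every count of failed nodes that exceeds the spares.
    return sum(
        comb(total, k) * p_node**k * (1 - p_node) ** (total - k)
        for k in range(m_spare + 1, total + 1)
    )


# Hypothetical numbers: 4 nodes needed, each down 1% of the time.
for spares in range(4):
    print(spares, outage_probability(4, spares, 0.01))
```

Under those assumptions, each added spare cuts the outage probability
sharply, which is the sense in which it "approaches zero".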
> I didn't put a number on the overall *total* number of failures,
> only that it would increase (which is absolutely the case). The
> cause doesn't really matter, though: if you have 10x the number
> of machines for sysadmin missteps to be made on, then the overall
> total failure incidents are likewise going to increase - though
> not necessarily linearly.
With the right admin tools, a site with 100,000 machines may have
a lower overall probability of an "admin oops" taking out a
significant chunk of the system than a smaller site with just 100
machines, or even just one machine, that lacks those tools.
> Well, I did say a little more than just that one quoted paragraph.
> :-) My point in that bit was where the idea might come from that
> the number of failures would increase by decentralizing, which at
> least one respondent had questioned.
The total potential number of failures may go up, but if the
system is designed correctly, those failures should be accounted for
and should have no visible impact on the overall services being
provided. You should be able to take a hit overnight (or over the
weekend), get notified by e-mail, and then fix it whenever you get
around to it the next working day. At least, for most types of hits.
> This does not make decentralization bad; as everyone has been
> saying (including me), it's a complex issue. The point is that
> decentralization also has costs that shouldn't be glossed over,
> especially in a small shop.
Fair enough. YMMV, definitely.
--
Brad Knowles, <brad@stop.mail-abuse.org>
"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."
-- Benjamin Franklin (1706-1790), reply of the Pennsylvania
Assembly to the Governor, November 11, 1755
SAGE member since 1995. See <http://www.sage.org/> for more info.