Re: [SAGE] number of eggs in a basket



There are many different kinds of mechanisms for building failsafe 
systems, and which one to choose for a particular service depends on how 
critical the service is, how much of an outage you can stand (none, 
seconds, minutes), and how much money you have.

For many web services, you can use hardware or software products that 
allow you to have multiple hosts providing the service, and when one 
host goes down there is no outage whatsoever (other than possibly a lost 
transaction-in-progress). I can, of course, recommend the hardware 
solutions built by my own company, but there's lots of viable 
competition in this market (:-).
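
To make the idea concrete, here is a rough sketch of what such a product does at its core: hand out backends in rotation and skip any that have been marked down. The `Pool` class and the host names are made up for illustration, not any vendor's API, and a real product would add its own health probes.

```python
class Pool:
    """Round-robin over backends, skipping any marked down (illustrative only)."""

    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()   # hosts currently considered failed
        self._next = 0      # index of the next host to try

    def mark_down(self, host):
        self.down.add(host)

    def mark_up(self, host):
        self.down.discard(host)

    def pick(self):
        # Try each host at most once, starting where the last pick left off.
        for _ in range(len(self.hosts)):
            host = self.hosts[self._next]
            self._next = (self._next + 1) % len(self.hosts)
            if host not in self.down:
                return host
        raise RuntimeError("no healthy backends")


pool = Pool(["web1", "web2", "web3"])   # hypothetical host names
pool.mark_down("web2")                   # e.g. a health probe failed
print([pool.pick() for _ in range(4)])
```

New requests simply never land on the failed host; only a transaction that was in flight on it when it died is lost.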

Services like DNS have inherent redundancy capabilities, but failover 
usually takes about 4 seconds to point at a secondary server, and that 
may be too much for some applications. Again, hardware load sharing 
devices can be used to split the load and automatically fail over if 
necessary. As Steve points out, however, if the data gets corrupted and 
replicated to all of the servers, this kind of redundancy will not keep 
you running - the same is true with clustered file servers. Separate 
testing and non-clustered backups are needed to guard against 
fat-finger errors and other data corruption problems.

The ARP cache problem can be avoided by adding a hardware ethernet 
interface and zapping the MAC address to match the failed service (or 
playing tricks with software products like VMware to do the same for a 
virtual machine), but there are other gotchas - if you are connected to 
a network switch, there may still be a small outage while the switch 
realises that this MAC address has moved to a new port.

You can also play tricks by broadcasting an ARP reply that convinces all 
of the hosts to move to your new MAC address - a technique that can be 
used for good or evil...
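
For anyone who hasn't looked at one, the frame for such an unsolicited ("gratuitous") ARP reply is tiny. The sketch below only builds the bytes, following the standard Ethernet/ARP layout. The MAC and IP address are placeholders, and setting the target hardware address to broadcast is an assumption on my part, since implementations differ on that field. Actually sending the frame needs a raw socket and root privileges, which is left out here.

```python
import socket
import struct


def mac_bytes(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def gratuitous_arp_reply(mac, ip):
    """Build an Ethernet frame carrying an ARP reply announcing ip -> mac."""
    broadcast = b"\xff" * 6
    sha = mac_bytes(mac)           # sender hardware address
    spa = socket.inet_aton(ip)     # sender protocol (IPv4) address

    # Ethernet header: destination, source, EtherType 0x0806 (ARP)
    eth = broadcast + sha + struct.pack("!H", 0x0806)

    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, opcode 2 (reply)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 2)
    # Sender and target IP are both the address being announced.
    arp += sha + spa + broadcast + spa
    return eth + arp


frame = gratuitous_arp_reply("02:00:00:aa:bb:cc", "192.0.2.53")  # placeholders
print(len(frame), frame.hex())
```

Nothing in the frame proves that the sender is entitled to the address, which is exactly why the technique cuts both ways.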

I agree with Steve's recommendation for the use of dedicated 
nameservers. Start watching the instantaneous CPU and packet loads on 
such servers (don't look at one- or five- minute averages) and you'll 
see some real peaks, especially at 'witching hours' if you have a lot of 
machines with cron jobs that are too well synced. (I just finished 
working on such a problem in our own environment - default 'sysstat' cron 
jobs on hundreds of Linux servers that all go off every 10 minutes, on 
the 10 minute marker, all insisting on grabbing the 'group' map across 
the network simultaneously - yuckk!)
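
One common way to break up that kind of synchronisation is to give each host a stable offset derived from its own name, so the jobs spread across the interval instead of all firing on the marker. This is a sketch of the idea rather than what we actually deployed. The function names and host names are invented, and CRC32 is just a convenient stable hash.

```python
import zlib


def splay_minutes(hostname, period=10):
    """Stable per-host offset in [0, period), derived from the hostname."""
    return zlib.crc32(hostname.encode("utf-8")) % period


def cron_minute_field(hostname, period=10):
    """Minute field for a crontab entry that runs every `period` minutes,
    shifted by this host's offset."""
    offset = splay_minutes(hostname, period)
    return ",".join(str(m) for m in range(offset, 60, period))


for host in ("app01.example.com", "app02.example.com", "app03.example.com"):
    print(host, cron_minute_field(host))
```

Because the offset depends only on the hostname, each machine keeps the same schedule across reinstalls, and the load on the nameservers is spread over the whole 10 minutes.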

- Richard


Brad Knowles wrote:

> At 10:46 PM -0500 2005-01-07, Steve Simmons wrote:
>
>>  With DNS, any machine that has a copy of the zone files and can do
>>  virtual IP interfaces will work just fine.  When primary DNS dies,
>>  you activate a virtual interface with the address of the DNS server
>>  and manually start DNS.  Instant restoration of service, cheap and
>>  easy.
>
>
>     It depends on what you mean by "dies".  There are certain kinds of 
> failures that may happen within the application which would be 
> propagated to all the secondaries, and would not necessarily be 
> detectable by the monitoring system.
>
>     Also keep in mind that you will have ARP cache timeout issues when 
> you activate the virtual IP address.  Better to have multiple service 
> IP addresses that are advertised via servers which are designed for 
> this kind of failure, and a monitoring/fail-over system that can help 
> minimize the ARP cache timeout issues, etc....
>
>     When you're talking about this sort of thing, you really want to 
> use dedicated nameserver boxes if at all possible.  Being a nameserver 
> can be a very heavy load on a machine, even if it seems that the 
> machine isn't actually doing much work.  This can seriously impact the 
> other services on the machine, if you try to run it on a shared box.
>
>>  With web service, it depends strongly on whether your raw HT files
>>  are on the local disk or a server, if you access thru a DB or not,
>>  what kinds of certificates you need, what kind of software you might
>>  have that is or isn't too expensive to duplicate, ya-da, ya-da.
>
>
>     Any time you start talking about databases, NIS, etc... being 
> shared across the network, you have to be real careful when you talk 
> about putting NFS into that mix.  For one thing, Berkeley DB (and most 
> other database systems) tend to use mmap() system calls for maximum 
> performance, and mmap() cannot be used on NFS.
>
>     This can cause problems in all sorts of unexpected areas.  For 
> example, Cyrus can't be used on NFS, because it stores the mailbox 
> meta-data indexes in Berkeley DB, which uses mmap().  You can't store 
> diablo USENET news server indexes on NFS, because it also uses mmap() 
> for speed (storing the articles on NFS is a different matter, but does 
> have its own issues).
>
>     NIS, LDAP, and other network information database systems may also 
> make use of Berkeley DB as the back-end, or otherwise make use of 
> mmap().  Note that MySQL can also use Berkeley DB as the back-end 
> (which is how they achieve full ACID compliance for their MaxSQL 
> product).
>
>
>     Anyway, it all comes down to this -- YMMV.  You need to do the 
> cost/benefit analysis and figure out what works best for you.
>
>>  Things get a bit more complex but still manageable when you start
>>  putting multiple services on a single machine and still want to
>>  fail them over.  In a case like that, I'd allocate a name and
>>  address for each service: dns, www, dhcp, etc.
>
>
>     That works up to the point where you need to reboot the box, at 
> which point all these things go down at once.
>
>     Server partitioning schemes (e.g., such as found on certain types 
> of Sun Enterprise class servers for Solaris 9 and in Solaris 10 for a 
> wider array of boxes) can help make this process easier by allowing 
> you to reboot the virtual box while the physical hardware will 
> hopefully remain operational.  Again, that only works to a point.
>
>     The next step would be to go to blade systems, where you really do 
> get entirely separate machines that all share a common chassis. Of 
> course, that has its own failure modes.
>
>>  With a situation like that, you put as many services on a pair of
>>  machines as they can handle.  The services that can't be paired
>>  you put on individual machines.
>
>
>     That assumes that a pair of machines is sufficient to handle the 
> load, and the second machine wouldn't melt if the primary died.  This 
> may be a valid assumption in some cases, but not in others.
>
>>  Most of us have or will soon have limited power and cooling capacity
>>  in our server rooms.  One server per service can suck up a fair
>>  amount of capacity.
>
>
>     The bigger server-class machines can also suck up a lot of power 
> and cooling.  You need to find out what is the right balance for you.
>
>>  Server-class machines aren't cheap.  But if you can run a bunch of
>>  services on a pair of machines, you don't need server-class machines.
>>  Just machines fast enough for those services.
>
>
>     They may be a lot more expensive to own and operate, as well as to 
> upgrade.  You need to look at all the pieces in the cost/benefit 
> equation and decide what works best for you.
>
>>  Conversely, if you put those services on old PCs running Linux, they'll
>>  have all the reliability of old PCs.  IMHO better to buy a couple of
>>  good ones.
>
>
>     "Server-grade machines" doesn't necessarily mean Sun boxes.  You can 
> get server-grade PCs, too.  You just have to look harder.
>
>>  Fewer machines to maintain and admin means lower cost.  Granted, it
>>  may not be a *lot* lower, depending on the rest of your environment.
>>  But it will be lower.
>
>
>     Maybe.  It might raise your costs.  It all depends on your 
> particular situation.
>
>>  YMMV, offer only available to adults over 18, not valid where
>>  explicitly prohibited by law, etc, etc.
>
>
>     That's the key -- YMMV.
>