So you need to reset your standards.
I'm not sure we do. The site needed some maintenance on both the server and software side, and it didn't pan out as well as hoped. I'm also well aware of SLAs, having architected solutions with high availability requirements etc. However, that isn't really required here IMO. We chew the cud on topics more or less peakoil related, and there are some bespoke integrations required between various components of the site. Yes, it was mildly off-putting, but no one lost any business over it and Admin finally managed to get some work done that had needed doing for a long time. I'm sure next time he'll have learned from what happened this time (and don't forget he has a life too).
If anyone else thinks the site needs to implement some form of high availability, please speak up. Admin had been trying for a long time to find a developer to help him with some specific stuff and was unsuccessful, as it was pretty niche integration work. And there were unexpected problems with the new hosting infrastructure. Having root access to real servers in a highly available environment, with strict SLAs from the hosting provider, would immediately put the whole thing way beyond current pricing, to the extent that I don't think it would be financially feasible.
Having one virtualised server with code and data backups (and a dev environment) seems the best compromise for a site such as this. Generally all work is initially done on the dev environment, but things broke when changing the production server. It wasn't something that Admin could have expected, but I'm sure next time he'll have a user acceptance period first for the new server (though if the problem is load-based, we wouldn't necessarily catch it unless we ran stress tests).