I recently came across this question on a start-up forum –
"Please help understand which hosting is better (in terms of cost, security & privacy of data when one has to store a lot of private customer data), cloud or in house?"
This is a pretty good question, because it recognizes that cost is not the only consideration. A lot of startups don’t really think about security, scalability, uptime, etc. when they start out, and these things can come back to haunt them later. Some decisions can be reversed easily, others not so much.
So, as far as my experience goes, here is my attempt at listing the various factors to consider before making a decision.
First of all the various choices –
Shared Hosting – includes Bluehost, Hostgator, GoDaddy and a whole lot of other providers. Cheapest to start with, although you don’t get a lot of resources – great for low traffic web sites like starter blogs, personal web sites, etc. Comes with a lot of restrictions (in terms of what you can and cannot use) and soft limits (especially CPU and memory throttling), so any web app with some serious usage will outgrow this quickly.
Generic Virtual Machines – There are many providers here, e.g. Rackspace (great for both Virtual Machine and dedicated hosting). These tend to work out much better than shared hosting, come with extra support, and the bandwidth cost is much lower than with most of the cloud providers. Linode is also great for Linux VM hosting.
Dedicated Hosting – A vendor such as Rackspace allocates specific hardware only for you and gives you remote access to it. They will be responsible for the hardware uptime, you will be responsible for application uptime, and either you or they will be responsible for OS uptime (depending on the support plan you choose). The advantage here is that you get exactly what you want in terms of hardware specs, configuration setup, etc., combined with the expertise of the hosting provider. This turns out to be cheaper than Cloud hosting if you can predict your loads accurately and they are more or less stable (not spiky). If not, though, this can be expensive, since you will tend to over-allocate resources and underutilize them (the alternative, i.e. running over capacity, is more troublesome).
Cloud Infrastructure-As-A-Service – Here, you can go with any of the cloud providers (Amazon Web Services, Windows Azure) and focus mainly on the virtual machines – you buy raw capacity (not the hardware, just the abstract notion of capacity in terms of CPU, memory, disk space) from them with some OS level abstraction, but are responsible for installing and maintaining your own stack. Great for scaling out and if you want complete control over how your software is configured, but it does take that extra bit of time from you. This is somewhat different from Virtual machine providers because of the extent to which you can scale and the number of geographic locations available, but the underlying concept is the same – you get Virtual machines, not hardware access.
Cloud Platform-As-A-Service – More choice here, including the earlier ones (Amazon Web Services and Windows Azure), but also Heroku, AppHarbor, AppFog, AppEngine, etc. These also provide and maintain the software stack along with the hardware capacity and OS, and a lot of times you just upload your app and expect it to work – great if you can give up some control over stack configuration in exchange for freeing up some time and avoiding maintenance headaches.
Most of these also allow you to worry only about your application logic; even uptime, replication, failovers, auto-scaling and backups can be taken care of by the provider.
Co-located hosting – here you buy a server and are responsible for the OS, but you put the machine in a datacenter near you – they will charge you for housing it and providing electricity, space, cooling, internet and, depending on your contract, even hardware maintenance. I think BSNL provides that in India; there may be others (not sure, others can pitch in).
Self-hosted – this does give you maximum power, but also max work – you don’t pay money to others (other than for buying hardware) but you do have to account for the time lost or the additional staff needed to take care of this work. However, scaling here can also allow you to bring in hardware level optimizations, especially if you think your software has unique requirements and you can tune the entire stack better than someone else can. You still have to worry about disaster recovery and failovers though, and might choose one of the above options for that (unless you have offices in multiple geographies that can reduce this risk for you).
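To make the dedicated vs. cloud cost point above a bit more concrete, here is a rough back-of-the-envelope sketch. The prices, the 720-hour month and the two load shapes are numbers I made up purely for illustration; plug in real quotes from your providers before drawing any conclusions.

```python
from typing import Sequence

HOURS_PER_MONTH = 720  # 30 days; an assumption for this illustration


def dedicated_monthly_cost(hourly_load: Sequence[int], price_per_server_month: float) -> float:
    """Dedicated hardware is sized for the peak hour and paid for all month."""
    return max(hourly_load) * price_per_server_month


def cloud_monthly_cost(hourly_load: Sequence[int], price_per_server_hour: float) -> float:
    """Hourly-billed cloud capacity is paid only for the server-hours used."""
    return sum(hourly_load) * price_per_server_hour


# Hypothetical prices: an always-on cloud server (720 * 0.25 = 180 a month)
# costs more than the equivalent dedicated server (100 a month).
DEDICATED_PRICE = 100.0  # per server per month
CLOUD_PRICE = 0.25       # per server per hour

# Servers needed in each hour of the month.
stable_load = [4] * HOURS_PER_MONTH                    # 4 servers, all the time
spiky_load = [1] * (HOURS_PER_MONTH - 20) + [10] * 20  # 1 server, 10 during 20 peak hours

for name, load in (("stable", stable_load), ("spiky", spiky_load)):
    print(
        name,
        "dedicated:", dedicated_monthly_cost(load, DEDICATED_PRICE),
        "cloud:", cloud_monthly_cost(load, CLOUD_PRICE),
    )
```

With these made-up numbers the stable load comes out cheaper on dedicated hardware (400 vs. 720), while the spiky load comes out much cheaper on hourly cloud billing (225 vs. 1000), because the dedicated setup has to be sized for the peak.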
So what are the things that can help you decide?
1. Your stack and OS needs – if you don’t have any special needs and are using any of the popular web programming languages such as PHP, Python or .NET, a great progression is to start with shared hosting, then upgrade to platform as a service – both of them abstract away the maintenance of the underlying hardware resources. PAAS providers generally have bigger feature sets as well (e.g. separate worker/web roles, better database choices, etc.), so you may skip the shared hosting part in some cases (e.g. if you want to use PHP with MySQL, you can start with Bluehost/GoDaddy, but if you want to use CouchDB, that may not be a great option).
At the same time, if you have specific configuration needs (say you are porting a legacy app, or you are using a less common stack because it fits your use case well), then you might want to just start off with Virtual machine hosting and then upgrade to Cloud Infrastructure-as-a-service or Dedicated Hosting.
2. Your budget vs. timeline constraints – A higher budget but less time means you might want to outsource as much as possible so that you can focus on getting your stuff done fast – this could mean either Cloud hosting or Managed dedicated hosting. On the other hand, a smaller budget might push you to reduce cash out-flows and stretch your precious cash by doing more of the work yourself – this works as long as your own time is not the main bottleneck during this period.
3. Your preference – max flexibility with maybe more work (IAAS, self-hosting, virtual machines) vs. less flexibility but minimum work (PAAS hosting)
4. Your development and operational practices – This is where some of the PAAS providers really shine. For example, in AppHarbor, you can just do a git push and this deploys your code to production. The service will even run all the automated tests before the final deployment, and roll back the deploy if there is any failure. Some of these things can be extremely time consuming to set up and maintain if you go for an in-house or VM-based solution.
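To give a feel for what that "test, deploy, roll back on failure" flow involves if you had to build it yourself, here is a minimal sketch. This is not AppHarbor's API; the function name and the `run_tests` / `activate` callables are hypothetical stand-ins for whatever your own test runner and deployment step would be.

```python
from typing import Callable


def deploy_with_rollback(
    new_version: str,
    current_version: str,
    run_tests: Callable[[str], bool],
    activate: Callable[[str], None],
) -> str:
    """Try to put new_version live; return the version that ends up live."""
    # Gate: never touch production if the automated tests fail.
    if not run_tests(new_version):
        return current_version
    try:
        activate(new_version)
    except Exception:
        # Something failed during the switch, so restore the previous version.
        activate(current_version)
        return current_version
    return new_version


# Usage with stand-in callables (a real setup would run a test suite
# and switch the live build here).
live = deploy_with_rollback(
    "v2", "v1",
    run_tests=lambda version: True,
    activate=lambda version: None,
)
print(live)
```

Even this toy version hints at the extra pieces you would need in practice (build servers, test environments, health checks), which is the work a PAAS provider takes off your hands.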
I would go for –
1. Just starting out with something, price is the biggest constraint – shared hosting
2. Upgrading from Shared hosting – PAAS
3. Want to control my own stack, but price conscious – VMs/IAAS if the demand cannot be foreseen (which is true in most cases) – IAAS gets preference whenever there is spiky load and I need auto-scaling with hourly billing instead of monthly
4. Used IAAS for some time, have stable or foreseeable demand, without many spikes – Dedicated hosting, with the option to plug into a cloud service such as AWS whenever needed (spikes). In-house hosting only if I am partnering with other geeks who know their hardware well and don’t cringe if they have to build their servers and other infrastructure from scratch themselves.
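If it helps, the four rules above can be squeezed into a tiny helper. This is only my own preferences written as code, heavily simplified; the function and parameter names are made up for this sketch, and real decisions will involve more factors than five booleans.

```python
def suggest_hosting(
    outgrown_shared_hosting: bool = False,
    needs_own_stack: bool = False,
    demand_is_predictable: bool = False,
    load_is_spiky: bool = False,
    team_knows_hardware: bool = False,
) -> str:
    """Rough encoding of the four recommendations above (a simplification)."""
    if not needs_own_stack:
        # Rules 1 and 2: start cheap on shared hosting, move to PAAS when you outgrow it.
        return "PAAS" if outgrown_shared_hosting else "shared hosting"
    if not demand_is_predictable:
        # Rule 3: IAAS gets preference when load is spiky and hourly billing matters.
        return "IAAS" if load_is_spiky else "VMs or IAAS"
    # Rule 4: stable, foreseeable demand.
    if team_knows_hardware:
        return "dedicated hosting, or in-house"
    return "dedicated hosting, with cloud for spikes"


print(suggest_hosting())
print(suggest_hosting(needs_own_stack=True, load_is_spiky=True))
```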
About security – AWS has pretty good security certifications, and you can also do things such as putting your servers behind a VPN instead of exposing them on a public network. However, this depends on how secure you want it to be – for instance, do you mind if you don’t know where your data is saved? Some banks cannot save their data outside of their country, so this could be a problem. What’s the cost of data exposure for you or your customers? Are you storing financial/personal data? Do the laws of your country demand something specific about the kind of data you are storing?
For example, I read somewhere (correct me if I am wrong) that if you save credit card info, you should be hosting it yourself – you cannot outsource the hosting (besides, there are security certifications that you need to pass). Is this really necessary? Can you just outsource the whole payment management (including saving credit card info) to a third party? These are architectural decisions that need to be made, and they will also determine where you can deploy your app.
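If outsourcing turns out to be an option for you, one way the app side could look is sketched below: the third party keeps the card number and your own database only keeps an opaque token. `FakePaymentProvider`, `CustomerRecord` and `register_customer` are hypothetical names I made up for this sketch, not a real provider's API, and whether such a setup satisfies the rules that apply to you is something you need to check separately.

```python
import secrets
from dataclasses import dataclass
from typing import Dict


class FakePaymentProvider:
    """Stand-in for a third-party payment service. Hypothetical, not a real API."""

    def __init__(self) -> None:
        # This vault lives on the provider's side, not on your servers.
        self._vault: Dict[str, str] = {}

    def store_card(self, card_number: str) -> str:
        """Keep the card number and hand back an opaque token."""
        token = secrets.token_hex(8)
        self._vault[token] = card_number
        return token

    def charge(self, token: str, amount_cents: int) -> bool:
        """Charge the card behind the token; only checks validity in this sketch."""
        return token in self._vault and amount_cents > 0


@dataclass
class CustomerRecord:
    """What your own database stores: a name and a token, never the card number."""
    name: str
    payment_token: str


def register_customer(name: str, card_number: str, provider: FakePaymentProvider) -> CustomerRecord:
    # The card number is passed straight to the provider and not kept locally.
    token = provider.store_card(card_number)
    return CustomerRecord(name=name, payment_token=token)


provider = FakePaymentProvider()
customer = register_customer("Asha", "4111111111111111", provider)
print(customer.name, provider.charge(customer.payment_token, 500))
```

The point of the design is that the hosting decision for your app gets decoupled from the card data, since the sensitive part sits with the third party.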
Hopefully that should give some ideas about your options. Have I missed something? Let me know.