Cloud Computing with Amazon EC2
Audience: This article is written for beginners in cloud computing, so if you are a pro I would recommend skipping it.
Wherever I go I hear people around me using the term cloud computing, but their understanding of the concept is minimal. Here is a popular YouTube video which shows how much cloud computing is misunderstood because of its catchy name.
An unorthodox definition of cloud computing would be “the method of computing in which the physical machine that processes or gathers your data is present in the cloud.” The cloud is nothing but the Internet.
Note: The above definition is completely unorthodox and you’ll definitely not get 2 marks if you write this answer in an Anna University examination.
Example of cloud computing:
Note: This is a very basic example. If you already have a basic idea of the cloud and its examples, I would recommend skipping this paragraph.
Knowingly or unknowingly, every one of us uses one or several features of cloud computing in our day-to-day routine. The best and simplest example of a cloud-based service is Gmail. The service offered by Gmail is SaaS (Software as a Service), which means Gmail offers its software to people over the cloud. Now let’s do a bit of substitution. What software does Gmail offer? A web-based email client. What is the cloud? It’s the Internet. After substituting the answers to these questions we get this: “Gmail offers its web-based email client over the Internet.” All the emails in your account are physically stored in Google’s data centers, which are in the cloud (that is, hooked to the Internet), and you access them from your computer via the Internet.
Why Cloud?
I’ll walk you through this question using a case study. Let’s take Anna University’s results publication as our example. http://results.annauniv.edu is the web server maintained by Anna University (Ramanujam Computing Center, I guess) for publishing the semester examination results of its affiliated colleges. Everyone experiences the bottleneck effect for at least the first 6 hours after the results are published. Though Anna University hardly cares about this effect, let’s presume it is a more student-friendly university that plans to do something about the bottleneck during results.
The problem here is simple to identify: the HTTP requests from ferocious students eagerly expecting their results flood the TCP queue of the server (compare this with a DDoS attack). So a logical solution is to add a Load Balancer and distribute the requests across some 3 or 4 servers, depending on the results. [Sometimes the 3rd to 7th semester results are published at the same time, which will definitely need more than 2 servers to handle the load; on the other hand, sometimes these results are published individually, so 1 or 2 servers would be enough to handle the requests at peak time.]
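To make the Load Balancer idea concrete, here is a minimal sketch of the simplest distribution strategy, round robin, where each new request goes to the next server in the list. The server names and the request count are made up for illustration, and a real load balancer would also check which servers are alive.

```python
from collections import Counter
from itertools import cycle


def make_round_robin(servers):
    """Return a function that hands out servers in rotating order."""
    rotation = cycle(servers)

    def pick_server():
        return next(rotation)

    return pick_server


# Hypothetical backend names; these are not real Anna University machines.
pick = make_round_robin(["server-1", "server-2", "server-3", "server-4"])

# Simulate 1000 incoming result requests and count where each one lands.
load = Counter(pick() for _ in range(1000))
print(load)
```

With round robin every server ends up with an equal share of the requests, so no single TCP queue takes the whole flood.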
Solution A, proposed by some leet admin at Anna University:
- Buy a Load Balancer: approx. cost $2000
- Lower high-end Core i7 server with 6 GB RAM x 4: $900 x 4 = $3600
- Power charges for these machines: $xxx
Consolidated cost = $5600 + $xxx per month + network maintenance charges + server upgrade charges ~ $7600 per year
Solution B: the cloud way
- Rent 2 Extra Large Hi CPU On-Demand Instances in the Asia Pacific zone for the first 24 hours: $0.76 per hr each
- Switch from the Extra Large instances to one Small On-Demand Instance for the rest of the period: $0.095 per hr
- EBS storage: $0.10 per GB-month. Our data won’t cross the 10 GB mark, so let’s assume $1/mo. (Think of EBS as a block of storage, like a hard disk, attached to your instance.)
- Load Balancer cost: $0.025 per hr
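Consolidated cost = $36.48 + $68.4/mo + $1/mo + $0.6 = $37.08 + $69.4/mo
Here $36.48 is 2 x $0.76 x 24 hours for the Extra Large instances, $68.4/mo is $0.095 x 24 hours x 30 days for the Small instance, and $0.6 is $0.025 x 24 hours for the Load Balancer.
Results are published twice a year, so on 2 days in a year we will need the 2 Extra Large instances with the Load Balancer, and for the remaining days a Small instance is enough to handle the traffic.
For a year the cost will be = 2 x $37.08 + $69.4 x 12 = $74.16 + $832.8 = $906.96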
With what Solution A costs in its first year, Solution B can serve the University for roughly 8 years.
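If you want to play with the numbers yourself, here is a small sketch of the Solution B arithmetic. The prices are the ones quoted in this article, not current Amazon prices, and the 30-day month is the same simplification used above. Note that this sketch counts the $1/mo EBS charge in every month, so the yearly total includes $12 of storage.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # simplification, same as in the article's arithmetic

# Prices in USD as quoted in this article; assume they are out of date.
XL_HOURLY = 0.76       # Extra Large Hi CPU on-demand instance
SMALL_HOURLY = 0.095   # Small on-demand instance
EBS_MONTHLY = 1.00     # 10 GB at $0.10 per GB-month
LB_HOURLY = 0.025      # Load Balancer

SOLUTION_A_FIRST_YEAR = 7600  # the article's rough estimate for Solution A


def result_day_cost(xl_instances=2):
    """Cost of one 24-hour results day: big instances plus load balancer."""
    return (xl_instances * XL_HOURLY + LB_HOURLY) * HOURS_PER_DAY


def baseline_monthly_cost():
    """Cost of an ordinary month: one small instance plus EBS storage."""
    return SMALL_HOURLY * HOURS_PER_DAY * DAYS_PER_MONTH + EBS_MONTHLY


def yearly_cost(result_days=2):
    """Two results days per year on top of twelve ordinary months."""
    return result_days * result_day_cost() + 12 * baseline_monthly_cost()


years_covered = SOLUTION_A_FIRST_YEAR / yearly_cost()
print(f"Results day: ${result_day_cost():.2f}")
print(f"Ordinary month: ${baseline_monthly_cost():.2f}")
print(f"Whole year: ${yearly_cost():.2f}")
print(f"Years of Solution B for the price of Solution A: {years_covered:.1f}")
```

Try changing `result_days` or `xl_instances` to see how little a few extra results days add to the yearly bill.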
Pros of Solution B
- Huge cost savings
- Completely scalable architecture
“Say the university decides to publish the results for 3rd semester students alone. With Solution A we can’t save anything, as the hardware is already bought and running. With Solution B we can opt for a Medium Hi CPU On-Demand Instance or whatever fits the load better. We can move to a bigger or smaller instance (more or less CPU and memory) in just a few clicks, or even automate the process using Amazon’s API.”
- Going green. (I don’t want to say more about it in this post.)
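The scalability point above can be sketched as a tiny decision function: given how many semesters are published at once, pick how much to rent. The instance labels below are descriptive names I made up for this sketch, not real Amazon API identifiers, and the thresholds are just my reading of the case study. A real setup would pass the result to Amazon’s API, which is not shown here.

```python
def plan_for_results_day(semesters_published):
    """Map the number of semesters published together to a hypothetical fleet."""
    if semesters_published <= 0:
        # No results today: the everyday small instance is enough.
        return {"instance_type": "small", "count": 1}
    if semesters_published == 1:
        # A single semester's results: one medium high-CPU instance.
        return {"instance_type": "high-cpu-medium", "count": 1}
    # Several semesters at once: the full results-day setup from Solution B.
    return {"instance_type": "high-cpu-extra-large", "count": 2}


for semesters in (0, 1, 5):
    print(semesters, plan_for_results_day(semesters))
```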
Cons of Solution B
- Remember the blackout at Amazon’s Virginia data center, which took several famous sites down with it. So a wise decision would be to host your instances in two different zones, or at least keep a mirrored copy of your instance’s EBS volume in some other zone for redundancy.
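As a rough illustration of why two zones help, here is a minimal failover sketch: traffic goes to the first zone in the preferred order that is currently healthy. The zone names and the health map are hypothetical. In practice the health information would come from monitoring, which this sketch does not cover.

```python
def pick_healthy_zone(zone_health, preferred_order):
    """Return the first healthy zone in the preferred order, or None if all are down."""
    for zone in preferred_order:
        if zone_health.get(zone, False):
            return zone
    return None


# Hypothetical zones: the primary one is blacked out, the mirror is fine.
zone_health = {"zone-a": False, "zone-b": True}
active_zone = pick_healthy_zone(zone_health, ["zone-a", "zone-b"])
print(active_zone)
```

With only one zone in the list, a blackout leaves nothing to fall back on, which is exactly what happened to the sites that went down.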