Saturday, June 28, 2008

Go Audio....

I found an interesting utility that converts blogs to audio. It can be found at odiogo.com.
All one needs to do to turn a blog into an audio blog is subscribe to the service with the blog's feed URL. It's simple to sign up as well: all one needs to do is provide an e-mail address and the blog URL to the site, along with consent to convert all the content on the blog to voice format.
There are a couple of features missing in this utility though:
1) It can't convert non-English blogs to voice. (I can understand that it would be hard to globalize such a utility.)
2) It has no way to exclude certain posts from being converted to audio. (I miss this since I had to take a couple of non-English posts out of my blog to make my posts accessible through odiogo.)

A good lesson I learnt while checking out this utility is how much I have dropped punctuation like ",", ".", "!" etc. from my writing. The audio version of my earlier posts might not be as interesting to listen to, since the automated reader used by this utility depends a lot on punctuation to keep the listener hooked. Anyway, I will try to be more careful from now on.....

~Abhishek

Tuesday, June 17, 2008

Clustering patterns

As is usually the case with software problems, we have identified patterns which solve the most commonly known problems. Similarly, we have patterns for known problems in availability and performance as well.

There are 2 well-known patterns for clustering, which I will discuss in this post.

1) Load Balanced Clustering

Any application server is software running on a piece of hardware. This means that the server has a performance threshold beyond which performance falls below the expectations of the client. Even the world's most advanced hardware has a limit to the number of requests it can serve for a given piece of software. Once the number of requests rises above this limit, the server has to be scaled up (not possible in this case, as we are already on the most advanced hardware) or scaled out (the feasible option). To scale out an application running on such hardware, we can add more servers and cluster them using a load balancer.

The load balancer would act as the virtual resource to the clients and redirect the requests to appropriate physical servers lying behind it. So the network would look like the following...





The load balancer can delegate requests to the servers using various algorithms, such as:
a) Round Robin :- Every server gets an equal share of the requests, handed out in turn
b) Weighted Round Robin :- The administrator assigns each server a weight based on its hardware configuration. The load balancer uses these weights to decide which server receives each incoming request
c) Least Connection :- Whichever server is serving the fewest connections gets the next request
d) Load Based :- Whichever server has the least load gets the next request
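
The first three algorithms above can be sketched in a few lines of Python. This is only an illustration, not how any particular load balancer implements them: the function names and server names are made up for this example, and the weighted version is the naive one that simply repeats a server as many times as its weight.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that hands out servers in a fixed rotation."""
    return cycle(servers)


def weighted_round_robin(weights):
    """Return an endless iterator where each server appears 'weight' times per cycle.

    'weights' maps a (hypothetical) server name to an integer weight chosen
    by the administrator.
    """
    expanded = [name for name, weight in weights.items() for _ in range(weight)]
    return cycle(expanded)


def least_connection(active_connections):
    """Return the server that currently has the fewest open connections.

    'active_connections' maps a server name to its current connection count.
    """
    return min(active_connections, key=active_connections.get)
```

For example, weighted_round_robin({"big": 2, "small": 1}) sends two requests to "big" for every one sent to "small", while least_connection({"s1": 3, "s2": 1}) picks "s2".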

Normally a load balancer is intelligent enough to detect the failure of a particular server in the network and stop sending requests to the failed node.
One interesting thing that goes down to the level of application design, if we plan to deploy our application on a load-balanced cluster, is state management. Normally applications store session information in container-provided objects (the Session object in ASP.NET). By default these session objects reside in the memory of the application server on which the request was processed. If the next request goes to a different node in the cluster, that node will have no clue about the session state stored in the memory of the other server, and important data might be lost.
There can be 3 solutions to this problem:
a) Use an external state server or state service to store the session object
b) Use an algorithm on the load balancer such that requests coming in the same session context always get redirected to the same node in the cluster (Server Affinity)
c) Ask each server to broadcast the session state to the whole cluster (Asynchronous Session State Management). This would be a cheap solution, but it creates a lot of network traffic inside the cluster.
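
As a rough sketch of option (b), one simple way to get server affinity is to hash the session ID and use the result to pick a node. This is an assumption made for illustration; real load balancers may use cookies or the client's IP address instead. The function name and the server names below are hypothetical.

```python
import hashlib


def pick_server(session_id, servers):
    """Map a session ID to one server so the same session always lands on the same node."""
    # Hash the session ID so that sessions spread roughly evenly over the servers.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["node-a", "node-b", "node-c"]  # hypothetical node names
first = pick_server("session-42", servers)
second = pick_server("session-42", servers)
# 'first' and 'second' are the same node, so in-memory session state is found again.
```

Note that with this simple scheme the mapping changes whenever a server is added or removed, so sessions pinned to a node are still lost if that node fails.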

One important question that remains is what exactly this load balancer is. Basically it can be a piece of software (installed on one of the servers in the cluster) or hardware (special routers with load-balancing intelligence) which acts as the gateway to the external world.
This kind of clustering is a good choice to meet the non-functional requirements for high availability and scalability.

2) Failover Clustering

Some applications have a major requirement in terms of availability, e.g. the application can't go down even for an upgrade or patching. In such a case we can configure a failover cluster. The idea of this pattern is that a standby server waits to take over from the primary server in case it goes down for some reason.
There are 2 ways to detect that the server has gone down:
a) Pull Heartbeat :- The standby server checks the availability of the primary server at a specified interval. If it finds that the server is not responding, it assumes that the primary server is down, takes over the job and starts working on requests.
b) Push Heartbeat :- In this case the primary server keeps telling the standby server that it's up. If the standby server doesn't receive the heartbeat for a specified period of time, it takes over the job and starts processing the requests.
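
The push heartbeat check on the standby side can be sketched as follows. This is a minimal illustration and not a real clustering API: the class name, method names and the 5 second timeout are assumptions, and timestamps are passed in explicitly so the logic is easy to follow and test.

```python
class HeartbeatMonitor:
    """Runs on the standby server and tracks heartbeats pushed by the primary."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_heartbeat = None  # time of the most recent heartbeat, if any

    def record_heartbeat(self, now):
        """Called whenever a heartbeat message arrives from the primary."""
        self.last_heartbeat = now

    def primary_is_down(self, now):
        """Return True when no heartbeat has arrived within the timeout."""
        if self.last_heartbeat is None:
            return False  # nothing received yet, so there is no basis to decide
        return (now - self.last_heartbeat) > self.timeout_seconds


monitor = HeartbeatMonitor(timeout_seconds=5.0)  # assumed timeout
monitor.record_heartbeat(now=100.0)
still_up = not monitor.primary_is_down(now=103.0)   # 3 seconds of silence
take_over = monitor.primary_is_down(now=106.0)      # 6 seconds of silence
```

A pull heartbeat would look similar, except that the standby would actively probe the primary at each interval instead of waiting for messages.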

In either case, before the standby server starts processing requests it needs to synchronize with the primary server so that it is in exactly the same state and can honour any open sessions or transactions. Here are the strategies which can be used to do this:
a) Transaction Log :- Every time the state of the primary server changes, it records the change in a transaction log. The log is synchronized with the standby server periodically, and the standby brings itself into the same state as the primary. As soon as the standby server finds out that it has to take over, it synchronizes with the latest transaction log to reach the state the primary server was in before it went down. Now it is ready to take over and serve the requests....
b) Hot Standby :- In this strategy any change in the state of the primary server is immediately sent to the standby server to copy. The advantage is that as soon as the primary goes down the standby can take over without any delay.
c) Shared Storage :- The state of both servers is maintained on an external storage device, so it is as good as Hot Standby. It can be more or less performant than Hot Standby, depending on how we synchronize the state in the Hot Standby case and how responsive the external device is when storing and retrieving the state.
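
The transaction log strategy can be illustrated with a small in-memory sketch. The class and method names are hypothetical and the "state" is just a dictionary; a real cluster would ship the log over the network and persist it to disk.

```python
class Primary:
    """Primary server: applies changes and records each one in a transaction log."""

    def __init__(self):
        self.state = {}
        self.log = []  # ordered list of (key, value) changes

    def set(self, key, value):
        self.state[key] = value
        self.log.append((key, value))


class Standby:
    """Standby server: replays log entries it has not applied yet."""

    def __init__(self):
        self.state = {}
        self.applied = 0  # number of log entries already replayed

    def sync(self, log):
        # Replay only the entries that arrived since the last synchronization.
        for key, value in log[self.applied:]:
            self.state[key] = value
        self.applied = len(log)


primary = Primary()
standby = Standby()
primary.set("cart", ["book"])
standby.sync(primary.log)          # periodic synchronization
primary.set("cart", ["book", "pen"])
standby.sync(primary.log)          # final sync just before taking over
```

After the final sync the standby holds the same state as the primary did, which is the condition the pattern requires before the standby starts serving requests.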

Another important aspect of this pattern is determining the active server. If multiple servers in a cluster assume that they are the active servers then unexpected behaviors like deadlock and data corruption may occur.

It is very important to design the cluster in such a way that we do not lose performance. Additionally, it might be a bit costly, since the standby server is normally not used unless the primary one fails.

Now that I have summarized the 2 clustering patterns in this post, the next logical step would be to set up an IIS cluster for each pattern. It should result in a new post in this series :). Here's what I plan to do ....
1) Have 2 IIS servers on a network
2) Install a single ASP.NET Hello World application on each of them
3) Configure a failover cluster between the 2 IIS instances and test that it works
4) Configure a software load-balancing cluster and test that it works (may need a load-runner kind of tool to simulate many requests).

~Abhishek



Thursday, June 12, 2008

Non functional requirements like Availability, Scalability...

Any application being built in today's world has a lot of non-functional requirements. 2 of these non-functional requirements are Availability and Scalability.
Availability is the ability of the system to serve the requests of its users for a measurable amount of time. It is a major factor when the software is deployed in a production environment. Lots of business processes will depend on the availability of the system, and if the system is unavailable it is more often than not a loss to the customer using it. If the application is an e-business application then it's a direct loss of money, while for other kinds of application it's a loss in productivity and hence an indirect loss of money to the customer.
In a SaaS scenario availability is of even more concern, since downtime of the service may translate into a breach of a legal contract between the provider and the consumer of the service and can result in bigger losses for the provider.
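
To make "measurable" concrete, availability is commonly expressed as the percentage of time the system was up. The sketch below is only an illustration: the function names are made up, and the 99.9% target and one-year period are example figures, not numbers from any real contract.

```python
def availability_percent(uptime_hours, downtime_hours):
    """Availability as the share of total time during which the system was up."""
    total_hours = uptime_hours + downtime_hours
    return 100.0 * uptime_hours / total_hours


def allowed_downtime_hours(target_percent, period_hours):
    """Downtime budget that still meets a given availability target."""
    return period_hours * (1 - target_percent / 100.0)


# Example figures (assumptions): a 99.9% target over a 365-day year.
hours_per_year = 365 * 24
budget = allowed_downtime_hours(99.9, hours_per_year)  # roughly 8.76 hours
```

In other words, a service promising 99.9% availability over a year can be down for a little under nine hours in total before the target is missed.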
Scalability of a system can be defined as the ability of the system to serve increasing demand while still maintaining acceptable performance levels. Once a system is online and is a success, it is only natural that it will be used by more and more people and will have to handle increasing demand.
A system can be either Scaled Up or Scaled Out to meet the increasing demand. Scaling Up a system would normally mean upgrading the hardware of the server (more memory, more processors) to enable it to perform faster. Scaling Out a system would normally mean increasing the number of servers in the landscape and trying to meet the performance standard with more servers.

I believe these 2 non-functional requirements will affect the design of the application.
How is something I will try to figure out and maybe cover in a different post.

Clustering is a solution to meet these non-functional requirements. Basically, clustering is nothing but presenting a group of physical servers to the client as 1 virtual resource. The requests coming from the clients to this virtual resource can be redirected to the physical servers based on various algorithms.
A logical view of such an arrangement would look something like this..


So the above picture basically depicts that we have a cluster of server 1 and server 2 which is handling requests from client based on some kind of request routing done by the virtual resource.

There are various ways of configuring a cluster based on what kind of requirements we have. I'll look into them and cover them in a different post....

~Abhishek

Wednesday, June 11, 2008

Clustering !!!!

Clustering has always been a term to me rather than a topic. A few days back, while having a discussion about application availability, scalability and performance, I suggested clustering as a solution.
However, I myself was not, and still am not, sure what topics this term could possibly unearth for me.
So after avoiding reading on this topic for around 2 weeks, I finally decided to blog it as an action item.
I'll be reading about clustering soon and will come up with a post or 2 with the label Clustering (and added labels as required for the topics the term unearths for me).

~Abhishek