Wednesday, October 14, 2009

Resource Oriented Architecture - Part 5 (Connectedness)

One of the important features of the web is its connectedness, i.e. almost all of the web is interconnected through hyperlinks.
So any resource that is addressable on the web can be linked to from another resource using a hyperlink. In an application based on ROA, all the resources should be connected to each other. Connectedness can be achieved if we choose the right representation for a resource. In an RPC-based architecture, XML is typically used to represent an entity. XML is the default choice because it is a structured way to represent the entity and hence can be easily understood by computer programs.
Today, a web service returns data as XML to make it machine readable, while a web application presents data as HTML to make it human readable. What we forget is that there's another representation called XHTML, which is simply HTML that is also well-formed XML.
So if we represent all the resources in XHTML, computers can parse it as XML while humans can read it in a browser as HTML. This also gives us a way to interconnect all the resources in an ROA-based application using hyperlinks.
There's no rule that says resources must always use XHTML; however, it is preferable since it merges the web application and the web service and hence makes life easier.
We can still use plain XML to represent the resources; however, in that case links to the relevant resources should be embedded in each representation, so that the whole system is navigable for a user who has a link to just one resource within it.
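
To make the XHTML idea a bit more concrete, here is a minimal Python sketch. The markup, the rel values and the account name are all made up for illustration; only the crmsystem URLs come from the earlier posts in this series. It shows a program reading the same document a browser would render and pulling out the links to related resources.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XHTML representation of an account resource.
# A browser can render it; an XML parser can read it.
ACCOUNT_XHTML = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Account 1234</title></head>
  <body>
    <h1 class="name">Acme Corp</h1>
    <ul class="opportunities">
      <li><a rel="opportunity" href="http://crmsystem/account/1234/Opportunity/1">Opportunity 1</a></li>
      <li><a rel="opportunity" href="http://crmsystem/account/1234/Opportunity/2">Opportunity 2</a></li>
    </ul>
    <a rel="collection" href="http://crmsystem/accounts/hyderabad">All Hyderabad accounts</a>
  </body>
</html>"""


def extract_links(xhtml_text):
    """Return (rel, href) pairs for every <a> element, in document order."""
    root = ET.fromstring(xhtml_text)
    links = []
    for anchor in root.iter("{%s}a" % XHTML_NS):
        links.append((anchor.get("rel"), anchor.get("href")))
    return links


links = extract_links(ACCOUNT_XHTML)
for rel, href in links:
    print(rel, href)
```

A client that starts with only the account URL can follow these links to reach the opportunities and the collection, which is exactly what connectedness is about.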

This concludes the 5-part series on resource oriented architecture. The next logical step for me would be to design an application using the ROA tenets, which involves identifying the right resources and their representations. I might take an enterprise application for this exercise.
Hope to find some time to do that....

~Abhishek

Thursday, October 8, 2009

Resource Oriented Architecture - Part 4 (Uniform Interface)

RPC-style services are very good at creating and documenting a contract. However, if I look at it from a high level, I am creating a new interface for every service in the world.

Imagine this: I can write many service operations to manage a CRM account, e.g. createAccount, updateAccount, deleteAccount, addOpportunity, winOptions, loseOptions and so on, i.e. for every conceivable action I want to take on an account I can create a new method in the contract. This becomes very difficult to manage when orchestrating the services into a business process, because it's hard to understand which services can be called at which stage of the process.

Now suppose I give this to a programmer who's building an orchestration using these services. There's no uniformity in the interfaces that would allow the programmer to come up with a pattern, or a way to know what might be possible in the system. One has to be a domain expert to work with such a system.

That's why ROA proposes a uniform interface that is exposed by every resource inside a system, i.e. once you know the URL of a resource, it can support up to 6 methods, which are nothing but HTTP verbs. Let's have a look at them:

1) OPTIONS :- This is a metadata verb, i.e. it tells the caller which HTTP methods the resource supports.
2) GET :- As the name implies, this returns the resource content to the caller. The format of the content is another story, and we'll discuss it in the next tenet.
3) HEAD :- This returns only the HTTP header information to the caller, e.g. when the content was last modified. This is an important verb, since it lets us take advantage of the HTTP caching infrastructure.
4) PUT :- This verb creates or updates a resource. Ideally, if the request is sent to a non-existent URI we create the resource, while if the resource already exists on the server it is replaced by the new one.
5) DELETE :- As the name implies, it can be used to delete or archive the resource.
6) POST :- This is the most open-ended verb in HTTP, and hence the one most abused by SOA. Anyway, in ROA this verb can be used either to append content to an existing resource or to create new content where the URI of the newly created resource is defined by the server.

All in all, a resource can support CRUD operations using the PUT/POST, GET, PUT/POST and DELETE HTTP verbs respectively, and can advertise its supported operations through the OPTIONS verb. HEAD can be used to take advantage of caching mechanisms.
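
Here is a rough Python sketch of what such a uniform interface could look like on the server side. It is an in-memory toy, not a real HTTP server. The ResourceStore class, its handle method and the status codes it returns are my own assumptions, used only to show that every URI answers the same six verbs.

```python
class ResourceStore:
    """Toy in-memory server: every URI answers the same small set of verbs."""

    SUPPORTED = ("OPTIONS", "GET", "HEAD", "PUT", "DELETE", "POST")

    def __init__(self):
        self._resources = {}  # URI -> representation (a plain string here)
        self._next_id = 1     # used when the server has to invent a URI

    def handle(self, method, uri, body=""):
        """Return a (status_code, headers, body) tuple for one request."""
        if method == "OPTIONS":
            return 200, {"Allow": ", ".join(self.SUPPORTED)}, None
        if method in ("GET", "HEAD"):
            if uri not in self._resources:
                return 404, {}, None
            content = self._resources[uri]
            headers = {"Content-Length": str(len(content.encode("utf-8")))}
            # HEAD returns the same headers as GET, but no body.
            return 200, headers, (content if method == "GET" else None)
        if method == "PUT":
            # Create at a client-chosen URI, or replace what is already there.
            status = 200 if uri in self._resources else 201
            self._resources[uri] = body
            return status, {}, None
        if method == "DELETE":
            if uri not in self._resources:
                return 404, {}, None
            del self._resources[uri]
            return 204, {}, None
        if method == "POST":
            # Create a subordinate resource; the server picks the new URI.
            new_uri = "%s/%d" % (uri.rstrip("/"), self._next_id)
            self._next_id += 1
            self._resources[new_uri] = body
            return 201, {"Location": new_uri}, None
        return 405, {"Allow": ", ".join(self.SUPPORTED)}, None


store = ResourceStore()
account_uri = "http://crmsystem/account/1234"
print(store.handle("PUT", account_uri, "<account id='1234'/>"))
print(store.handle("GET", account_uri))
print(store.handle("POST", account_uri + "/Opportunity", "<opportunity/>"))
```

Note that nothing in handle knows what an account or an opportunity is. The verbs mean the same thing for every resource, which is the whole point of the tenet.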

Now once we have this kind of system in place, we have to define our system in such a way that each entity, state or stage of a business process can be represented as a resource. So in our CRM example we can create 3 resources, i.e. Account, Opportunity and Option,
and all of these resources can support GET, PUT, POST and DELETE operations. So createAccount becomes a PUT request to http://crmsystem/account/myAccount/1234, deleteAccount becomes a DELETE request to http://crmsystem/account/myAccount/1234,
addOpportunity becomes a PUT request to http://crmsystem/account/1234/Opportunity/1,
and winOptions and loseOptions can become POST requests to http://crmsystem/account/myAccount/1234/Options/1

The beauty of the whole system lies in the fact that each entity has a URI and each URI supports a uniform interface. So when I give the URI of an opportunity (http://crmsystem/account/1234/Opportunity/1) to a programmer in my company, at the very least he knows the following:
1) To get the details of the opportunity he has to hit the URI with an HTTP GET request
2) To delete the opportunity he has to hit the URI with an HTTP DELETE request
3) To update the opportunity, PUT or POST should help
4) To know what is supported, OPTIONS can be used.

In this world we can literally live with just these verbs, and most if not all programming problems can be broken down into resources. Business process orchestrations can streamline themselves around these verbs, and the world might become much simpler for us as programmers :).

~Abhishek

Resource Oriented Architecture - Part 3 (Addressability)

URLs are one of the simplest and most powerful features of the WWW. It is because of URLs that everything on the web is discoverable and locatable.

e.g. a builder puts his URL on an advertising billboard, and everyone who visits that URL gets access to all the resources the builder has put on the web about a project.

If we look at Google's search results, each page of results is a resource and each page has a unique URL.
e.g. http://www.google.com/search?q=India&start=35 will take me to a page with the search results for the query "India", with the page contents starting from the 35th search result.

In a resource oriented world, each resource inside a system has a unique URL to locate it, e.g. in a CRM system every account has a unique URL to navigate to it. This basically means that all the accounts are addressable. If we send an HTTP GET request to that URL we should get the details of that account.

So an account with ID=1234 can be represented as http://crmsystem/account/1234
Even a collection of accounts is a resource, e.g. all the accounts from Hyderabad can be a resource found by sending an HTTP GET request to http://crmsystem/accounts/hyderabad
Note the subtle difference between the URLs. The first URL contains the string account while the second one contains accounts, which signifies that the first resource is a single account while the second one is a collection of accounts.
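
A small Python sketch of how a server might tell these two kinds of URL apart. The ACCOUNTS data and the resolve function are hypothetical and only for illustration; the two URL shapes are the ones described above.

```python
from urllib.parse import urlparse

# Hypothetical in-memory data standing in for the CRM database.
ACCOUNTS = {
    "1234": {"id": "1234", "name": "Acme Corp", "city": "hyderabad"},
    "5678": {"id": "5678", "name": "Globex", "city": "hyderabad"},
    "9012": {"id": "9012", "name": "Initech", "city": "pune"},
}


def resolve(url):
    """Map a URL to the resource it addresses, or None if nothing lives there."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    if len(segments) != 2:
        return None
    kind, key = segments
    if kind == "account":    # singular: one account, addressed by its ID
        return ACCOUNTS.get(key)
    if kind == "accounts":   # plural: a collection, addressed by city
        return [a for a in ACCOUNTS.values() if a["city"] == key]
    return None


print(resolve("http://crmsystem/account/1234"))
print(resolve("http://crmsystem/accounts/hyderabad"))
```

Both the single account and the collection are reached by a plain GET on their own URL. There is no separate operation name to learn for each of them.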

This approach is very different from classic SOA applications, where we'd write 2 web service operations called getAccountbyId(string accountId) and getAccountsForRegion(string region). Both operations share a single URL, say http://crmsystem/services, and are invoked by sending the appropriate SOAP envelope to this URL using the HTTP POST method. It's not really intuitive for a user or a machine what kind of POST requests one is supposed to send to that URL.

Another important aspect of this tenet is that a URL is unambiguous: at any given moment it points to one and only one resource, while a single resource can be reachable through more than one URL. e.g. an account with accountId=1234 which is a top performing account can be reached through 2 different URLs, i.e. http://crmsystem/account/1234 and http://crmsystem/account/topPerformer. In this case http://crmsystem/account/1234 points to a static resource and ideally won't change, while http://crmsystem/account/topPerformer is dynamically calculated and can point to different accounts at different times.
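
The static vs. dynamic URL distinction can be sketched in a few lines of Python. The revenue figures and the idea that "top performer" means "highest revenue" are my assumptions, purely for illustration.

```python
# Hypothetical revenue figures; "top performer" is assumed to mean highest revenue.
revenue_by_account = {"1234": 900, "5678": 400}


def resolve_account_id(path):
    """Return the account ID a URL path currently points to, or None."""
    prefix = "/account/"
    if not path.startswith(prefix):
        return None
    key = path[len(prefix):]
    if key == "topPerformer":
        # Computed at request time, so the answer can change between requests.
        return max(revenue_by_account, key=revenue_by_account.get)
    return key if key in revenue_by_account else None


print(resolve_account_id("/account/1234"))
print(resolve_account_id("/account/topPerformer"))

revenue_by_account["5678"] = 1500  # another account overtakes 1234
print(resolve_account_id("/account/topPerformer"))
```

At any given moment each path resolves to exactly one account, but /account/topPerformer moves from one account to another as the data changes, while /account/1234 stays put.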

This tenet is closely related to the next tenet I will discuss in this series, i.e. Uniform Interface.

~Abhishek

Monday, October 5, 2009

URI, URN and URL why 3 terms

I am attending a training, and an interesting question came up while discussing WCF: what is the difference between the terms URI, URN and URL? We see these terms used in various documents most of the time and hardly care that they are different acronyms, so there has to be a difference.
And it turns out there's a subtle difference:
1) URI :- Uniform Resource Identifier
This is like a base class, i.e. if this term is used, it can mean either a URN or a URL. It simply signifies that both URLs and URNs can be used in the context.

2) URN :- Uniform Resource Name
This is a URI which ensures the uniqueness of a name in a given context. As an analogy, think of "Flat no. 302" as a URN. It's unique within an apartment building, but the building's name is not present in the URN. We can say that a URN can be used for identification but can't be used to locate a resource without a context.

3) URL :- Uniform Resource Locator
This is a URI which can be used to locate a resource. It ensures uniqueness along with the mechanism to locate the resource, e.g. http://luckyabhishek.blogspot.com is a URL which is unique and also communicates the mechanism (HTTP) to discover/locate it.


Interesting and subtle differences, pointed out by Ramkumar (our instructor).

Let me go back to the training now, as Ram has already figured out that I am not listening to him and am doing something else. Alas .. he doesn't know that I am blogging the discussion we just had ....

~Abhishek

Wednesday, September 23, 2009

Unit Tests for registry read fails in Visual Studio on 64 bit machines

My my my...
A hotchpotch of Windows Server 2008 and Visual Studio 2008 on a 64-bit machine cost me 3-4 hours today...

So here's the deal
I have a method in one of my DLLs which reads a certain value from a subkey of HKEY_LOCAL_MACHINE\SOFTWARE. Then I wrote a unit test for this method. Turns out it fails to read the registry. I was amazed, because the entry did exist in the registry. After struggling for about an hour I decided to write a console application to test the method. And the method worked :O. So now I was in a situation where a method works from a console application but fails from a unit test.

I thought it was a permissions issue, so I gave full trust to both assemblies, i.e. the unit test assembly and the assembly I was testing. Even that didn't help.
Looking at the Task Manager I saw that the unit tests run under a process called VSTestHost.exe. So next I tried to give full trust to this exe, but that's not possible since it is a Win32 exe.

A second look at the Task Manager showed me that the process is listed as VSTestHost.exe*32, which means a 32-bit process running on a 64-bit OS. Nothing suspicious about that at first glance. However, if you look at the nodes below HKEY_LOCAL_MACHINE\SOFTWARE in the registry, you see a node called Wow6432Node, and that made me think. After some searching I figured out that if you run a 32-bit process on a 64-bit machine and try to read the subkeys of HKEY_LOCAL_MACHINE\SOFTWARE, the registry reads are actually redirected to HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node. (The console application presumably ran as a 64-bit process, which would explain why it saw the original key.) So I made an entry under this node and it worked.
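
The redirection rule can be written down as a tiny Python function. This is a deliberately simplified model of what I observed, not the real Windows implementation: the function name and parameters are mine, and the real redirector has exceptions for certain shared keys that are ignored here. The MyCompany\MyApp key is a made-up example.

```python
ROOT = r"HKEY_LOCAL_MACHINE\SOFTWARE"
WOW_NODE = "Wow6432Node"


def effective_registry_path(path, process_is_32bit, os_is_64bit):
    """Simplified model: where a read of `path` actually lands."""
    if not (process_is_32bit and os_is_64bit):
        return path  # no redirection for native processes
    parts = path.split("\\")
    root_parts = ROOT.split("\\")
    # Registry key names are case-insensitive.
    if [p.upper() for p in parts[:2]] != [p.upper() for p in root_parts]:
        return path  # only HKLM\SOFTWARE is modelled here
    if len(parts) > 2 and parts[2].upper() == WOW_NODE.upper():
        return path  # already pointing at the 32-bit view
    return "\\".join(parts[:2] + [WOW_NODE] + parts[2:])


key = r"HKEY_LOCAL_MACHINE\SOFTWARE\MyCompany\MyApp"  # hypothetical key
print(effective_registry_path(key, process_is_32bit=True, os_is_64bit=True))
print(effective_registry_path(key, process_is_32bit=False, os_is_64bit=True))
```

If you are reading the registry from Python itself on Windows, the winreg module exposes the KEY_WOW64_64KEY and KEY_WOW64_32KEY access flags, which let a process ask for a specific view instead of relying on the default redirection.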

Lesson :- Be careful when working with the registry if your program depends on it, especially when 32-bit and 64-bit processes are mixed on the same machine.

Happy coding :)

~Abhishek

Thursday, September 17, 2009

Making SQL Server Replication work on Network Service Account

This is one of the topics I have been working on for the last few days. There's not much information available on the web on how to do it, so I decided to log it in this blog.

Problem Statement :-
Set up SQL Server replication while running SQL Server under the "Network Service" account.

Normal Convention :-
Most local SMEs I spoke to said that it is not possible to set up replication between SQL Servers using the Network Service account and suggested that I use a domain account instead. However, the problem with that solution is that the domain account password expires after some time, and then we may even get production downtime.

Proper Solution :-
The SQL Server service, like any other Windows service, can run under Local Service, Network Service or domain account credentials. If you want to set up replication, the Local Service account is of no use since it has no identity over the network. A domain account is not ideal since passwords have to be changed at regular intervals, resulting in downtime as well as maintenance costs. So the Network Service account is the ideal way to go. This account presents the machine's identity over the network.
So if we want to set up replication using this account, the machine identities need access to the databases. We can create a security group in Active Directory and add to it all the machines which need to talk to each other in replication. Then give this security group permissions on all the databases.
This way, when the replication service on one machine sends replication-related instructions to another machine, it presents the machine credential, which has permissions on the database, and replication runs into no security-related problems.
So, to summarize:
1) Make sure that the SQL Server service and the SQL Agent service on all machines are running under Network Service credentials
2) Create a security group in the domain and add all the SQL machines which would be part of replication to this group
3) Give this security group appropriate permissions on all the databases which would be part of replication.

Happy coding...

~Abhishek

Monday, September 7, 2009

Resource Oriented Architecture - Part 2 (Statelessness)

The idea of writing a web service was that you should have a contract to work against. The policy, address etc. needed to invoke the implementation of this contract can be figured out at runtime. A component on a remote machine will process my input and get back to me with some output, based on the design-time contract which I programmed against.

Now this works great when I look at the tools available to me as of today... I have the WCF framework, with which I can define my operation contracts, message contracts, data contracts etc. in code, while the security policy, binding and address on which I want to expose my contract can be decided at deployment time in a configuration file. The best part is that in most cases I don't even need to care that I'm writing a service which will be used from a remote client, since even things like session management can easily be handled by the framework. I feel like I'm almost doing OOP-based programming. However, if I do use Session in my service, I add to the service infrastructure the overhead of maintaining this session, which means I've built scalability challenges into my code: to support Session I have to use either session affinity or out-of-process session management, neither of which is a very bright idea for scaling out the application.


The World Wide Web is a very scalable architecture and one of the reasons for that is its statelessness.
Let's try to understand what statelessness means for an application....
In any application we write there are 2 kinds of state, i.e. the state of the client and the state of the resource which the client is asking for. When we talk about statelessness we mean that the server should only be concerned with the state of the resource it is serving to the client. It should not be concerned with the state of the client. e.g. when I type the URL http://www.google.co.in/search?q=ROA&start=80 in my browser, Google returns the results for my query ROA starting with result no. 80, i.e. page 9. It doesn't care whether I clicked through the previous 8 pages or not, because that is the state of the client. The state of the resource lives on the server and is served no matter what the state of the querying client is.
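
A minimal Python sketch of the same idea: everything the server needs to answer the request is carried in the URL itself, so no session has to be looked up. The function name and the page size of 10 results are my assumptions.

```python
from urllib.parse import urlparse, parse_qs

RESULTS_PER_PAGE = 10  # assumption: ten results per page


def describe_search_request(url):
    """Derive everything the server needs from the URL alone (no session)."""
    query = parse_qs(urlparse(url).query)
    term = query["q"][0]
    start = int(query.get("start", ["0"])[0])
    page = start // RESULTS_PER_PAGE + 1
    return {"term": term, "start": start, "page": page}


print(describe_search_request("http://www.google.co.in/search?q=ROA&start=80"))
```

Because the function takes nothing but the URL, any server in a farm can answer any request, which is what makes this style easy to scale out.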

While designing a resource oriented application we should be very clear about what resource we're exposing, what the state of that resource is, and what counts as client state. Then we should ensure that the client state is maintained by the client and sent to the server when needed in some form (mostly as part of the URL; wait for my posts on addressability and uniform interface).

~Abhishek