Thursday, September 13, 2012

ACS With Windows Azure Webrole

I always wanted to make the ACS work in an Azure webrole. Somehow I was not able to put a sample together.

Seems like someone else has done it for us :). Check http://social.technet.microsoft.com/wiki/contents/articles/2590.aspx for a step-by-step guide on deploying an ACS-authenticated web role on Azure.

~Abhishek

Friday, August 24, 2012

Performance Counters in Windows Azure Web Role

I tried adding some performance counters for a Windows Azure Web Role in the OnStart method of the WebRole and, interestingly, they were not getting enabled. The code executed fine, but no results showed up in the “WADPerformanceCountersTable”.

There are a few things which need to be done correctly to ensure that this works:

1) The WebRole code should be running in an elevated context. To make sure this happens, add a <Runtime executionContext="elevated" /> tag in the ServiceDefinition.csdef file under the WebRole which is trying to add the counter.

2) If you are trying to update an existing deployment, the new performance counters may not be picked up. To ensure that they are, delete the existing deployment and create a new one. It seems that the performance counters are picked up based on an XML file under the “wad-control-container” blob container. This XML file is created for each deployment and, for some reason, it was not getting updated in my case.

Hope this helps…

¬Abhishek

Tuesday, August 14, 2012

Block Blob Vs. Page Blob

Windows Azure storage services provide the ability to store large binary files using Blob storage. So if I have a large binary file which I want to store on Azure, I can use Blob storage.

But now comes the tricky part. There are 2 types of blob available on Azure: you can store a large file as either a Block Blob or a Page Blob.

According to the MSDN documentation, Block Blobs are meant for managing the upload of large files in parallel, while Page Blobs are good for random read and write operations on a blob. This one went right over my head.

What if I have a conflicting scenario, e.g. a huge VHD file which I want to upload in parallel and, once uploaded, read and write randomly as needed? I am not really sure which one I should use in such a case. I guess I would go for the Page Blob, because the upload is a one-time operation for a VHD file and after that my hypervisor will keep modifying the file for the rest of its life. So it is more important that the random read and write operations are fast than that the upload is.

The technical differences are summarized in the table below:

| Factor | Block Blob | Page Blob |
|---|---|---|
| Max Size | 200 GB | 1 TB |
| Chunk Size | Each block can be up to 4 MB | Each page is 512 bytes |
| Write Operation | Multiple blocks can be modified, and then a commit operation commits the modifications in a single go | Writes happen in place and are committed then and there, rather than waiting for a separate commit operation |

So all in all.

Parallel upload == Block Blob

Random Read Write == Page Blob
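
To make the chunk sizes in the table a bit more concrete, here is a minimal Python sketch of the arithmetic involved. It does not call any Azure API; the function names are made up for illustration, and the 4 MB and 512 byte figures are simply the ones quoted in the table above.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # 4 MB block size, as quoted in the table above
PAGE_SIZE = 512               # 512 byte page size, as quoted in the table above


def block_ranges(file_size, block_size=BLOCK_SIZE):
    """Split a file into (offset, length) chunks.

    Each chunk is independent of the others, which is what makes a
    parallel upload followed by a single commit possible.
    """
    return [(offset, min(block_size, file_size - offset))
            for offset in range(0, file_size, block_size)]


def page_aligned_range(offset, length, page_size=PAGE_SIZE):
    """Widen an arbitrary byte range so it starts and ends on page boundaries.

    An in-place write to a page blob touches whole pages, so a small
    random write still covers a multiple of the page size.
    """
    start = (offset // page_size) * page_size
    end = -(-(offset + length) // page_size) * page_size  # round up to a page
    return start, end - start


if __name__ == "__main__":
    # A 10 MB file becomes three blocks: 4 MB, 4 MB and 2 MB.
    print(block_ranges(10 * 1024 * 1024))
    # A 100 byte write at offset 1000 covers the two pages from byte 512 to 1536.
    print(page_aligned_range(1000, 100))
```

The small page size is why random reads and writes are cheap on a Page Blob, while the larger independent blocks are what suit a parallel upload.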

Hope this helps

¬Abhishek

Wednesday, August 1, 2012

Designing a scalable Table Storage in Windows Azure

I came across a really good post about how to design a scalable partitioning strategy for Table Storage in Windows Azure. Here’s the link to the original documentation on MSDN.

My understanding

1) Azure table storage has multiple servers which serve the queries on your table

2) These servers divide the work by partition. A single server can serve multiple partitions, but it is guaranteed that a single partition will always be served by a single server.

3) If a server serving many partitions becomes too hot, it offloads some partitions to a different server. While this offloading is taking place, the client can get a Server Too Busy message from the service. So a retry strategy should be applied at the client (either a fixed backoff or, preferably, an exponential backoff).

4) Choosing an appropriate partition key is the “key” to designing scalable storage.

a) Too many rows in a single partition can result in bottlenecks, as all of those rows have to be served by a single partition server. A single partition has a throughput target of up to 500 entities per second. If the application hits this limit, the table service won’t be able to serve the requests. The advantage, though, is that the application can perform batched operations on all the entities in the partition, thus reducing the costs.

b) Too few rows in a single partition will make querying the table less efficient. The queries will have to hit multiple partition servers to get an aggregate result thus increasing the costs.

5) If you decide to use unique values as partition keys, Azure is intelligent enough to group them into range partitions itself. Querying data from a single range partition may improve query performance, but operations like insert/update/delete will definitely be impacted. For example, if the partition keys are 001, 002, 003, …, 008, Azure may create the range partitions 001-003, 004-006 and 007-008. The next insert then goes to the server holding the third range, and so does every insert after it, which means all the inserts are handled by a single server. To avoid this, the application should consider GUIDs as partition keys in such a scenario.

6) The factors which affect the choice of partitioning key are

a) Entity group transactions :- If the application needs to perform batch updates, then the participating entities should reside in a single partition. For example, in an expense system the expense items can be inserted as a batch along with the expense report entity. In such a case ExpenseReportId will be a better partitioning key, as it ensures that the report and its items lie in a single partition.

b) Number of Partitions :- The number of partitions and their size will determine how the table performs under load. It’s very difficult to hit the sweet spot. Using multiple small partitions will generally be a good idea, since Azure can easily load balance multiple partitions across multiple partition servers. This can affect the queries though.

c) Queries :- The kind of queries in the system will drive the choice of partitioning key as well. The keys on which the queries are performed are candidates for the partitioning key.

d) Storage Operations :- If the storage operations happen seldom on the table then this is not a factor. For frequent operations partitioning key will determine how the operations are served by the service. If all the operations are served by a single server (e.g. in range partition case) then this can degrade the performance.
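
The retry strategy from point 3 can be sketched roughly as follows. This is a minimal Python illustration, not the storage client library's own retry policy: ServerBusyError and call_with_backoff are hypothetical names, and the delays are assumptions that you would tune for your workload.

```python
import random
import time


class ServerBusyError(Exception):
    """Hypothetical stand-in for the 'Server Too Busy' response from the service."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.5,
                      max_delay=30.0, sleep=time.sleep):
    """Run operation(), retrying with exponential backoff while the server is busy.

    The wait doubles after every failed attempt (0.5s, 1s, 2s, ...) up to
    max_delay, with a little random jitter so that many clients do not all
    retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ServerBusyError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller deal with it
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 10))
```

You would wrap each table operation in a function and pass it in, e.g. call_with_backoff(lambda: insert_entity(row)), where insert_entity is whatever call your client library provides and ServerBusyError is replaced by the exception that library actually raises.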

Now applying it on a live problem would be a good idea :).
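
As a small first step in that direction, here is a toy Python simulation of the range partition problem from point 5. It only illustrates the reasoning: the split points are made up, and how Azure actually chooses range boundaries is not something this sketch knows about.

```python
import bisect
import uuid
from collections import Counter


def server_for(key, boundaries):
    """Return the index of the range partition that holds key.

    boundaries is a sorted list of split points; keys below the first
    split point go to range 0, keys from the first split point up to the
    second go to range 1, and so on.
    """
    return bisect.bisect_right(boundaries, key)


def distribution(keys, boundaries):
    """Count how many of the given keys land on each range partition."""
    return Counter(server_for(key, boundaries) for key in keys)


if __name__ == "__main__":
    # Ranges 001-003, 004-006 and 007 onwards, as in the example in point 5.
    sequential = [f"{n:03d}" for n in range(9, 109)]
    print(distribution(sequential, ["004", "007"]))  # every new key hits range 2

    # GUID keys with hypothetical split points over the hex key space.
    guids = [uuid.uuid4().hex for _ in range(100)]
    print(distribution(guids, ["5", "b"]))  # inserts spread over all three ranges
```

With ever-increasing keys every new insert lands on the last range, while GUID keys scatter the inserts across all of them.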

Njoi!!!

¬Abhishek

Wednesday, June 6, 2012

Source Controlling the CRM Customizations

With the introduction of Solutions and the concept of layering and versioning in Dynamics CRM 2011, we have a pretty good ALM story which helps with patching and upgrading CRM systems in a deterministic way. However, one issue was still unsolved and remained a pain: how to collaborate on CRM customizations as a team.

So far all the CRM customizations are done on a server. If one wants to take advantage of source control to figure out what changes are being made to the customizations then the only option is to export the solution in a zip file and check it in. There’s no way to compare the zip files as they are binary. To work around this limitation a team can decide to unzip the file and check in the 3 huge xml files which are compressed inside the zip file. But then to diff such huge files is a problem in itself. Additionally there’s no way to build the solution without having a dependency on a CRM server.

The CRM team has recently released a tool called Solution Packager which can split the customizations inside a solution zip file into granular XML files, i.e. it creates an XML file for each view, form, web resource etc. You can now use source control to check in these files, and it’s easier to spot the differences between versions.

The same tool can then be used on the build server to package these individual files and generate a CRM solution thus removing the dependency on the CRM Server during the build process.

You can find more details about this tool here.

Happy Coding!!!

¬Abhishek

Wednesday, May 16, 2012

DOMDocument60 limitation : Unspecified Error while loading Xml

A customer was using DOMDocument60 to create an XML document and then using the same class to load the generated document. The document generation part of the program was working fine; however, when trying to load the document it was returning an unspecified error. The error code was -2147467259 (0x80004005), and it made no sense why DOMDocument60 would fail to load an XML document that it had generated itself.

On investigating a bit further, I found that the XML being generated had a very long attribute with a lot of XML entities (e.g. &amp;) in its value. As I started deleting these entities from the attribute value, I saw that the exact position at which the error was happening started shifting.

On playing a bit more, I found that DOMDocument60 can process a maximum of 65527 XML entities in an attribute. If you have any more than that, you start getting an Unspecified Error with error code -2147467259. I believe it might be a buffer overflow in the parser. I couldn’t find any documentation about this problem or a fix for it; however, I do believe that this is an extreme usage of XML.

As a best practice, XML attributes should have small, readable values. For anything as large as 65527 entities in an attribute value, I think an XML element is the better choice. One should refactor the XML to ensure that such values are part of an element and not an attribute.
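
If you want to spot such attributes before handing a document to the parser, a rough check along the following lines could help. This is a Python sketch rather than anything MSXML provides: it scans the raw text with regular expressions instead of parsing it, the function name is made up, and the 65527 limit is the one I observed above, not a documented value.

```python
import re

ENTITY_LIMIT = 65527  # limit observed in this post, not a documented value

# name="value" or name='value' pairs, matched on the raw, unparsed text
_ATTRIBUTE = re.compile(r"""([\w:.-]+)\s*=\s*(?:"([^"]*)"|'([^']*)')""")
# named (&amp;), decimal (&#38;) and hexadecimal (&#x26;) references
_ENTITY = re.compile(r"&(?:#\d+|#x[0-9A-Fa-f]+|\w+);")


def oversized_attributes(xml_text, limit=ENTITY_LIMIT):
    """Return (attribute name, entity count) pairs for attributes over the limit.

    This is a plain text scan, so anything that merely looks like an
    attribute (for example inside a comment) is counted as well.
    """
    found = []
    for match in _ATTRIBUTE.finditer(xml_text):
        value = match.group(2) if match.group(2) is not None else match.group(3)
        count = len(_ENTITY.findall(value))
        if count > limit:
            found.append((match.group(1), count))
    return found
```

Any attribute reported by this check is a candidate for being moved into an element.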

Njoi!!!

¬Abhishek

Tuesday, May 8, 2012

Authenticating Third party Services to CRM as Web Resources

One of the capabilities of CRM is to build a Web Resource which can help you expose a custom UI on CRM Forms or Dashboards. We can also use the web resources to call third party services and expose some interesting functionality in the web resource. This web resource can then be hosted in the CRM UI and call your WCF service to show interesting data. The problem though is how do you enable authentication on your WCF service.

CRM supports multiple authentication methods. If it’s an on-premise installation, it could be using AD, ADFS etc. for authentication. It could be an IFD deployment using a 3rd party authentication provider. If it’s CRM Online, it could be using Live authentication or authenticating through federation with Office 365.

It becomes quite a task if a 3rd party provider has to provide a web resource and ensure that it authenticates against every possible authentication mechanism that CRM could be using. I tried to get help from a few discussion groups, and here is a potential solution:

Use a shared key to authenticate the callers of the service. Store this key in a CRM entity instance and use it every time (in the WCF header) you call the WCF service hosted on your server. This key can be stored at the time one buys the license to use your web resource. If you plan to impersonate the user then let every user have a different key and map them in a database on the WCF service side. With impersonation enabled the user provisioning headaches for the administrator will increase but at least you can achieve it.

The drawback of the above solution is that the shared key is easily hackable, since it will be available on the client side as part of the JavaScript. A potential solution to that could be to have a synchronous plug-in on the server side which provisions a separate key by calling your server every time the web resource is used. It will no doubt increase the round trips and hence hurt performance to some extent. But if it needs to be really safe, then that’s a solution.
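
On the service side, the per-user key check could look roughly like this. It is a minimal Python sketch of the idea only, not WCF code: the header name X-Shared-Key, the function name and the in-memory dictionary (standing in for the database mapping mentioned above) are all assumptions made for illustration.

```python
import hmac


def authenticate(headers, user_keys, header_name="X-Shared-Key"):
    """Return the user whose issued key matches the request header, or None.

    user_keys maps a user name to the key issued for that user. Keys are
    compared with hmac.compare_digest so that the comparison time does not
    reveal how many leading characters of a guess were correct.
    """
    presented = headers.get(header_name, "")
    matched = None
    for user, key in user_keys.items():
        if hmac.compare_digest(presented.encode(), key.encode()):
            matched = user
    return matched


if __name__ == "__main__":
    # Hypothetical keys, issued when the license was bought.
    keys = {"alice": "key-for-alice", "bob": "key-for-bob"}
    print(authenticate({"X-Shared-Key": "key-for-bob"}, keys))  # bob
    print(authenticate({"X-Shared-Key": "wrong"}, keys))        # None
```

The matched user is what the service would then use for impersonation.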

It’s not as elegant as we want it to be, but it can definitely work. I would wait for CRM Online to support a single mode of authentication using Azure Active Directory, to at least make it consistent in the CRM Online world, i.e. with every CRM Online subscription you get a subscription to Azure Active Directory. Your web resource can then pass the token to the WCF service hosted by you, which can check its validity with Azure Active Directory and do the job for you.

Hope this helps!!

¬Abhishek