Tuesday, April 22, 2008

Handling concurrency conflicts in LINQ to SQL

Apologies for the delay in getting this post out, been busy and been playing around with too many good things lately. Virtual machines, Unit Testing etc, recording some radio from the net etc. Anyways, I have been planning to write about my experiences with Conflict handling in Linq2Sql, what happens when two users want to update the infamous Product table in Northwind.

When I started playing around with it I also discovered that I needed to investigate a bit more about DataContexts, so I went reading around the web to gather info, did some experiments in my machine etc and interestingly enough I had a very related question at work where a colleague asked about the what are best practises when instantiating them. He had a static DataContext to be shared amongst his data access for a particular entity. My opinion on this is that DataContexts are cheap and light to create and that creating them for a single unit of work I do say in my GetProductByID function should be good enough. I'll say that persisiting DataContexts would create a bit of an overhead when I think that it has to keep all the mappings and ObjectTracking etc ( mind due, you can turn off ObjectTracking if you are going to do read only..). Also DCxt are not thread safe so that's another big reason not to do that. But when I started Linqing2Sql I did entertain the idea of sharing my DataContext through a singleton. Hrmm.

public voidTurnOffObjectTracking()
{
   var dal = newProductDataContext();
   // If you don't want to update stuff, don't track them.
 
dal.ObjectTrackingEnabled = true;           
   Productprod = dal.Products.SingleOrDefault(a => a.ProductID == Guid.NewGuid());
   Assert.IsNull(prod);
}

Let me not get sidetracked, it's 10:30pm and I want to go to sleep, back to our interleaving..

First of all, let's recreate the case where <  1 users are updating a record and our sys throws up a smelly. But before I do that, let's keep in mind that DataContexts are neat, I mean sometimes too neat since some stuff happens behind the scenes kind of magic, but that maybe just me :-(

For example, note how in the example below two products are obtained from the db and one of them is upated, and then without committing this change to storage, I get another instance of the same product, then I commit the original change and then compare the Products....

[TestMethod]
public void Test_DataContext_Product_Independence()
{
    using (System.Transactions.TransactionScope scope = new System.Transactions.TransactionScope())
    {
        var linq = new ProductDataContext();
        linq.Log = Console.Out;

        Guid productID = new Guid("50BD6DEB52C14D65BEC268957601EF25");
        
        // Get a Product with the ID as above
        Product product_1 = linq.Products.Single(p => p.ProductID == productID);
        
        // check that the Prod Name is as expected..
        Assert.AreEqual("Tourtière", product_1.ProductName);
        // Update the Prod Name
        product_1.ProductName = "Côte de Tourtière";
        // Let's retrieve the product again, note that the update above has not been committed to the database
        Product product_2 = linq.Products.Single(p => p.ProductID == productID);
        
        // Now commit!
        linq.SubmitChanges();

        // Note that the new name of Product2...obtained from DataContext Memory, before I committed the change! neat!
        Assert.AreEqual("Côte de Tourtière", product_2.ProductName);               
    }
}

firstTestPassed

OK, so to simulate interleaving, we'll simulate the two users trying to update a record in the db. The record look like below

ProductID / ProductName / UnitPrice

 455984DB-92DD-4BA4-87D7-91E4E2E5E00D    Camomile Tea    10.00    ....

Now our two unsuspecting users are going to try to update the Product Name for this product

[TestMethod]
public void RecreateInterleavingCondition()
{
    var user1DAL = new ProductDataContext();
    var user2DAL = new ProductDataContext();

    Guid pID = new Guid("455984db-92dd-4ba4-87d7-91e4e2e5e00d");
    Product user1Product = user1DAL.Products.Single(p => p.ProductID == pID);
    user1Product.ProductName = "Royal Camomile Tea";
    Console.WriteLine(user1Product.ProductName);

    Product user2Product = user2DAL.Products.Single(p => p.ProductID == pID);
    user2Product.ProductName = "Royal Camomile Tigris Tea";
    Console.WriteLine(user1Product.ProductName);

    user1DAL.SubmitChanges();
    Console.WriteLine(user2Product.ProductName);
    user2DAL.SubmitChanges();
}

Run the above results in: Conflict Exception!

interleavingException

User1 retrieves our Product and changes the name to "Royal Camomile Tea", not Submitting the changes, now User2 gets in there, retrieves the record changes to "Royal Camomile Tigris Tea", but User1 submits his changes and then User2 submits her changes. End result Conflict!

Now...How does LINQ to SQL handles it?

L2SQL uses what is called Optimistic concurrency is based on the un-healthy assumption that db transactions won't conflict with other transactions, that basically the chances of two users at exactly the same time wont happen that often. The alternative is pessimistic concurrency, this is where users will retrieve a record from the database and hold a lock on that record/table to prevent other pesky users to change that, problem is that is not scalable at all, systems grew larger and records where locked for ages before the sys will release the handle on it, also the emergence of the Web, the demands of stateles apps made it totally unfeasible for the pessimistic model.

LINQ to SQL uses Optimistic concurrency, what it does is that when updating a record, it checks the previous values of that record, it the values where unchanged, then is all good, otherwise it will complain and it will throw an exception and then you can handle it and choose how to handle the conflict.

In effect, if you turn Log on when Updating a record, and see the query passed onto the db you'll see that all values will be passed in the WHERE clause to compare against the columns in your table, not only the ID for the record. See below for the updates we tried to do above. MSDN has a good article on LINQ to SQL and Optimistic conc here

Royal
UPDATE [dbo].[Product]
SET [ProductName] = @p5
WHERE ([ProductID] = @p0) AND ([ProductName] = @p1) AND ([UnitPrice] = @p2) AND ([CategoryID] = @p3) AND ([CreatedDate] = @p4)
-- @p0: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [455984db-92dd-4ba4-87d7-91e4e2e5e00d]
-- @p1: Input VarChar (Size = 12; Prec = 0; Scale = 0) [Camomile Tea]
-- @p2: Input Money (Size = 0; Prec = 19; Scale = 4) [10.0000]
-- @p3: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [c9735b53-9c82-4f11-891a-8a53dc89565d]
-- @p4: Input DateTime (Size = 0; Prec = 0; Scale = 0) [17/04/2242 11:07:05 PM]
-- @p5: Input VarChar (Size = 5; Prec = 0; Scale = 0) [Royal]

Royal Tigris
UPDATE [dbo].[Product]
SET [ProductName] = @p5
WHERE ([ProductID] = @p0) AND ([ProductName] = @p1) AND ([UnitPrice] = @p2) AND ([CategoryID] = @p3) AND ([CreatedDate] = @p4)
-- @p0: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [455984db-92dd-4ba4-87d7-91e4e2e5e00d]
-- @p1: Input VarChar (Size = 12; Prec = 0; Scale = 0) [Camomile Tea]
-- @p2: Input Money (Size = 0; Prec = 19; Scale = 4) [10.0000]
-- @p3: Input UniqueIdentifier (Size = 0; Prec = 0; Scale = 0) [c9735b53-9c82-4f11-891a-8a53dc89565d]
-- @p4: Input DateTime (Size = 0; Prec = 0; Scale = 0) [17/04/2242 11:07:05 PM]
-- @p5: Input VarChar (Size = 12; Prec = 0; Scale = 0) [Royal Tigris]

DataContext implements Optimistic by default, but you can turn it off, imagine that it passed lots of parameters, this will certainly cause performance dramas or you simply could streamline it. You can fix your mappings with the UpdateCheck attribute. More info check MSDN article here and here

Ok, so the above will throw an exception when the Update query cannot find a match, it will interpret this as "oh, someone else got in first and changed something before I did! :-_(  , I'll throw a ChangeConflictException then..."

The art of handling conflicts

First of all, wrap your potentially conflict-causing code in a try/catch block and look out for a ChangeConflictException as below..

try
{
    linq_2.SubmitChanges();
    // Here the values change!
    var result = new ProductDataContext();
    Product product13 = result.Products.Single(i => i.ProductID == pID);
    linq.SubmitChanges();    
}
catch (ChangeConflictException ex)
{ ....

Once you catch the exception, you have choices of how to handle the conflict by basically examining the guilty entities and members and choosing which values to keep, be it Database, overwriting database  values  or merging changes with the database. Let's have a look first at how to retrieve the conflicting data.

DataContext objects have a property called ChangeConflicts thats gives you a ChangeConflictCollection of ObjectChangeConflict objects that eventually you can enumerate and then drill down and get at the values that are in conflict.

catch (ChangeConflictException ex)
{
    foreach (ObjectChangeConflict occ in linq.ChangeConflicts)
    {
    ...........................................
    ...........................................
Each ObjectChangeConflict object represents the conflicted instance, in plain english it encapsulates the update attempt that caused the conflict. 
We can then iterate over the entities that "participated" in that conflict and get specific information about them. You do this by accessing the MemberConflicts collection of ObjectChangeConflict.
catch (ChangeConflictException ex)
{
    foreach (ObjectChangeConflict occ in linq.ChangeConflicts)
    {
        foreach (MemberChangeConflict mcc in occ.MemberConflicts)
        {
            Console.WriteLine("Original: " + mcc.OriginalValue);
            Console.WriteLine("Database value: " + mcc.DatabaseValue);
            Console.WriteLine("CurrentValue: " + mcc.CurrentValue);
        }

Here is where it starts to get juicy. See how above we can get the different values for a particular entity through the "stages" of the conflict. This information could be very useful, you might want to present this details to the user and let her make a decision of what to submit to the database in  a grid or something...right?

But of course, you can explicitly resolve the conflicts by once the error is detected, choosing to keep current values, database values or merge. Let's have a look at it.

Overwriting Database values

First up we are going to solve the conflict by merging the database values with User1 only since User1 got in there last, he is the last committer (is there such a work in english?), User2 won't see anything since her changes when in and the Optimistic handling didn't detect any changes. So the conflict is a conflict from the context of User1, he created the conflict, he is the last in so he has to tell the database what to do with the changes..The database if it talked, it could ask the question.."oh..you are last in...what do you we do with your changes...", That's the way I see it. :-)  Shut up and show me code please ... thanks.

[TestMethod]
public void CheckProductNamesWithTwoDataContexts()
{           
    var linq = new ProductDataContext();
    var linq_2 = new ProductDataContext();
    Guid pID  = new Guid("455984db-92dd-4ba4-87d7-91e4e2e5e00d");

    Product p1 = linq.Products.Single(p => p.ProductID == pID );
    p1.ProductName = "Red Tea";
    p1.CategoryID = new Guid("00000000-0000-0000-0000-000000000000");
    
    Product p2 = linq_2.Products.Single(p => p.ProductID == pID);
    p2.ProductName = "Green Tea";
    p2.CategoryID = new Guid("90000000-1111-1111-1111-000000000000");
    p2.UnitPrice = 10.0M;
    
    try
    {
        linq_2.SubmitChanges();
        var result = new ProductDataContext();
        linq.SubmitChanges(ConflictMode.ContinueOnConflict);  
    }
    catch (ChangeConflictException ex)
    {
        foreach (ObjectChangeConflict occ in linq.ChangeConflicts)
        {
            foreach (MemberChangeConflict mcc in occ.MemberConflicts)
            {
                Console.WriteLine("Original: " + mcc.OriginalValue);
                Console.WriteLine("Database value: " + mcc.DatabaseValue);
                Console.WriteLine("CurrentValue: " + mcc.CurrentValue);
            }  
            linq.ChangeConflicts.ResolveAll(RefreshMode.KeepCurrentValues);
        }
        linq.SubmitChanges(); 
    }
}

Quite a lengthy function yeah, but let's see what happens. User1 retrieved the Product, changed the name and the CategoryID, User2 changed the name, CategoryID and UnitPrice. Now note that I called ResolveAll of the ChangeConflict fame from our DataContext, in this case I called it linq. ( You can also call ObjectChangeConflict's Resolve function) This function takes an enum that could be KeepCurrentValues, KeepChanges and OverwriteCurrentValues. In this case we use KeepCurrentChanges, which means we'll keep our changes for the data that clashed with User2, anything else that User2 changed will be ignored and the database defaults will be reinstated. I suppose the way to interpret it is KeepCurrentValues keeps the current values in the DataContext, in this case User1. Note: Make sure once you handle the exception any way you see fit, make sure you SubmitChanges() again for that DataContext otherwise nothing will happen! :-)

Also note that you can choose also when to handle the exceptions. Say you have a lot of changes that the user made and it will be too cumbersome for them to re-enter all those values again. Since SubmitChanges accumulates all conflicts in a batch, the number of changes could be huge.. Good news is that SubmitChanges() can be overloaded to take a ConflictMode enum that will either tell the instance of DataContext to either throw the ConflictException as soon as it detects it (thus minimising the potential hassles for users) called FailOnFirstConflict, or to keep accumulating and attempting to update data and then report the whole lot (ContinueOnConflict).

  ProductName CategoryID UnitPrice
Database row (originally) Camomile Tea E1422E86-.... 0.00
User 1 Red Tea  Guid.Empty  
User 2 Green Trea Guid.NewGuid 10

Result is:

  ProductName CategoryID UnitPrice
database RedTea Guid.Empty 0.00

Note that User1 values went it , User2 data was thrown out and even though in UnitPrice there wasn't a clash between the users, the database value was the chosen one!

Retaining Database values

This is in fact the opposite as our previous example, User2 values are merged with the database and User1's values are chucked out of the window. Basically User1 says, "I'll let the other guy's values go through..I'll comeback to it again and see where my data fits with hers..". The term OverwriteCurrentValues, the way I interpret it that is read from the point of view of User1, "Overwrite my current values, therefore since User2 has committed and are in the db, keep them..".  This is achieved by using:

linq.ChangeConflicts.ResolveAll(RefreshMode.OverwriteCurrentValues);

Table please...

  ProductName CategoryID UnitPrice
Database Camomile E1422E86 0.00
User1 Red Tea Guid.Empty  
User2 Green Tea NewGuid 10.0
  ProductName CategoryID UnitPrice
  Green Tea NewID 10.0

Merging Conflicts with Database values

Using ResolveAll(RefreshMode.KeepChanges) means keeps all the changes in the DataContext and merge them with the database, overwritting the row values if necessary.

  ProductName CategoryID UnitPrice
Database Camomile E1422E86 0.00
User1 Red Tea    
User2 Green Tea NewGuid 10.0

The results are as expected, User1 keeps his Red Tea, and User2 keep hers NewGuid and UnitPrice, in effect the merging has been "between" the users only.

  ProductName CategoryID UnitPrice
  Red Tea NewID 10.0

 

Apologies for the delay in getting this one out, I have been busy and I was writing this up when we noted our beautiful and very lively ( and vicious) pet parrot passed away last Thursday night.  Very sad and it all was all of a sudden. I hope the poor fella didn't suffer and I certainly miss his singing in the mornings. Lesson learned though, birds should be free and while they are very entertaining I don't think is fair to keep them [birds] inside for our selfish gratification.

I certainly enjoyed writing this post and playing around with the DataContext, generating errors etc. There is ton of info on this in MSDN, I certainly think the guys have done a great job in documenting Conflict resolution in LINQ to SQL.

Place for you to start would be:

http://msdn2.microsoft.com/en-us/library/bb399389.aspx

1 comment: