Monday, March 8, 2010

Using "quirky" updates to develop well performing set based solutions

This blog is to illustrate a scenario where 'quirky' update can be used efficiently to provide an optimized set based solution.
Recently I was asked by one of guys in a forum to provide an optimized solution for the below scenario.

Consider the case of Order and Despatch system where Items are ordered and then get despatched from despatch section. The requirement was to consolidate the despatch quantity against Orders raised. The solution involved aggregating total despatched quantity for each item code and consolidating them against outstanding orders in the sequence of ordering based on order date.
The tables involved can be given as

(OrderNo varchar(10),
OrderDate datetime,
Item_Code varchar(10),
OrderQty int,
DespatchedQty int
DespatchNo varchar(10),
Item_Code varchar(10),
Despatch_Qty int

VALUES ('abc','2010-01-23','Item1',2500,0),

VALUES ('dd1','Item1',300),

CREATE CLUSTERED INDEX IDX_OrderStock ON OrderStock(Item_Code,OrderDate,OrderNo)

The usual way to deal with this is to use two cursors to loop through each of item codes, accumulating the despatch counts and then allocating them against available orders in desired order ie. sequence of order date. The solution will be as below

SELECT * FROM OrderStock

DECLARE @Item_Code varchar(10), @OrderNo varchar(10), @OrderDate datetime,@Allocated int,@despatched int,@AllocatedSum int,@message varchar(80)

DECLARE depatch_cursor CURSOR FOR
SELECT Item_Code, SUM(Despatch_Qty)
FROM Despatch
GROUP BY Item_Code

OPEN depatch_cursor

FETCH NEXT FROM depatch_cursor
INTO @Item_Code, @despatched

SELECT @message = '----- First Cursor: ' +

PRINT @message

-- Declare an inner cursor based
-- on item code from the outer cursor.

SELECT OrderNo,OrderDate
FROM OrderStock
WHERE Item_code = @Item_Code -- Variable value from the outer cursor

OPEN order_cursor
FETCH NEXT FROM order_cursor INTO @OrderNo,@OrderDate

PRINT ' <>'


SELECT @message = ' ' + @OrderNo
PRINT @message
UPDATE OrderStock
SET @Allocated=DespatchedQty=CASE WHEN @Despatched > OrderQty THEN OrderQty ELSE @Despatched END,
@Despatched = @Despatched-@Allocated
WHERE OrderNo=@OrderNo
AND OrderDate=@OrderDate
FETCH NEXT FROM order_cursor INTO @OrderNo,@OrderDate

CLOSE order_cursor
DEALLOCATE order_cursor

-- Get the next despatch record.
FETCH NEXT FROM depatch_cursor
INTO @Item_Code, @despatched
CLOSE depatch_cursor
DEALLOCATE depatch_cursor

SELECT * FROM OrderStock

Now see the results before and after update

before the update

After the update

The problem with this approach is use of two cursors which does the row by row processing and can be quite slow.
Now lets see how we can approach the same problem using quirky update.

Quirky update
Quirky update works on basis of clustered index available on table. By using UPDATE SET construct it updates each row in table and at same time sets the next values for variables in iteration.It continues this until it updates all rows in table
See the detailed solution below

DECLARE @Item_Code varchar(10), @OrderNo varchar(10), @OrderDate datetime,@Allocated int,@despatched int,@AllocatedSum int

SELECT TOP 1 @Item_Code =Item_Code, @OrderNo= OrderNo, @OrderDate = OrderDate
FROM OrderStock
ORDER BY OrderDate

SET @Despatched=CASE WHEN COALESCE(@Allocated,0) = 0 THEN TotalDespatched ELSE TotalDespatched -@AllocatedSum END,
@Allocated=o.DespatchedQty=CASE WHEN @Despatched > OrderQty THEN OrderQty ELSE @Despatched END,
@AllocatedSum= CASE WHEN o.OrderDate=LatestOrderDate THEN 0 ELSE COALESCE(@AllocatedSum,0) + @Allocated END,
@Item_Code =Item_Code,@OrderNo= OrderNo, @OrderDate = OrderDate
FROM OrderStock o (TABLOCKX)
CROSS APPLY(SELECT SUM(Despatch_Qty) AS TotalDespatched
FROM DEspatch
WHERE Item_Code=o.Item_Code
CROSS APPLY(SELECT MAX(OrderDate) AS LatestOrderDate
FROM OrderStock
WHERE Item_Code=o.Item_Code
SELECT * FROM OrderStock

Now see the output

As we see from above we get same output. The advantage here is that we have dispensed with two cursors and replaced them with single set based update. Thus quirky updates proves to be really helpful in situations like this to provide us with set based solution.

A few things to note in above query are
1. We need a clustered index on columns based on which we want operation to take place sequentially
2. Better to use TABLOCKX to place a single exclusive lock on table
3. MAXDOP is set to 1 to avoid parallelism
4. The assignment statements to set clustered indexed columns are needed to remember current row values for proceeding with next row each time

Thus by means of applying quirky update we were able to get a more optimized solution