Sunday, May 5, 2013

Checking database access rights for a user in a sqlserver instance

There was a recent requirement in one of my projects to check the access for particular user in all the databases in a server. Thought of sharing the script used through the blog for quick reference.
 The solution makes use of HAS_DBACCESS function available in SQL Server. The script would be as below


SELECT name, CASE HAS_DBACCESS(name) WHEN 1 THEN 'Has Access' ELSE 'No Access' END AS Access
FROM sys.databases

The catalog view sys.databases will have a record for each database within the instance. The above script will give as the output a list of databases with their access information based on result HAS_DBACCESS returns. It will return 1 for all databases accessible to current user and 0 otherwise.
The system databases can be filtered by adding a condition such as below to get only the user databases information.


SELECT name, CASE HAS_DBACCESS(name) WHEN 1 THEN 'Has Access' ELSE 'No Access' END AS Access
FROM sys.databases
WHERE  name NOT IN ('master','tempdb','model','msdb')
AND name NOT LIKE 'ReportServer%'


Sunday, April 28, 2013

Programaticaly executing packages in SSIS 2012 project deployment model


The usual way to execute SSIS packages programmatically was by invoking Dtexec utility until SQL 2012. With the arrival of project deployment model in SSIS 2012 we can now rely upon catalog view and system stored procedures for executing SSIS packages deployed to SSISCatalog relational DB. This post discusses the approach that can be followed for executing package deployed in project model  programmatically and then checking the execution status for setting correct execution sequence thereafter.

The scenario was like this. There was a data feed coming from third party which had to be first uploaded to a table in the database. Then based on the outcome of this data load there were a series of data loads which included some lookup logic to the first table and based on that do some data loading into the warehouse tables. The initial data load was not automatic but had to be manually driven by mean of upload screen.

The approach followed is given below.
It consists of creating an execution for the main package and then starting it. Once started, a logic was written to monitor the execution result. Once the result is returned ie package execution finished, result is checked whether it was success. If yes, the execution will proceed with rest of the steps otherwise it will stop the procedure raising an error.
The stub for the solution is given below
The package should be deployed to SSISDB catalog as shown below

Then you can use code as given below



DECLARE @exec_id BIGINT

    EXEC [SSISDB].[catalog].[create_execution] 
        @package_name=N'Package.dtsx',     --SSIS package name TABLE:(SELECT * FROM [SSISDB].internal.packages)
        @folder_name=N'MyTestDeploy', --Folder were the package lives TABLE:(SELECT * FROM [SSISDB].internal.folders)
        @project_name=N'Integration Services Project2',--Project name were SSIS package lives TABLE:(SELECT * FROM [SSISDB].internal.projects)
        @use32bitruntime=TRUE, 
        @reference_id=NULL,             --Environment reference, if null then no environment configuration is applied.
        @execution_id=@exec_id OUTPUT   --The paramter is outputed and contains the execution_id of your SSIS execution context.

EXEC [SSISDB].[catalog].[start_execution] @exec_id

SELECT @exec_id


DECLARE @exec_status int,@end_time datetimeoffset

WHILE @end_time IS NULL
BEGIN
SELECT @end_time = end_time,@exec_status=status
FROM [SSISDB].internal.execution_info
WHERE execution_id = @exec_id
IF @end_time IS NOT NULL
BEGIN
 IF @exec_status != 7
PRINT ' Package Failure'
ELSE 
PRINT 'Success'
         -- Other tasks to follow

    END
WAITFOR DELAY '00:00:15'

SELECT 'Checking loop'
END

This logic can be used for checking the execution status and then determining the sequence of operation for the rest of the steps.

Thursday, April 11, 2013

Demystifying LAST_VALUE function in SQL 2012

The impetus for writing this blog is the ambiguity existing regarding the documentation of LAST_VALUE() function in books online/ MSDN.
Recently I was playing around with new analytic functions of SQL 2012. Was trying to implement the LAST_VALUE functionality when I noticed the below discrepancy.
Consider the below table as an example
I've just shown first and last few records as table has about 1000 rows

Now I'm using below query to get previous,next,first and last prices for each product


SELECT OrderDesc,OrderDate,ProductName,Qty,ItemPrice,
LAG(ItemPrice,1) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS PrevPrice1,
LEAD(ItemPrice,1) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS NextPrice1,
FIRST_VALUE(ItemPrice) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS StartingPrice,
LAST_VALUE(ItemPrice) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS LastPrice
FROM TestTable
WHERE ProductName = 'Product10'
ORDER BY OrderDate

The output for the query is as follows (showing first few rows)

On analyzing the output we can see that FIRST_VALUE() function returns the first value of ItemPrice field in the partition (in this case based on ProductName) which is 201300 whereas for LAST_VALUE result we get the current ItemPrice value repeated for each of the row. This surprised me and I referred to MSDN documentation on the same which is below link
http://msdn.microsoft.com/en-IN/library/hh231517.aspx
 But I couldn't find out a satisfactory explanation for above behavior. I looked for related links and came across the below link which explains details on OVER clause
http://msdn.microsoft.com/en-IN/library/ms189461.aspx

Particularly have a look at below section
If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as default for window frame. This applies only to functions that have can accept optional ROWS/RANGE specification. For example, ranking functions cannot accept ROWS/RANGE, therefore this window frame is not applied even though ORDER BY is present and ROWS/RANGE is not.

This had the bell ringing for me.It turned out that I ignored the ROWS/RANGE clause in above LAST_VALUE function so it defaulted to the window RANGE UNBOUNDED PRECEDING AND CURRENT ROW. This means for every row in the resultset the window considered was from first value of window (201300 ) till the current row under consideration. This was the reason for returning the same value as last value for each row. FIRST_VALUE still worked fine as window started with the initial value of the partition.
Now that we got the interpretation we can think what change we need to do to the above query to get our intended result. The solution is to consider the entire partition as the window for getting correct last value.
So rewriting the query we will get


SELECT OrderDesc,OrderDate,ProductName,Qty,ItemPrice,
LAG(ItemPrice,1) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS PrevPrice1,
LEAD(ItemPrice,1) OVER (PARTITION BY ProductName ORDER BY OrderDate) AS NextPrice1,
FIRST_VALUE(ItemPrice) OVER (PARTITION BY ProductName ORDER BY OrderDate ROWS between unbounded preceding and unbounded following) AS StartingPrice,
LAST_VALUE(ItemPrice) OVER (PARTITION BY ProductName ORDER BY OrderDate ROWS between unbounded preceding and unbounded following) AS LastPrice
FROM TestTable
WHERE ProductName = 'Product10'
ORDER BY OrderDate

Lets now analyze the results of modified query



Comparing this to the initial table result we can see that it now picks last value as 89000 within product partition for Product 10 which is correct.
Thought of putting this in a blog as I couldn't see any documentation which clearly explains the reason for this observed behavior.
Feel free to comment if you need any more details on this.
The table creation scripts for above used table can be found in the below link
https://docs.google.com/file/d/0B4ZDNhljf8tQTVNRemFoZE5aSDA/edit?usp=sharing

Wednesday, April 10, 2013

Implementing Checkpoints in SSIS

Checkpoints indicate a consistent state of a database. This is essential for database to perform actions like writing to log file, truncating log etc.
The concept of checkpoints have also been implemented in SSIS packages. Checkpoints will indicate a saved consistent state in package sequence which can be utilized in quite a few scenarios. This blog discusses one such practical scenario.
Consider the case where you've a SSIS package which involves a step which is resource intensive. This would obviously be a time consuming step. Imagine an intermediate failure in package after this intensive step.Rerunning the step would require again spending same amount of time which may not always be desirable.In such cases adding checkpoint support can always avoid rerunning these intensive steps again due to a failure in one of the succeeding steps. Checkpoint will ensure the packages last consistent state (last task which was success) is stored and any further execution will only start from last failed step.
 The example package for the illustration will be as follows

The package consists of two data flow tasks followed by execute sql task. The data flow tasks look like below
The first data flow task as shown above consists of flat file source to transfer data from text file to sql server table using OLEDB destination.
The second data flow task will  get product details and then retrieves associated order details using a lookup task and populated result to final table


The flat file will have data as follows


The product table will have following entries

As you see from above there are few products without any associated orders like ones with ID value 1 and 6. To mimic an error situation i've set the lookup nomatch property as below for the lookup task in the second data flow

This will ensure the task will fail when it doesn't find matching records for products with id values 1 and 6. This can be utilized for illustrating effect of checkpoint.
For enabling checkpoint support in the package we need to set following properties



  • CheckpointFileName should be pointed to valid filepath. This will contain checkpoint information as a xml file
  • CheckpointUsage should be set to IfExists
  • SaveCheckpoints to true

For all the other tasks involved FailParentOnFailure property should be set to true and for the outer container FailPackageOnFailure should be set to true.
Now lets try running the package. The package progress will be as follows
And we will get error information as expected telling that lookup failed
Now lets go ahead and fix the error by changing lookup property to redirect row instead of fail component

Lets now try rerunning the package and we will be amazed to see the result which will be as below

This was not something we expected as it completely ignored the second data flow task which failed in previous run. The reason lies in the previous execution result which is as below
As you see checkpoint has recorded the task as completed which is why it was ignored in the rerun. This is because the failure happened midway in the data flow task.This can be handled by wrapping the data flow inside a transaction. This is done by setting value as Required for TransactionOption property of data flow task. This will make sure data flow task will always initiate a transaction and will record result as failure unless it goes to completion successfully till the last step within it.
One more thing to keep in mind is to start the distributed transaction coordinator service if you want to use transaction from the local services of your machine (services.msc from command prompt).

After wrapping it in transaction and getting the earlier failure if you rerun you'll get below

As you see from above, it now resumes correctly from the last failed step and goes to completion. The reason is now data flow task started a brand new transaction each time and any intermediate failure will cause entire data flow to be failed and setting status accordingly in checkpoint file.
This is something you can apply in similar scenarios where you don't want to rerun some intensive high volume steps due to a failure happening on a succeeding task.
So to summarize the below are things you need to keep in mind for implementing checkpoints in SSIS

  • Set the Checkpoint related properties for package as outlined above
  • Set FailPackageOnFailure property for individual tasks and FailPackageOnFailure for containers
  • Set TransactionOption property to Required for all data flow tasks and start DTC service
Hope this post would help you to successfully implement checkpoints in ssis package.
As always do let me know of any feedback or any clarification you may have on the above.

Friday, March 29, 2013

Error Handling in SSIS loops

This is an extension to the below post
http://visakhm.blogspot.in/2012/05/package-to-implement-daily-processing.html
where I explained the usage of looping logic in SSIS to do automated file processing on daily basis. One of the common issues we face in above scenario is ability to handle error files within the loop. The default behavior is package failing with error when it encounters a file with errors. But in reality we might want to continue processing rest of the loop after recording the error file details. This blog post will give you an approach which can be used to handle such scenarios.
The package would look like below

The steps involved are

  • Truncate step to truncate the processing tables beforehand for each run
  • For each loop to loop through files for processing and do following activities:-                          Capture audit start info,do actual data transfer and capture audit end info

The source files are as follows


I've added files with error information (wrong metadata) to mimic the scenario. I've also added a OnError event handler in package for data flow task to capture error file details as follows


The audit information would be captured via execute sql tasks to a table called FileProcessControl.
The flat file connection has metadata setup as follows

Any file having metadata different from above will go to error processing path.
The default package behavior of failing on error file processing can be overridden as follows
In the OnError Event handler for data flow task select variables and click on system variables. Then change value for propagate variable from True to False as shown below

This will ensure the error doesn't get propagate to cause package failure. On executing package the errors will not cause loop to break.

If we check the audit table we can see that it has captured error details and continued with next file

The result shows package executing all files regardless of intermediate error file and also captured status in audit table.
Hope this logic can be used in similar scenarios for error handling within loops in SSIS .
There are also other properties you can take advantage of for error processing like ForcedexecutionResult, ForcedExecutionValue etc depending on your scenario.
Feel to free to revert if you need any more info on this. In case you need the full package drop a comment and I'll share it for you.