Wednesday, December 7, 2011

Simulating a File Watcher Task in SSIS

A typical requirement in file processing systems is a file watcher module that polls a particular folder for the arrival of a file and, once a file is available, does some processing with it. Unfortunately, SSIS does not ship with a standard task for performing this check. There is a File Watcher Task available from the Konesans site which you can use for this purpose. Another alternative is to implement the same functionality using a script task. For the sake of those who don't have the flexibility of using third-party tasks like the one above, I'm posting a solution that uses a script task to simulate the functions of a file watcher task.
The package looks like this:


As you can see from the above, the package consists of a script task, which does the file watching job, and a For Each Loop, which processes the files once they arrive. Now let's see the logic used inside the script task.

The script task has a while loop which keeps polling the directory for files and retrieves the file count. The GetFiles method of the Directory class in the System.IO namespace is used for polling the folder. Based on the count value, the script sets a boolean variable. The loop has delay logic which waits for 10 seconds between consecutive checks. Once the count goes above 0, which indicates the presence of files, the script sets the boolean variable and breaks out of the loop.
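The polling logic described above can be sketched roughly as follows. This is only an illustration in Python, not the script task code itself (a script task would be written in C# or VB.NET and would use Directory.GetFiles). The function name `wait_for_files` and its parameters are hypothetical, and the injectable `sleep` argument is an assumption added so the loop can be exercised without really waiting.

```python
import os
import time


def wait_for_files(folder, poll_seconds=10, sleep=time.sleep):
    """Poll `folder` until it holds at least one file, then return True.

    The returned flag plays the role of the boolean package variable
    that the precedence constraint checks (hypothetical naming).
    """
    files_found = False
    while not files_found:
        # Count regular files only, ignoring subfolders.
        count = sum(1 for entry in os.scandir(folder) if entry.is_file())
        if count > 0:
            files_found = True   # files have arrived, so leave the loop
        else:
            sleep(poll_seconds)  # wait before the next check
    return files_found
```

Note that a loop like this never gives up if no file arrives; a maximum wait time could be added if the package should eventually fail or move on.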
Once the boolean variable is set, the precedence constraint checks this variable and, based on its value, executes the For Each Loop, which has the following configuration:


The For Each Loop iterates through the files in the folder and, for each file, calls the File System Task to move the file to the required folder based on the file name pattern. The File System Task is set up as follows:

As shown above, the File System Task moves the file currently iterated by the For Each Loop to the destination folder pointed to by the Newfolder connection manager set up in the package. Make sure you set the OverwriteDestination property to true if file names can repeat; otherwise the task will throw an error complaining that it can't overwrite the existing file.
The Newfolder connection manager dynamically determines the destination folder based on the file type. This is achieved by setting an expression on the ConnectionString property of the destination folder connection as follows:

The logic used in the above case simply checks whether the file is of a particular type (in this example, whether the name contains "proc"). If it is, the file is moved to one folder; otherwise it is moved to another. This shows how we can conditionally process files based on file type. In a real scenario, we can add more flexibility by using variables to hold the folder values instead of hardcoding them, so that we can even change them at runtime through command line arguments or through configurations.
The above is a simple illustration of how we can simulate a file watcher using a script task. We can extend it to add more flexibility and functionality; I leave that for readers to pick up. Feel free to comment in case you need any more details on this.
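To make the routing rule concrete, here is a minimal sketch in Python of what the For Each Loop, the File System Task and the connection string expression do together. It is not the package's actual expression. The function names and folder parameters are hypothetical, the match on "proc" is assumed to be case-sensitive, and deleting an existing target is only my stand-in for setting OverwriteDestination to true.

```python
import os
import shutil


def destination_for(file_name, proc_folder, other_folder):
    """Pick the target folder: names containing "proc" go to proc_folder."""
    return proc_folder if "proc" in file_name else other_folder


def move_all(source_folder, proc_folder, other_folder):
    """Move every file in source_folder to its destination folder."""
    moved = []
    for name in sorted(os.listdir(source_folder)):
        source = os.path.join(source_folder, name)
        if not os.path.isfile(source):
            continue  # skip subfolders
        folder = destination_for(name, proc_folder, other_folder)
        target = os.path.join(folder, name)
        if os.path.exists(target):
            os.remove(target)  # stand-in for OverwriteDestination = True
        shutil.move(source, target)
        moved.append(target)
    return moved
```

Passing the folders in as parameters mirrors the suggestion above of holding them in variables rather than hardcoding them.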