Tuesday, January 15, 2013

DQS Knowledge Discovery Project - Issues with geometric datatypes

Recently I was working on a data cleansing project using Microsoft Data Quality Services (DQS) in SQL 2012 and was trying to create a Knowledge discovery project by setting up a domain from a SQLServer table. 
The domain was created for cleansing out the US state names. I downloaded the US census data script for sqlserver from CodePlex and applied it to the server. It had state names along with their geographic location details.

Whilst trying to map the domain to table field I got the below error

I analyzed it carefully and found out that it was because of the presence of geographic datatype column in the table. I went ahead and created a view out of the table selecting only required StateName column and tried mapping it in DQS and this time it worked! So the issue was due to the presence of geometry related datatypes in table used for domain mapping. and workaround is to create a view avoiding those columns and use it as a source for DQS.
One more thing I found was any of the BLOB data types like varchar(max) cannot be used as a mapping column inside the DQS for a domain especially when used inside SSIS package.

Hope this blog would come handy to anyone who encounters similar issues while configuring DQS. For any further clarification feel free to revert. I would be happy to answer your questions.