Monday, October 28, 2013

Generate Format Files Based on Table Metadata

Recently I had blogged about advantages of using format files in bcp to generate custom formatted output files.
I got lot of responses on this post and many of them were asking whether there's any easy way to generate a format file for exportin a table to file. I'm explaining a method using bcp command to generate a format based on table metadata which can be utilized for the data export through this blog.
Consider the following table for our illustration

The format file for this table can be generated using bcp command. bcp has a lesser known mode called format which can be used to generate format file. The command will look like below

bcp DBName.dbo.Customers format null -c -t -f  Fullpath\filename.fmt -S "ServerName" -T

Assuming windows authentication being used. In case of sql authentication use -U -P switches to pass username and password as

bcp DBName.dbo.Customers format null -c -t -f  Fullpath\filename.fmt -S "ServerName" -U Username -P Password

Notice the null after format. This is required for the format mode in bcp. Once executed this command will generate format file for the table metadata in the specified format and the file will look like below.

As you see from above the file will contain details on all the columns contained in table (26 columns) and will specify the information in following order
1. Ordinal Position 
2. Data type
3. Prefix Length
4. Field Data Length
5. Row and Column terminators
6. Mapping  Column Position
7. Column Name
8. Collation (in case of character data)
One thing you may notice in the above file is that data type for all fields are interpreted as SQLCHAR. This is because of usage of -c switch in bcp which will consider all as character data. If you want to preserve native data types use -n switch instead and you'll get output like below

This clearly shows columns being interpreted using native data types like SQLNCHAR, SQLIMAGE, SQLDATETIME, SQLBIT etc. Also one other thing you may notice is that it will put prefix length based on the datatype (like 2 for SQLNCHAR,4 for SQLIMAGE etc) and ignore terminator information while in native mode.
In SQLServer format files can be generated in two formats - XML and Non XML. 
So far we were generating format files in Non XML format (extension .fmt). Making a small change in bcp command we can also generate format files in xml format. The command in that case would look like

bcp DBName.dbo.Customers format null -c -f \filename.xml -S "ServerName" -x -U Username -P Password 

The only changes we do are specifying filetype as .xml extension and also using -x switch. The file generated in that case would be as  below

It will have two sections one giving details on Field Length,Terminator etc and second part giving Data type, Name etc
Once you generate the format file based on the table metadata then this can be very easily tweaked as per your requirement to suit your incoming data format as shown in the earlier post. 
For example consider the case where you've a field with data enclosed within " character you can get data exported correctly by using the terminator as "\t\"" for previous column and using terminator as "\"\t" for the current column assuming tab being the default delimiter used.
Hope this helps to understand how bcp can be utilized to generate format files which can be utilized in data import/export operations to get custom formats.