Friday, July 8, 2016

Whats New in SQL 2016 - Native JSON Function Support - Parsing JSON Document to Shred Relational Data

Previously I had blogged about the native support for JSON in SQL 2016 and the ability to generate JSON documents from relational data using FOR JSON syntax here .
This blog is next in the series on JSON support and explains about how you can use T-SQL code to parse and shred out data from existing JSON documents to load data to relational database tables.

Illustration

Consider the below JSON document 

declare @x nvarchar(max)='{
  "Item":{
      "ItemID": 101,
      "ItemDesc": "Monitor CRT 18\"",
      "Unitprice": 200
    }}'


Now lets see how we can shred out the data from this simple JSON element
We will make use OPENJSON function available in SQL 2016 for this purpose. 
The basic syntax for OPENJSON is as below

OPENJSON(jsonexpression,path)

OPENJSON accepts a JSON document expression as the first argument and parses it based on path specified within the second argument
So in the above case we will write the query as below to parse the JSON document

SELECT *
FROM OPENJSON(@x,'$.Item')

The output will look like below

key                         value                        type
----                           --------                           ---------
ItemID                 101                                 2
ItemDesc          Monitor CRT 18"  1
Unitprice         200                                 2

This will be the standard format in which output is returned. If you analyze the result you will see that it consists of three columns first one giving the key names inside JSON document, second column giving the values corresponding to the keys and third one giving the data types of the values.  The full list of type values along with the corresponding data types can be seen in the below link


If you want the get the Key names as columns along with their corresponding values  in the output you can use the optional WITH clause along with OPENJSON to transpose the result
The query in that case would look like this

SELECT *
FROM OPENJSON(@x,'$.Item')
WITH
 (
ItemID int,
ItemDesc varchar(50),
Unitprice int
)

And result would be the below

ItemID            ItemDesc                 Unitprice
------------          -----------------               ------------------
101                    Monitor CRT 18" 200


Parsing JSON Array

Now lets see what happens in the case of document with an array of JSON values

declare @x nvarchar(max)='{
  "Items": [
    {
      "ItemID": 101,
      "ItemDesc": "Monitor CRT 18\"",
      "Unitprice": 200
    },
    {
      "ItemID": 110,
      "ItemDesc": "Printer Catridge",
      "Unitprice": 112
    },
    {
      "ItemID": 112,
      "ItemDesc": "Copier Ink",
      "Unitprice": 20
    },
    {
      "ItemID": 123,
      "ItemDesc": "Wireless Mouse",
      "Unitprice": 30
    }
  ]
}'


SELECT *
FROM OPENJSON(@x,'$.Items')
WITH 
(
ItemID int,
ItemDesc varchar(100),
Unitprice int
)

The above query will give you this result




The above examples shows how you can use OPENJSON to parse JSON documents to obtain relational data.

Nested JSON Documents

Now lets see what happens in the case where there are nested JSON documents. In such case a single OPENJSON wont be sufficient to extract the whole data.
Consider the below document as an example

declare @json nvarchar(max)='{
  "Orders": [
    {
      "OrderDesc": "Order 1",
      "OrderDate": "2013-02-12T00:00:00",
      "ShippedDate": "2013-02-20T00:00:00",
      "WebOrder": false,
      "OrderItems": [
        {
          "OrderQty": 10,
          "Item": [
            {
              "ItemDesc": "Wireless Mouse",
              "Unitprice": 30,
              "SubItem": [
                {
                  "SubItemDesc": "SP17"
                }
              ]
            }
          ]
        },
        {
          "OrderQty": 8,
          "Item": [
            {
              "ItemDesc": "Copier Ink",
              "Unitprice": 20,
              "SubItem": [
                {
                  "SubItemDesc": "SP5"
                },
                {
                  "SubItemDesc": "SP12"
                }
              ]
            }
          ]
        }
      ],
      "Customer": [
        {
          "CustomerDesc": "KBC Associates"
        }
      ]
    },
    {
      "OrderDesc": "Order 2",
      "OrderDate": "2015-06-24T00:00:00",
      "ShippedDate": "2015-06-30T00:00:00",
      "WebOrder": false,
      "OrderItems": [
        {
          "OrderQty": 12,
          "Item": [
            {
              "ItemDesc": "Printer Catridge",
              "Unitprice": 112,
              "SubItem": [
                {
                  "SubItemDesc": "SP1"
                },
                {
                  "SubItemDesc": "SP3"
                }
              ]
            }
          ]
        }
      ],
      "Customer": [
        {
          "CustomerDesc": "ABC inc"
        }
      ]
    },
    {
      "OrderDesc": "Order 5",
      "OrderDate": "2016-05-17T00:00:00",
      "ShippedDate": "2016-05-22T00:00:00",
      "WebOrder": true,
      "OrderItems": [
        {
          "OrderQty": 28,
          "Item": [
            {
              "ItemDesc": "Monitor CRT 18\"",
              "Unitprice": 200
            }
          ]
        }
      ],
      "Customer": [
        {
          "CustomerDesc": "AAM & Co"
        }
      ]
    }
  ]
}'

The above is an example of  a JSON document with nested JSON elements going till 4th level. In such a case we need to use series of nested OPENJSON constructs to get data from JSON document as column data.
The query will look like this

SELECT t.OrderDesc,
t.OrderDate,
t.ShippedDate,
t.WebOrder,
oi.OrderQty,
c.CustomerDesc,
i.ItemDesc,
i.Unitprice,
si.SubItemDesc
FROM OPENJSON(@json,'$.Orders')
WITH (
OrderDesc varchar(20),
OrderDate datetime,
ShippedDate datetime,
WebOrder bit,
OrderItems nvarchar(max) AS JSON,
Customer nvarchar(max) AS JSON
)t
CROSS APPLY OPENJSON(OrderItems,'$')
WITH
(
OrderQty int,
Item nvarchar(max) AS JSON
)oi
CROSS APPLY OPENJSON(Customer,'$')
WITH
(
CustomerDesc varchar(100)
)c
CROSS APPLY OPENJSON(Item,'$')
WITH
(
ItemDesc varchar(100),
Unitprice decimal(30,2),
SubItem nvarchar(max) AS JSON
)i
OUTER APPLY OPENJSON(SubItem,'$')
WITH
(
SubItemDesc varchar(50)
)si


And you will get the output as below


If you analyze the query you can see that it uses a series of nested calls to OPENJSON to parse out the data at different levels within the JSON document.
 Each level it applies the WITH option to return the individual element values as well as returns the next level elements in JSON format itself using AS JSON option which is then parsed by next OPENJSON call by using CROSS or OUTER APPLY to pass the JSON value as a correlated parameter to the OPENJSON. This is continued till the last level to return the complete set of values from within each of JSON documents at the different levels. Depending whether you want a perfect match or not you can use CROSS or OUTER as the option for the APPLY operator. 

JSON Path vs XPath

If you compare the json path expressions used in the above examples against xpath expression you use for shredding XML data you can see that the basic format is very similar. Here we use $ to denote the root element as compared to / in case of XML. Also each child member is represented using . as  compared to / in XML. So anyone who has experience in writing XML XPath expressions in SQLServer will find it easy to follow JSON path expressions

Conclusion

The above illustrations shows how OPENJSON can be used to parse JSON and shred data into relational tables.
This method can be used to get data in JSON format from the applications and then parse and shred the data to populate the tables.