Pages

Sunday, March 27, 2011

Using FOR XML to Flatten Detail Rows

Here is a technique to flatten multiple detail rows into a single row using the FOR XML clause of the SELECT statement. FOR XML is used to convert a record set into an XML document, but, as we shall see, it has other non XML related uses as well. As an example we will take a look at the sys.indexes and sys.index_columns views to generate a list of all the indexes in your database.

We will start with a simple query against sys.indexes:

SELECT OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) 'object_name'
, i.name 'index_name'
, i.type_desc 'index_type'
FROM sys.indexes i
WHERE i.type > 0 AND OBJECT_SCHEMA_NAME(i.object_id) != 'sys'
ORDER BY 'object_name', i.index_id


This query returns a simple list of all the indexes in your database, excluding objects in the sys schema, with one row per index. Although this is a quick and handy way to see what indexes you have, and which objects they belong to, it does not show you which columns are involved in each index. To get this detail information we need to include the sys.index_columns and sys.columns views:

SELECT OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) 'object_name'
, i.name 'index_name'
, i.type_desc 'index_type'
, c.name 'index_key_column'
FROM sys.indexes i
JOIN sys.index_columns ic
    ON ic.object_id = i.object_id
    AND ic.index_id = i.index_id
JOIN sys.columns c
    ON c.object_id = ic.object_id
    AND c.column_id = ic.column_id
WHERE i.type > 0 AND OBJECT_SCHEMA_NAME(i.object_id) != 'sys'
ORDER BY 'object_name', i.index_id, ic.key_ordinal


Although we can now see which columns are involved in each index this result set is difficult to read because it now returns one row per index column instead of the one row per index result set we got from the first query. The reason, of course, is because there is a one-to-many (1:N) relationship between sys.indexes and sys.index_columns. This is an example of a master-detail record set where we get one row from the master side of the Join operation, and multiple rows from the detail side. When these record sets are combined using an Inner Join we end up with one row in the combined record set for each row from the detail side. But what we want to make our result set more readable is to "flatten" the detail rows that represent the index columns so that we can again have one row in the result set for each index. We could do this by converting the index column names into a comma separated list but that would normally require converting our simple query into a T-SQL script using variables and control flow expressions. Another option would be to use multiple Left Outer Joins of the sys.index_columns view for each possible column involved in the index and then concatenate the names into a single column but since SQL Server allows you to define up to 16 columns as index keys that would result in a very long and ugly (not to mention slow) query. This is where the FOR XML clause comes in. To see how this works we will first extract the detail side of the Join and use FOR XML with the PATH mode to create a simple XML document of all the detail rows:

SELECT ic.object_id, ic.index_id
, c.name 'index_column'
FROM sys.index_columns ic
JOIN sys.columns c
    ON c.object_id = ic.object_id
    AND c.column_id = ic.column_id
FOR XML PATH

Running this query in SSMS will produce a single row of XML in the Results pane. Clicking this row will open up another tab showing the results as an XML file:



Next we can strip out all the XML elements from the results by removing all the column aliases and passing an empty string ('') to the PATH modifier. Then we can concatenate a comma to the sys.columns.name column to create a comma separated list of all the index column names in the database:

SELECT c.name + ', '
FROM sys.index_columns ic
JOIN sys.columns c
    ON c.object_id = ic.object_id
    AND c.column_id = ic.column_id
FOR XML PATH('')


This gives us all our detail data in a single row of comma separated values. Now we can parse this list into smaller lists and combine them with the indexes they belong to by embedding this query as a column sub-query in the SELECT list of the master query. By sorting the sub-query on the sys.index_columns.key_ordinal column we can make sure that the index key columns appear in the same order they appear in the index definition:

SELECT OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) 'object_name'
, i.name 'index_name'
, i.type_desc 'index_type'
, (SELECT c.name + ', '
    FROM sys.index_columns ic
    JOIN sys.columns c
        ON c.object_id = ic.object_id
        AND c.column_id = ic.column_id
    WHERE ic.object_id = i.object_id
        AND ic.index_id = i.index_id
    ORDER BY ic.key_ordinal
    FOR XML PATH('')) 'index_key_columns'
FROM sys.indexes i
WHERE i.type > 0 AND OBJECT_SCHEMA_NAME(i.object_id) != 'sys'
ORDER BY 'object_name', i.index_id


The only thing that remains now is to strip off the trailing comma in each comma separated list. To do this we concatenate the comma before the index column name and then use the SUBSTRING function to chop off the first two characters of each list to remove the leading comma and space:

SELECT OBJECT_SCHEMA_NAME(i.object_id) + '.' + OBJECT_NAME(i.object_id) 'object_name'
, i.name 'index_name'
, i.type_desc 'index_type'
, SUBSTRING((SELECT ', ' + c.name 
    FROM sys.index_columns ic
    JOIN sys.columns c
        ON c.object_id = ic.object_id
        AND c.column_id = ic.column_id
    WHERE ic.object_id = i.object_id
        AND ic.index_id = i.index_id
    ORDER BY ic.key_ordinal
    FOR XML PATH('')
    ), 3, 2048) 'index_key_columns'
FROM sys.indexes i
WHERE i.type > 0 AND OBJECT_SCHEMA_NAME(i.object_id) != 'sys'
ORDER BY 'object_name', i.index_id


And there you have it - a simple query that combines master and detail records with the detail rows flattened into a comma separated list. Click here to download an SQL file containing all the examples used in this post.