SQL Server supports to include non-key column as part of the non-clustered index. Having an index with the included column improves the query performance in several scenarios. This blog covers the benefits of including columns in a non-clustered index and shows how the included non-key column is searchable.
Let us consider a table “memberinfo” – This table has a clustered index based on member_no and one non-clustered index based on last name and first name columns.
Execute the below code to create the table, clustered index, and non-clustered index.
CREATE TABLE [dbo].[memberinfo]( [member_no] [int] IDENTITY(1,1) NOT NULL, [lastname] [varchar](15) NOT NULL, [firstname] [varchar](15) NOT NULL, [region_no] [int] NOT NULL ) create clustered index ix_member_no on dbo.memberinfo(member_no) drop index if exists memberinfo.nc_memberinfo_inc_ln_fn create nonclustered index nc_memberinfo_inc_ln_fn on dbo.memberinfo (lastname,firstname)
Query without included columns in a non-clustered index
The query optimizer chooses to scan the clustered index for the following query, it is decided not to use the non-clustered index because the field “region_no” is not part of the index. It would be more expensive to use the non clustered index in this scenario because only last name and member_no can be retrieved from the non-clustered index and then it has to perform a key lookup on the clustered index to fetch the region_no.
This query completed with 54 logical reads and as shown in the execution plan it scanned all the 10000 rows to fetch 385 rows.
set statistics io on select lastname, member_no, region_no from memberinfo where lastname ='anderson'
Query with an included column in a non-clustered index
Recreate the non-clustered index but now include the region_no column in index and execute the same select query.
drop index if exists memberinfo.nc_memberinfo_inc_ln_fn create nonclustered index nc_memberinfo_inc_ln_fn on memberinfo (lastname,firstname) include (region_no) set statistics io on select lastname, member_no, region_no from memberinfo where lastname ='anderson'
The query optimizer chooses to perform seek on the non-clustered index as shown in the execution plan. Because the region_no field is now part of the index, the query is considered to be a covering query. This means that the index includes all the columns referenced by the query and hence all the data retrieved from the non clustered index pages itself. This query completed with 4 logical reads and it is a significant improvement in the query performance compared to the former.
Included column in non-clustered index is searchable
Another important point to understand is that the included columns are also searchable, many think that they are not searchable. This can be explained with the same query. In the following query, the region_no is used in the WHERE clause to extract records for region_no 4. This query still utilizes the non-clustered index, it first identifies the data page containing ‘Anderson’ then it searches region _no 4 within the identified pages to filter the records. The first process referred as seek predicate and the later as a predicate.
This clearly shows that though the region_no column is not part of the index key the optimizer still used it to filter the records. This query also had 4 logical reads and the execution plan shows that it used only the non-clustered index to complete the query.
set statistics io on declare @r int = 4 select member_no, region_no from member where region_no = @r and lastname ='anderson'