The Result Data Newsletter   
Volume 708 - August 2007   
© Copyright 2007 Result Data Consulting, Ltd.  614-505-0770  www.resultdata.com   

    Result Data Home Page  |  Newsletter Archive  |  Upcoming Events  |  Classes & Workshops  |  Request Information
Upcoming Events:  Microsoft Data Mgt & BI Lunch-n-Learn: 9/21, Microsoft SQL Server User Group: 9/21, Mid Ohio BusinessObjects User Group: 11/14

Return to Newsletter Contents...

 

Indexes

by: Charles Tournear; Senior Consultant, MCT, MCSE, MCSD, MCDBA, CRCP

An index is a balanced tree structure that aids in the retrieval of data from data pages in a table.  Understanding how indexes function in SQL Server can help in deciding which indexes to create and to make sure they are properly maintained.

The purpose of creating indexes is to improve the performance of Queries.  Many people see transactional systems with a lot of Inserts, Updates, and Deletes as not needing any indexes, but in order to update a record or delete a record it has to be found first, which requires a query.  So indexes can improve the performance of both transaction systems and analytical systems.

Indexes are composed of a key which is one or more columns used to create the balanced tree in searching for a record.  SQL Server 2005 allows for indexes to include a key used to create the balanced tree as well as other columns used for faster retrieval.  How?  By having indexes cover a query.  An index covers a query when all columns in the query can be retrieved by one or more indexes.  By including additional columns in the index other than what is being searched for, this improves the chance of an index covering a query.  Indexes generally are smaller than the original table they are based on and can thus be searched faster.  Even without a WHERE clause, an index that covers a query might be used to speed up the retrieval of the data, by having fewer pages to scan.

The basic statement for creating an index is:

CREATE INDEX indexname ON table(column1, column2, …)

With SQL Server 2005 you can use the INCLUDE key word to add additional columns other than the key.

CREATE INDEX indexname ON table(column1, column2, …) INCLUDE (column3, column4,…)

There are two types of indexes in SQL Server:  Clustered and Non-clustered.  Cluster indexes physically sort the data pages in the table itself.  Because of this there can only be one clustered index.  Non-clustered indexes create new table structures that contain pointers to the rows in the original table.  Clustered indexes are the default index type when creating Primary Keys using the graphical tools.  Non-Clustered indexes are the default for the Create Index statement unless otherwise specified.  For example CREATE CLUSTERED INDEX indexname ON table(column).

The query optimizer in SQL Server uses statistics that are associated with an index to determine whether an index will be useful in processing the query.  Statistics are the distribution of data that matches the key.  For example: for an index on last name, the statistics would hold how many names begin with A, how many with B and so on.  If 90 percent of the data were under R in the distribution table and the data you were searching for began with R then the index would not be very useful, because there would be a large number of pages to search through, though for any other letter it would be considered useful, because there would be few pages to search through.  When a compound index (multiple columns) is created statistics are created for each column independently and for all combinations of the columns.

If the statistics are out of date, then it’s possible that an index will be ignored when it could be useful, or be used when it would actually slow the query down.  Keeping statistics up to date therefore is very important.  Statistics are automatically recreated when an index is created or rebuilt.

Many people, when designing a database, created indexes on the tables before any data had been added.  They assumed because Auto Create Statistics and Auto Update Statistics were turned on under the Database Options properties that everything was set to maintain statistics.

Pic 1

The Auto Create Statistics and the Auto Update Statistics only update when more than 10% of the data has changed from the last time update occurred.  If indexes are created on tables before data is added then update statistics will never run.  The original amount of data to begin with was 0 since there is no last date that update statistics occurred.  Adding 10000 rows of new data, then makes it 0 divided by 10000 or 0% change so update statistics won’t run.  And since there is no last date the original amount of 0 will continue to be used.

So if you create indexes on tables before adding data you need to manually run an update statistics statement to initialize the process.  For analytical systems you may want to rebuild indexes or update statistics depending on the size of the tables after each import of data.

In the example below the database consistency checker function DBCC SHOW_STATISTICS is used to get information about the statistics for a specific index.  Note the updated column showing the last time the statistics were updated for this index.  The second statement is an example how the statistics for all indexes on the Person.address table could be rebuilt.

Show statistics

An additional item that affects whether an index will be used is the query itself.  You should avoid using items that will prevent the optimizer from being able to select an index.  Avoid changes to datatypes.  For example:  WHERE convert(varchar(20), birthdate, 101) = @datevariable.  Formulas around a column will prevent an index being used on that column; consider changing the type of the @datevariable to a datetime data type instead of converting the column.  Any formula using a column other than aggregate functions will prevent indexes being selected on that column.  For example, the common use of LTRIM(RTRIM(lastname)) will prevent an index on last name from being used.  If you simply cannot avoid using formulas on a column and that formula is used a lot, consider creating a calculated column and creating the index on the calculated column.

In summary, indexes can be very useful for finding records through queries as long as they are properly maintained.  Be sure to periodically rebuild the indexes and update the statistics so that the query optimizer can make an efficient decision about how to run your queries.

Go to Top  |  Return to Newsletter Contents

 

The Result Data Newsletter is published approximately once a month to share the latest information on business intelligence, data management and CRM. There should be a link below to allow you to change or remove yourself from our list. We take your requests very seriously. If you have any difficulty please contact us at 614-505-0770 and we will make sure that your request is handled properly. This is not intended to be an unsolicited message and you can reach us in person if needed.

© Copyright 2007 Result Data Consulting, Ltd. - All Rights Reserved
All trademarks and copyrights are the property of their respective owners. This information is provided without warranty.
Announcements
Summer Training Special
Beat high gas prices with Result Data Training.  Schedule and attend any public training class now through September and receive a FREE fuel card for up to $200 (restrictions apply).  Call 614-505-0770 for further details and restrictions.
Next Microsoft Data Management & BI Lunch-n-Learn
The next free Microsoft BI seminar is on Sept. 21, 2007 and will be a free lunch-n-learn.  Call 614-505-0770 or click here to reserve your seat.
Next Microsoft SQL Server User Group Meeting
The next Microsoft SQL Server User Group meeting is on Sept. 21, 2007.  Call 614-505-0770 or click here to reserve your seat.
BI Designed and Priced for Mid-Sized Companies
Join us on Sept. 26 for a special presentation just for mid-sized businesses and find out how companies like your are improving their performance using business intelligence (BI).  Call 614-505-0770 or click here to reserve your seat.
New MS-Reporting Services Workshop
Check out our new SQL Server Reporting Services 1 day workshop. Get a leg up on SSRS and learn how create and deploy reports. Our first workshop is on Sept. 28 and will sell out quickly.  Call 614-505-0770 or click here to reserve your seat.
Next MOBOUG Meeting
The next Mid-Ohio Business Objects User Group (MOBOUG) meeting is November 14, 2007.  Call 614-505-0770 or click here to reserve your seat.
Looking for a Few Good Men and Women
Join our award winning team of Business Intelligence consultants and .Net Software developers.
Send your resume and salary requirements to:
jobs@resultdata.com