Anatomy of a Select Statement Part 3

Welcome to part 3 of this series.  The first part covered the order of execution of the commands, while the second session went in depth about the SELECT command. This post will cover the FROM clause.  Most people would think that you will list tables in the FROM clause.  Yes, that is correct, however you can do so much more.

What can be listed in the FROM

This first step is to list what can you use in the FROM clause. The list includes the below list.

  • Table
  • View
  • Temporary Table
  • Table Variable
  • Derived table
  • Table valued functions

We will cover each of these.

Table and View

Tables and views are probably the most common objects used in the FROM, although I really don’t have any exact statistics on that. Below is an example is a simple query.

SELECT TOP (1000) [ProductID]
,[Name]
,[ProductNumber]
,[MakeFlag]
,[FinishedGoodsFlag]
,[Color]
,[SafetyStockLevel]
,[ReorderPoint]
,[StandardCost]
,[ListPrice]
FROM [AdventureWorks2014].[Production].[Product]

In the FROM clause you will see the table or view.  Notice that not only is the table name listed, but the database and schema are also specified.  Those two are not require, but I might not be a bad idea to start listing the schema as well as the table name.

By looking at this statement, we can only speculate that the object listed is a table.  If you use a view instead, they will look that same.  Really the only way to tell if the object is a table or view is to simply know what type of object it is.  You can look in the objects explorer in SSMS and look for the object.  Sometimes using the naming convention can also be used to make this determination.

Temporary Table

Temporary tables are created in the TempDB system database.  A temporary table can be identified by the start of the table name.  The name will start with either # or ##.

Phil Factor on Red-Gate’s web site gives a great description of when we should use temporary tables. Here is the link: RedGate

They are used most often to provide workspace for the intermediate results when processing data within a batch or procedure.

When a temporary table starts with # it is a “local” table.  This means that the only process that can see and use the temporary table is the one that created it. While global temporary tables will start with ##, they are not as secure because they can be used by other processes.

Since local temporary tables are used only by the process that created it, it will automatically be dropped when the connect that created it ends.  Global temporary tables could persist a bit longer because it will remain until all processe that are using it are terminated.  This post isn’t an in depth discuss about temporary tables, you can to here to get more information: TempTables.

Prior to using a temporary table, it must be created.  You can use the CREATE TABLE statement or use SELECT…INTO.

–Create the table
CREATE TABLE #test
(ProductName VARCHAR(50)
, Price MONEY)

–Populate the table
INSERT #test
SELECT Name
, ListPrice
FROM Production.Product

–Use the table
SELECT *
FROM #test

In the code above you will see three distinct sections.  The first is to create the temporary table, populating the table is the second and the third and last step is the use the table.

If you attempt to query the table and it is out of scope you will get the following error.

You receive this error because with a local temporary table, it is only available to the session that created it.  That is not the case with a global temporary table, one that starts with ## rather than # for a local table.

However if you change from a local to a global temporary table, you will not receive the above error.  You will actually get the results you are hoping for. If you look at the image below, you will see that the table was create with SPID 51 and used with SPID 60.  Also note that even though it is a valid table, SSMS is still indicating that it is an invalid object in SPID 60.  Of course that is not the case, it works just fine.

As you can see, temporary tables are used in pretty much the same way as user tables in a database. You just need to be aware of the scope of the table, is it local or global.

Table Variable

A table variable follows the same steps for creation and use.  Instead of using the CREATE TABLE statement, when using a table variable, you use the DECLARE statement to declare the variable as a TABLE datatype.

The scope of a table variable is the batch.  If you create a temporary table you can use it for the duration of the session.  While the duration of a table variable is the batch itself.

There are some differences on how SQL Server processes temporary tables and table variables.  That is not the purpose of this post.  You can go here to find more information regarding the differences.

Derived table

When I teach the Microsoft SQL Server certification course on TSQL, a derived table is in the same chapter as subqueries. A derived table is really not much more than a SELECT statement in the FROM clause instead of a table name.  As you can see from the code below, the derived table is highlighted.  The derived table has an alias of “v” and is enclosed in parenthesis.

Table Valued Function

Like temporary tables and table variables, before using a table valued function(TVF) the function must be created.

The code below shows how to create the function.  This function returns a table. This is from Microsoft’s web site. This will work on the AdventureWorks2014 database.

CREATE FUNCTION ProductsCostingMoreThan(@cost money)
RETURNS TABLE 
AS 
RETURN 
SELECT ProductID, ListPrice
FROM Production.Product
WHERE ListPrice > @cost

Once the function is created you can see in SSMS under Programmability.

One of the advantages of a TVF is that they can accept parameters.  In the code above, the parameter is for the list price. In the code below you will that the TVF is used just a user table might be.  Also notice the parameter being passed into the function.

In this case, we are looking for any products that cost more than $80.50 and then joining it on the Product table.

Each of these objects has performance advantages and disadvantages.  That is not the purpose of this post, however it is something that must be considered when determining the appropriate solution.

In the Part 4 on this topic, I will cover JOINS.

Thank for visiting my blog.  I hope you learned something.

 

 

XEvent Profiler

XEvent Profiler is not a SQL Server feature, it is a feature of SQL Server Management Studio(SSMS). It was first introduced in SSMS v17.3 and the XEvent Profiler sits on top of Extended Events technology.  Per Microsoft, XEvents “enables quick access to a live streaming view of diagnostics events on the SQL server”.

You can find XEvent Profiler at the bottom of the server tree in SSMS, just under SQL Server Agent. There are two sessions that come with XEvents, Standard and TSQL.

 

Starting and Stopping the Sessions

When you first Launch the session and the Extended Event sessions does not exist, SQL Server will create the session, however it not be started.  To start it you need to right click on the session in XEvent Profiler and click “Launch Session”.  If the Extended Events session already exists, SQL Server will start the session.

When you are done doing what you need to do, simple follow the same steps and click “Stop Session”.

Standard Session

When you enable this session it creates an Extended Events session named QuickSessionStandard.  This sessions does not have a target defined and has a Maximum Dispatch Latency of 5 seconds.  If you create an Extended Events session the default Maximum Dispatch Latency is 30 seconds, significantly higher than the Standard session. This session collects a lot of performance information.  However, it also collects login and logout information. Below you will find the list of events that are being collected.

One thing you might notice about the above image is that there are number Predicates defined.  These are all the same, they exclude system processes.

When you launch the session, SSMS will allow you to watch live events.  The screenshot below is what you will see when viewing live events.  Among the information collected, you will find Database Name, Batch SQL, Client Application, User account and the query_hash.  This view can be modified to meet your needs. The changes to the view will remain when you close the session.

TSQL Session

The second session available is the TSQL session.  The Extended Event session is named QuickSessionTSQL.  As the name implies, this session collects information about the execution of TSQL statements. The events are below.  You notice a few things about these events, like the QuickSessionStandard session, the TSQL session does not have a designated target.

  1. Many are also in the QuickSessionStandard session including:
    • Existing_connection
    • Login
    • Logout
    • sql_batch_starting
  2. There are fewer events than in the QuickSessionStandard session
  3. The QuickSessionStandard session has the sql_batch_completed while this does not

Like the QuickSessionStandard session, the events in this session also have a few predicates applied. All of which are to exclude system sessions from the live data.

Viewing Session Data

When you launch either of these sessions, SSMS will also launch the screen that will allow you to view live data, see below.  Something to keep in mind, when you close this window, it will stop the session.

Modifying the Sessions

Although you can modify the view, it may not be the best idea to modify the definition of these sessions. If you find a need to modify these sessions, it might be better to either modify another user defined Extended Event session or create a new session.

Both of these sessions are intended to provide a quick way to collect and view some basic performance information about your SQL Server.  For those of you that like to use Profiler, this could be a good alternative, even though it is not as powerful or very flexible.  Since neither of these sessions has a defined target, the data is not stored.  Once you disable the session the data is gone.  If you really need this data for review at a later day, you can export it.  The export menu can be found under the Extended Events Menu item.

Thank for stopping by!!!

Additional Information

Microsoft

MSSQLTIPS

SQL Server Statistics Information – How do I Get It

 

When doing maintenance on a SQL Server there are number of things that must be completed.  These include backups, index optimization, DBCC Checkdb and update statistics.  This post really isn’t about how to update statistics, it is more about how do I gather information about statistics.  This information is important when troubleshooting a performance issue as well as a way to check to see if your statistics update strategy is effective in updating the statistics properly.  Remember, if statistics are not updated properly or not created at all, this could lead to performance issues.

There are a few ways to gather statistics information, but before getting into those, let’s go over that information about statistics is important.  This is not all the items, but they are the key ones.

Important Statistics Information

  1. What statistics exist
  2. Date they were last updated
  3. Is there a filter?
  4. Number of rows and the number of rows used for the sample
  5. How many modifications since the last statistics update

These are all important in one way or another.  Here is why each of these is important to look at.

What Statistics Exist – This is important to know because sometimes what statistics we think exist may not actually exist.  If the “Auto Create Statistics” database setting is turned off, then of course statistics would not be created other than manually or by creating an index.

Date They were last updates – This will help us understand if the statistics are being updated appropriately and as expected.

Is there a filter? – This will help us understand if the statistics are for the all the rows of the table or just a subset.

Number of Rows and the number of Rows used for the Sample – We need to review this to see if SQL Server is using an appropriate number of rows to update the statistics.

Number of Modifications since the last Statistics update – Understanding how frequently the column is modified will help us determine if the statistics update is happening appropriately.  It is important to keep this in perspective.  You can’t just look at this number, it is also important to look at the date last updated and the number of rows.  If there are a high number of modifications but the statistics were last updated six months ago, there may not be an issue.  Although I might want to look to see why the statistics haven’t been updated for 6 months.

How do I get this information

There are a couple of different ways to get this information.

SQL Server Management Studio

You can simply right click on the statistics object you want to review and click Properties. This will open up the properties window.

On the first screen you will see this.  You will be able to identify the date of the last update. It is highlighted below.

The number of rows and rows used in the sample can be found in the “Details” tab, while the filter information can be found by clicking the “Filter” tab on the left.

What cannot be found by using the Properties window is the number of modifications since the last statistics update.  This is kind of limiting.

There are two ways to get all the information above.  One is to use DBCC SHOW_STATISTICS the other is to query the SYS.STATS table.

DBCC SHOW_STATISTICS

This statement will return all the above information in three data sets, the header, the density vector and the histogram.  The syntax is below.  We are not going to get into all the syntax, just the basics.  If you want more information you can go here.

This is the basic syntax.  The statement expects and table or view name and a statistics object name.  The output is below.

DBCC SHOW_STATISTICS(‘person.person’, [IX_Person_LastName_FirstName_MiddleName])

The three sessions, Header, Density Vector and Histogram are identified by the arrows. While the green box labels that date last updated, red is the number of rows and rows used as a sample. The final box is the blue box on the left, this identifies if there is a filer or not.  Notice that to the left of that there is a column named “Filter Expression”.  This will identify the expression used for the filter if one exists.

You can only return one of the sections by run one of the code snippets below.

DBCC SHOW_STATISTICS(‘production.product’,ReorderPoint)WITH STAT_HEADER

DBCC SHOW_STATISTICS(‘production.product’,ReorderPoint)WITH DENSITY_VECTOR

DBCC SHOW_STATISTICS(‘person.person’,[IX_Person_LastName_FirstName_MiddleName]) WITH HISTOGRAM

SYS.Stats

This is from Microsoft: “Contains a row for each statistics object that exists for the tables, indexes, and indexed views in the database“. In reality this really needs to be used with a few other statistics related objects.  These related objects are sys.stats_columns and sys.dm_db_stats_properties. Below is the code that can be used to gather all the information and then some.  This code can be found here.

I just took the code that was provided by Microsoft and added the sysobjects table to allow for the data set to include the column name.

Notice that the dm_db_stats_properties obect is a function that has two parameters, objectID and StatsID.

Below are the results of the above statement.  Notice that it includes all the items mentioned at the beginning of this post.

 

All the above methods return much of the information to review statistics.  What I like about the last option, I can see all the statistics in one data set.

Here are a couple of good links to learn more about SQL Server Statistics.

Microsoft

Kendra Little

dm_db_stats_properties

Thanks for visiting and I hope you learned something!

Is it Live or Is It Memorex?

If you are my age, that is a phrase you will remember.  If not, it is a slogan for Memorex from quite a few years ago for cassette tapes….here is a link to a YouTube video.

This post really isn’t about that…it is about live query execution plans that were introduced with SSMS 2016.  Basically live query execution plans allows you to view the plan live as it is executing.

Before getting into the details, let’s take a look how to turn it on.  Before starting your query, Live Query Statistics must be turned on.  There are two ways this can be done.  The first one is under the Query menu and the second method is on the tool bar.  The button is the one the arrow is pointing to.

Once this is turned on, you can now run your query.  When you do, a new tab will appear in the results pain, which will be labeled “Live Query Statistics”.

This is actually pretty cool to see.  The dotted lines represent the parts of the plan that are either currently executing or haven’t executed yet.  In addition to seeing the lines move, you will also see the number of rows increment and the number of seconds go up as well.  The time is circled below.

Regarding the number of rows, you will actually see two numbers, actual number of rows and the estimated number of rows.  You will see something like this:

As indicated, the green arrow is the actual number of rows and the brown arrow is the estimated number of rows.  If you move away from the operator and the pop up menu disappears, when you return the actual number of rows will actually increment as more rows are returned.

If you look at the properties of the operator, you will see the Actual Number of Rows and the Elapsed Time properties increment as the query executes.

You can also see Live Execution Plans by using Activity Monitor.  Once you have opened Activity Monitor, navigate to the Active Expensive Queries section and right click on the query.  You will see a pop menu that will have the “Show Live Execution Plan”.

At the bottom left of the new tab, you will see the progress.  In the case below, the progress is at 18% complete.

Brent Ozar has a few really good posts that go more in depth about Live Query Execution Plans.  I figured why reinvent the wheel.  Here are the links to his posts.

BrentO Post 1

BrentO Post 2

I would recommend that you try this if you haven’t yet.

Here is some code that you can use.

USE [AdventureWorks2014]

SELECT *
FROM Production.TransactionHistory th
INNER JOIN Production.TransactionHistoryArchive tha
ON th.Quantity = tha.Quantity

Thanks for visiting my blog…hope you enjoyed the post!

How can I tell if Statistics are created by the system

Column statistics are used by the optimizer to help create the best possible execution plan for a query.  There are a few ways these statistics get created.  The first method is when an index is created, which are updated when the index is optimized. The second method of creating statistics is by using the CREATE STATISTICS statement.  This statement allows the DBA to create statistics on a column that doesn’t have an index on it and allows the creator to provide a name of the statistics object.  The last method is to let SQL Server create the statistics when the column is read.

Using the AdventureWorks2014 database, if you run the code below you will see that there are not any statistics on the MakeFlag column in the Production.Project table.

SELECT OBJECT_NAME(object_id) AS [ObjectName]
,[name] AS [StatisticName]
,STATS_DATE([object_id], [stats_id]) AS [StatisticUpdateDate]
FROM sys.stats
WHERE OBJECT_NAME(object_id) = ‘Product’

I get the result set below.  But in looking at it, how can I tell that there really isn’t statistics on the MakeFlag column?

You can run the statement below that will list all the columns that have statistics.  This statement can be found at here on Microsoft’s web site.

SELECT s.name AS statistics_name
,c.name AS column_name
,sc.stats_column_id
FROM sys.stats AS s
INNER JOIN sys.stats_columns AS sc
ON s.object_id = sc.object_id AND s.stats_id = sc.stats_id
INNER JOIN sys.columns AS c
ON sc.object_id = c.object_id AND c.column_id = sc.column_id
WHERE s.object_id = OBJECT_ID(‘Production.product’);

Now that I have been able to check the columns, what are the objects that have the funny names?  Those are system generated statistics.  Let’s break down the name.

Using this name as an example: _WA_Sys_00000007_75A278F5

The WA really just stands for Washington…of course we know where that came from

The sys just means system

00000007 means it is the 7th column in the table. in this case the SafetyStockLevel column.

Below is a list of the column in the Production.Product table.  Notice that the SafetyStockLevel is the seventh column.

But what about this statistics object, _WA_Sys_0000000A_75A278F5.  Where is column A?  The means it is the 10th column as indicated below.  The A means that it is a hexadecimal value.

That takes care of the first three parts of the name.  The last item, 75A278F5 is a hexadecimal number of the object ID.  The image below you can see that the object ID is the same as 75A278F5 converted to a decimal number.


Now that we have broken down the name, let’s talk about how these actually get created.

If you run the statement below, it will create the statistics object on the MakeFlag column. Before showing you the results, let’s predict the name.

SELECT Name
, MakeFlag
FROM [Production].[Product]
WHERE Makeflag = 2

We know it will start with _WA_SYS_.  Since MakeFlag is the 4th column, we know the next part of the name is going to be 00000004.  This just leaves the last part, the hexadecimal version of the object id.  Using a converter at https://www.rapidtables.com/convert/number/decimal-to-hex.html.  we can take the object ID and convert it to 75A278F5.

Put all the parts together and you get _WA_Sys_00000004_75A278F5 as confirmed by the screenshot below.

After all that, the easiest way to find the system generated statistics is the run this statement:

SELECT *
FROM sys.stats

This statement will return a column named, auto_created.  This is another method of finding the system generated statistics.

Hope this helps clear up the names of system generate statistics.  Thanks for stopping by and hope you learned a little something.

Anatomy of a Select Statement Part 2

The Select statement is one of 4 Data Manipulation Language(DML), the others being INSERT, UPDATE and DELETE. As expected the SELECT statement is used to extract data from a relational database such as SQL Server.  This statement works in most database systems.  Although, there may be some differences in the different products. These differences are not the focus of this post, we will concentrate on the use of the SELECT statement in Microsoft SQL Server.  Over the years the SELECT statement has not changed much as versions of SQL Server are released, making this one of the statements that will work in older versions as well as the new ones. This is part 2 of the series.

The SELECT clause is just one of a few clauses in the SELECT statement.  The others are FROM, WHERE, GROUP BY, HAVING and ORDER BY.  In part 1 I discussed the order these clauses are processed.  As a reminder, the SELECT clause is either last or second to the last in the processing order.  If there is an ORDER BY, the SELECT is processed just before the ORDER BY.  If there is not an ORDER BY, then the SELECT is the last clause processed.  The focus of this post is just the SELECT clause.

The SELECT clause can easily be described as the place where you identify what columns you want to be included in the result set.  These columns can be columns from the table or view, an expression or the results of a scalar function.  In addition to what columns you want included in your result set, the SELECT clause is also where the order in which those columns appear in the result set is determined.

When deciding what columns from the tables to be included in the SELECT clause, it is important to only return the columns that are needed.  Avoid using “SELECT * FROM….”.  Sometimes it is simply easier to use SELECT *, however there are consequences to using it, mostly the potential impact on the performance of the query.   When listing the columns they need to separated by a comma.  Below you will find an example.  The “[” and “]” are used when the column name contains a space.  You will also see them if you right click on a table and click “Select Top x rows” in SSMS.

In addition to actual column names you can also use calculated columns in the SELECT clause. However, when this is done the column name in the result set will be (No Column Name) unless a column alias is used.

Result Set without a column alias:

The example below show how a column alias can be used.  The “AS” keyword isn’t required.  The single quotes surrounding “New Price” are also not required as long as the alias name does not include spaces.  Personally I also like to include them regardless if there is a space or not.  There really aren’t a lot of requirements for column aliases although I would try to avoid using keyword and actually column names.

Scalar Functions

Scalar functions can also be used in the SELECT clause.  These functions can be system supplies functions or user defined functions. Really the only requirement is that it must be a scalar function, not a table valued function. Just a reminder, a Scalar function is a function that returns a single value.  So when one is included in a SELECT statement, it will run the same number of times as there are rows returned.

Here are some examples of a few native SQL Server built-in functions. More information can be found here: More Info on Functions

LEFT RIGHT LEN
UPPER LOWER REVERSE
TRIM GETDATE DATEDIFF
UPPER LOWER TRY_CONVERT

When using these functions, many times the arguments will be populated with column names.  Although that is in no way a requirement.

Below you will find an example of using two Built-In functions. One, UPPER, requires an argument to be used while the second, GETDATE(), does not.  However, both would need a column alias.  This would be the same if using User Defined Functions(UDF).

Example using a UDF.  In this case it is the ufnGetSalesOrderStatusText UDF in the AdventureWorks database.  It accepts TINYINT value as an input parameter and will return the text description of the status.

If you run the above query without the TOP clause, you will see that this query returns over 31,000 records.  This means that the UDF will execute over 31,000 times. This is important to know in case there are performance issues.

TOP

We have discuss adding columns, column aliases as well as the use of scalar functions in the SELECT clause. Now lets talk about the TOP key word. This appears immediately after the SELECT key word.  This is used to limit the number of rows to a specific number, TOP(10) or percent of rows, TOP 10 PERCENT.  The TOP key word is usually used in conjunction with the ORDER BY clause.  If no ORDER BY clause is included, then SQL Server will per Microsoft,  “it returns the first N number of rows in an undefined order”.

In the example below, only 10 rows are return based on the oldest 10 records when sorted by the OrderData.  Remember, the default sort order is ascending.  Notice that there are not parenthesis around the number 10.  Again, not required at this point, but will be in the future.

Notice above that the OrderDate is the same.  That is because it only returned the top 10 records.  Well what if there are more than 10?  This is where WITH TIES comes in.  If you look at the example below you will see that there are 11 rows.  This is because the 11th row match the value of the 10th row.  WITH TIES will return if there are additional matches with the value in the last row.

$ROWGUID and $IDENTITY

These two functions are used with tables that have an Identify column or a column using the ROWGUIDCOL property.  With both $ROWGUID and $IDENTITY you actually don’t need to know the name of the columns.  If you attempt to use either of these on a table that does not have the proper column types will result in an invalid column error.

Notice in the example below, the first column and the last column in the result set are the same.  The $ROWGUID column, the first column, will also assume the column name of the source column as well. $Identify works in the same way.

DISTINCT

The last topic in the post is the use of the key word DISTINCT.  DISTINCT is used to eliminate duplicates in the result set.  Per Microsoft, NULLS are considered equal for the DISTINCT keyword.

If you look below you will see that the number of rows returned is 316.  If you add the keyword DISTINCT the number of rows returned drops to 89.  With this query, all duplicates based on Gender and Rate are removed.


During the post we have discussed many of the items that can be included in the SELECT clause. This includes column aliases, user defined functions, built-in functions, TOP and DISTINCTS keywords as well as a few other topics.

Hopefully you leaned something and thank you for visiting my blog. In the next post I will dissect the FROM clause.

Previous parts of the Series

Part 1 – Order of the Clauses in the SELECT statement

 

 

 

 

Where did it go???

Have you have ever looked for something and expected it to be there…and it wasn’t? If so, you will understand my day today. I have a SQL Server Agent job that uses VBScript in one of it’s job steps and we tried to move it to a SQL Server 2016 server.  It unexpectedly failed because the ActiveX Script job step type is no longer an option.

Below you will find a screenshot from a pre-2016 server.  As you can see, the job step type is there.

Now here is a screenshot of the job step properties from a SQL Server 2016 server.  Notice, NO script job step type.  However, you will see a PowerShell job step type that we can use instead.

Here is the link to Microsoft that describes other features that were removed from SQL Server 2016.

https://docs.microsoft.com/en-us/sql/database-engine/discontinued-database-engine-functionality-in-sql-server-2016?view=sql-server-2017

You can query the sysjobsteps table to find the jobs that have these types of job steps.

SELECT     j.name
, s.step_id
, s.step_name
, s.subsystem
FROM dbo.sysjobsteps s
         INNER JOIN sysjobs j
                 ON s.job_id = j.job_id

I have been telling my students for a number of years that as DBAs we will need to learn PowerShell someday. That day actually arrived a few years ago.  This is a great example of why we should know PowerShell.

I really try to keep up on the features that are removed from versions of SQL Server, however this one slipped through the cracks and I missed it.  I just really never expected it to be removed, although I do understand why MS did remove it.

 

Max Degree of Parallelism vs Cost Threshold for Parallelism

The title sounds like something that you would see for a boxing match.  However, in reality it is a misleading title.  The Max Degree of Parallelism and the Cost Threshold for Parallelism SQL Server settings actually work more together than they do against each other.

These two settings actually define the how many and the when in regards to parallel execution plans.  The Max Degree of Parallelism(MDop) simply defines the number of processors\cores that SQL Server will use when the optimizer determines parallelism is needed.  The Cost Threshold for Parallelism is cost threshold of when SQL Server will use parallelism.  The cost is the overall cost the optimizer determines for each query and SQL Server will use parallelism if the cost is above the threshold value.

The recommended settings for MDop is the number of cores not to exceed 8.  However, when setting this, it is important to continue to monitor the system to see if the change has caused an improvement. The default value is 0.

The recommended setting for the Cost Threshold for Parallelism is 25 to 50 and has a default value of 5.  Yes, the default setting is too low.  By changing this, SQL Server will reduce the number of smaller queries that may use parallelism.  Just like the MDop setting, it is important to monitor the server after making a change to see if an improvement is make.  There have been many times where simply changing this setting from the default, I have seen the CPU utilization drop from close to 100% to less than 10%.  This in no way means you will see the same improvement, just what I have seen.

Neither one of these settings require a reboot of the service when changing.  It is also important to keep in mind that if the MDop is set to 1, SQL Server will ignore the Cost Threshold for Parallelism setting.

Kendra Little has a very nice video that explains this more in depth.

MAXDOP of Confusion (Dear SQL DBA Episode 8)

 

This snippet of code can be used to query server configuration settings

SELECT CONVERT(VARCHAR(60), name) AS ‘Name’
, value
, description
FROM sys.configurations

Thanks for reading!!!!!

 

Anatomy of a SELECT Statement – Part 1

Well here it is, my first blog post.  In writing this I realized I can talk for hours on a topic and struggle when writing a blog post.  Writing blog posts is much harder than I had thought.  My plan is to start with something simple and work my way into more complex topics.  Luckily there are many great blogs out there in the SQL Server community that I can use as a model of how blogs should be done.

The Select statement is one of 4 Data Manipulation Language(DML), the others being INSERT, UPDATE and DELETE. As expected the SELECT statement is used to extract data from a relational database such as SQL Server.  This statement works in most database systems, although there may be some differences in different products. These differences are not the focus of this or future posts, we will concentrate on the use of the SELECT statement in Microsoft SQL Server.  Over the years the SELECT statement has not changed much as versions of SQL Server are released, making this one of the statements that will work in older versions as well as the new ones.  This is the first of several that will dissect the SELECT statement and provide information on each.  I will start with the order of the commands in a SELECT statement.

Order of the Commands in the SELECT Statement

The SELECT statement consists of multiple parts. Below you will see the order in which the statements must appear.

If they are not in the proper order, an error will be raised and the query will not be allow to process properly.

Below you will find the error you will see if the statements are not in the proper order:

Msg 156, Level 15, State 1, Line 9

Incorrect syntax near the keyword ‘FROM’.

Understanding this order is vital to understand why column aliases cannot be referenced in any statements besides the ORDER BY.  Just a quick reminder, column level aliases can be used on all columns, however calculated columns and columns that use functions will not have a column name unless an alias is used.

If you run the statement below you will receive this error:

Msg 207, Level 16, State 1, Line 2

Invalid column name ‘NewPrice’.

SELECT TOP 20 [ProductID]
,[Name]
,[ProductNumber]
,[MakeFlag]
,[FinishedGoodsFlag]
,[Color]
,ListPrice
,ListPrice * 2 AS ‘NewPrice’
FROM [AdventureWorks2014].[Production].[Product]
WHERE NewPrice > 100

The reason the error happens is because when the WHERE statement runs, the alias doesn’t exist yet.  With the above statement, the SELECT statement will actually run after the WHERE, therefore causing an error.

Well, how do I make it work you may ask?  The small snippet below shows how.  Rather than reference the alias in the WHERE clause, you will need to repeat the formula.  In this case the formula is the ListPrice Column times 2.

WHERE ListPrice * 2 > 100

However, you can reference the alias in the ORDER BY clause because it processes after the SELECT clause and the alias exists when the ORDER BY processes.

Hopefully you have made it this far!!!  As you can see in some situations, understanding the processing order of the commands in a SELECT statement is important.  I will cover the SELECT command in my next post.

Thanks for visiting my blog!!!