TSQL Formatting – My Thoughts

During my SQL Server classes I am asked many times about how to format T-SQL.  If you do a Google search on How to Format T-SQL, you will get a ton of results.  There are many great posts on this topic. This post will identify some industry standards for formatting and my personal thoughts on formatting.

As you may know, formatting is very important.  If we have a production outage that is caused by a stored procedure, as the DBA you might be asked to review the code in the stored proc.  This is expected, however what might not be expected is poorly formatted code.  This poorly formatted code could lead to a longer outage because now you will need to take more time to read the code to figure out what it is doing.  This is true especially if you did not write the code or you did write the code but don’t remember it because it was a long time ago.

So here are my thoughts……

Capitalize all Keywords

While this in no way is a requirement, I believe that by doing so the code is cleaner and easier to read.  I also think the keywords jump out more when they are in caps.

select *
from production.Product
where ProductID > 350

With keywords in Caps:

SELECT *
FROM production.Product
WHERE ProductID > 350

Alias All Tables

I like to alias all tables, even if the query only uses only one table. The reason for using an alias even with one table is that if that query evolves into a query with more than one table, that initial table already has an alias and is set up for the addition of more tables.

When I use a table alias, I have two simple rules I follow.

          • It needs to be somewhat descriptive – The reason for this is straight forward.  I feel it makes it easier to determine which table all the columns are coming from.
          • All aliases should have the same number of characters – When I write T_SQL code, I find it easier to read if the dots between the alias  and the column name.  If the alias is the same length this is easier to do.

The code below has three table aliases, all different length.  To me it just seems busier and more difficult to read.

Alias All Columns

When there is a table alias, we should be using it for all columns.  When a column exists in both tables in the join, you will get error like below.

Msg 209, Level 16, State 1, Line 2
Ambiguous column name ‘ProductID’.

Using an alias on all columns, you can prevent this error.  You can also make is easier to figure out what table each column comes from.  Using the example below, it can be a bit challenging to figure out which table the ListPrice column comes from.  When looking at the tables involved in the query, it could logically come from two tables, Production.Product and Sales.SalesOrderDetail.  Because of the lack of an alias, this task becomes more difficult.

Have Vertical Lines

I like to vertically line up elements of the statements.  If you look below you will see an example of what my code will typically look like.

By doing this, for me it is easier to identify which SELECT, FROM and WHERE are part of the same statement.  As well as which ON goes with which JOIN.  I also feel that by doing this, the code is cleaner and easier to read.

Place Column Names Vertically

Over years I have had to review code that was written by others, and honestly sometimes myself, that placed the columns horizontally.  Similar to the first image below.  What I have found is that by placing the columns in this manner, it becomes more difficult to review the code.  Especially if there is a function or a CASE statement.

By doing this, it will be easier to add or remove columns from the result set.

Place the Comma before the Column Name

If you can place the comma before, of course you can also place it after the column name.  While some believe placing it after is the way to go, I have found that placing it before the column name works better for me.  While you may develop your own preference, I think the most important thing here is that you have a standard and be consistent following it.

Looking at the example below, you can see that by having the commas at the front it is a bit easier to comment out a column.  The only except would be the first column.  If you comment out the first column, there is still the comma at the front of the second line that will cause an error.

Steps in the FROM and WHERE

When working in the FROM or WHERE clauses I like to format the code in a way that resembles steps.  I like to line up the INNER or OUTER key words in the join, but on the ON keyword.

By doing this, I have found it easier to pair up the ON and the JOINs.

I also like to do something similar in the WHERE clause.  By placing the AND keywords on different lines and off setting them, again similar to stairs, I think it is easier to read what the criteria for the query is.

Derived Tables

Sub-queries are not usually at the top of my list of potential solutions, however derived tables I look at a bit differently.  Because of this, I have used them from time to time in production code.  Just like another code, they should be formatted in a way that allows it to be easily read.

Kathi Kellenberger defines a derived table in her post at Apress.com in this way:

“Derived tables are subqueries that are used in the FROM clause instead of named tables”

When writing the code for a derived table, I still try to follow all the same rules, especially since it is still a SELECT statement.

CASE Statement

When using a CASE statement, I like to line up the WHEN keywords as well as the CASE and END.  As seen below.  I feel this just makes it easier to see your options in the CASE.

Indent

I think indenting is very important for any well formatted T-SQL code.  This can easily be done by using the TAB key.  There is a setting in both SSMS and Azure Data Studio that will define the number of spaces the cursor will move each time you press TAB.  The default is 4.  I find this to be the ideal number.  I have seen code that has used 2 and for me, 4 makes it easier to see the indent.  Therefore, in my opinion the code is easier to read.

In SQL Server Management Studio, you can set the number of spaces in the Options, which can be found under that Tools menu.

While in Azure Data Studio, this same setting can be found in the Preferences, which can be found under the File menu.

UNION\UNION ALL

With these two statements, I would usually follow all the formatting guidelines for a SELECT statement.  I do like to place a line before and after the UNION or UNION ALL.  I feel this makes it easier to identify the two statements.

SELECT Name
             , City
             , State
             , ‘Customer’ AS ‘Source’
FROM Customers

UNION

SELECT Name
             , City
             , State
             , ‘Employees’
FROM Employees

Comments

Comments are critical to any well written code.  T-SQL must be self-documented. Comments is how this can be accomplished.

There are two methods you can use to comment your code.  The first is by using two dashes.  This will comment out any code that is to the right of the two dashes.  In the image below, there are two examples of comments using the dashes.

The second method is to use a set of characters, /* and */.  Any text between these will be commented out.  As you can see below.

When I really want a comment to pop out, I like to use the * to define a start and end of the comment.  I like to do this more so when the code or procedure is very long.  I think this is a great way to break a larger block of code into more readable sections.

/*****************************************************

Insert your comment code here.

*****************************************************/

If you are creating a stored procedure, it should include a “flower box”.  This is a part of the code that provides critical information about the stored procedure.

I like to flower box to include the following information

          • Procedure Name
          • Date created
          • Who created it
          • Purpose
          • List of Parameters
          • Sample code
          • Historical modifications

Below is an example:

Calling a Stored Procedure

Calling a stored procedure is usually a relatively simple piece of code to write.  Especially if there are not any parameters involved.

If no parameters are involved, this code will just be a single line.

EXEC StoredProcName

However, if there are parameters involved, you have a few options to consider.  Parameters can be called either by name or by position.  My preference here is to call by name.

When calling by position, this is how the code would look.  By looking at it, you can probably assume that the 34 is the customerID and the dates are the start and end dates for the range. I have found that assuming something gets me “unexpected results” sometimes, so I don’t like to assume.

EXEC GetSalesByCustAndDate 34, ’02/01/2020′, ’02/29/2020′

I find that calling the parameters by name works better for me.  I also like to place each parameter on a separate line with the @ lined up.  While this in no way a requirement, it just works for me.

EXEC GetSalesByCustAndDate
                 @CustID = 34,
                 @StartDate = ’02/01/2020′
                 @EndDate =  ’02/29/2020′

Tools to help format

PoorSQL Formatter – This is an extension for Azure Data Studio.  I really like this extension and it is very easy to use.  In order to utilize this, you will need Azure Data Studio and the extension is self.  When you use this extension, there are a number of settings in Azure Data Studio that you can define the rules for PoorSQL Formatter to follow. Here is a link to my blog post in the topic.

In addition to being an extension for ADS, there is also a website that you can use.  Like the extension, you can define some of the rules it will follow when formatting code.  Here is the link.

I think it is important to mentions, both the extension and the web site have a great price…..they are FREE!!!

Redgate SQL Prompt – This tool is very nice for formatting.  It allows the developer to utilize pre-defined styles or you can create your own.  SQL Prompt will be installed right into SQL Server Management Studio.  After the installation is complete, you will see a new menu item, SQL Prompt. When open, you will see a number of menu items that allow you to access the functionality.

This is a great tool!!!  While there is a cost with this tool, it is definitely worth it.

SQL Prompt can be downloaded here, https://www.red-gate.com.

Code Beautifier – This is a nice online tool. Here is the link to this tool, https://codebeautify.org/sqlformatter.  This tool is also available as an extension in Visual Studio Code.

As with many of the online formatters, there are options.  In this case, you can Beautify your code as in the image below.

Or you can “minify” your code as in the image below.

Additional Websites for formatting – these are just a few sites I found on line.  Please review them carefully and use at your own risk.

http://www.dpriver.com/pp/sqlformat.htm

https://sqlformat.org/

While following industry standards is important, it is also important just to have a standard any standard.  Hopefully, the standard you follow will have it’s roots in what considered best practice..

These are in no way an all-inclusive list.  There are other guidelines as well.  These are just the basic ones I have followed for a number of years. I in no way comes up with these, these are the guidelines that I have learned over the years from many different sources. Again, these are just my thoughts and have worked for me for quite a few years.

Thanks for visiting my blog!!!

ADS: TSQL Checker

Azure Data Studio has quite a few extensions that are very useful.  Over the past six to eight months I have completed a number of posts on quite a few of these extensions.  So far one of my favorite extension is the PoorSQL Formatter.  It provides great assistance in the formatting of Transaction SQL code. What it doesn’t do, is help identify code that is not in line with best practices.  This is where the TSQL Checker comes into play.

This extension was developed by Daniel Janick. Daniel is a Microsoft MVP from the Austin, TX area.  He can be found at @SQLSME on Twitter.

You might be wondering exactly what does this extension check.  According to the extension documentation this is what it checks:

“TSQL checks for hints, select * from options”

Of course “SELECT *” is a very well known problem in query design.  Hints and options can also create significant performance issues as well.   What exactly is a hint?  According to Microsoft’s documentation a hint is something that will over ride the execution plan that the query optimizer might select for a query. Here is a link to Microsoft’s documentation on hints.  Hints will act on the four DML statements, SELECT, INSERT, UPDATE and DELETE.  Hints should only be used as a last resort and only by experienced developers or DBAs.

While this extension is still in preview mode, it can still be very useful.  To install it you will click the “Install” button, green button below.  This will appear when you are looking at the documentation of the extension.  When you do, you will be asked to open the GitHub sight for this extension.

When you click open, you will be taken to the GitHub site for this extension.  You will want to download the tsqlchecker-0.0.1.vsix file.  It is highlighted below.  When saving it, save it in a location where it will not be overwritten.

To install this extension, simply go to the File menu and then go to “Install extension from Vsix file”.  While some extensions require a restart of Azure Data Studio, this extension does not.

The purpose of this extension is to help you identify code that is not inline with best practices.  In order to do so, the extension needs to mark the offending code in some manner.  The extension will do this one of three ways.

    1.  Highlighting the code with a box:
    2.  Highlighting the code in yellow:
    3. Highlighting the code in red:

Let’s take a look at some of the code that the TSQL Checker will flag.

Note: This code was take from the Microsoft documentation.  This really isn’t a post on what these things do, just how TSQL Checker flags them.  Refer to the Microsoft documentation for more information on these.

SELECT *

MAXDOP

FORCESEEK

FORCESCAN

LOOP JOIN

MERGE JOIN

OPTIMIZE FOR

HASH JOIN

For all of these warnings, the developer has also added a popup that will provide additional information.  When you float the cursor over the warning you will see somethings similar as you is in the image below.

I really like this extension.  It reminds us of code that is less than ideal and should be reviewed.  Remember, few if any of these hints are recommended for production use.  If you run into a situation that you think these are part of the solution, just do your homework first.  Research and see if there is a better way to address the problem.

Thanks for visiting my blog!!!