During my SQL Server classes I am asked many times about how to format T-SQL. If you do a Google search on How to Format T-SQL, you will get a ton of results. There are many great posts on this topic. This post will identify some industry standards for formatting and my personal thoughts on formatting.
As you may know, formatting is very important. If a production outage is caused by a stored procedure, you as the DBA might be asked to review the code in the stored proc. That is expected; what might not be expected is poorly formatted code. Poorly formatted code could lead to a longer outage because you will need more time to read the code and figure out what it is doing. This is especially true if you did not write the code, or if you wrote it so long ago that you no longer remember it.
So here are my thoughts……
Capitalize all Keywords
While this is in no way a requirement, I believe that doing so makes the code cleaner and easier to read. I also think the keywords jump out more when they are in caps.
where ProductID > 350
With keywords in Caps:
WHERE ProductID > 350
Alias All Tables
I like to alias all tables, even if the query uses only one table. The reason for using an alias even with one table is that if the query evolves into one with more than one table, the initial table already has an alias and is set up for the addition of more tables.
When I use a table alias, I have two simple rules I follow.
- It needs to be somewhat descriptive – The reason for this is straightforward. I feel it makes it easier to determine which table each column is coming from.
- All aliases should have the same number of characters – When I write T-SQL code, I find it easier to read if the dots between the alias and the column name line up. If the aliases are all the same length, this is easier to do.
The code below has three table aliases, all of different lengths. To me it just seems busier and more difficult to read.
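A minimal sketch of the difference, assuming the AdventureWorks sample database (the query and the alias names are illustrative assumptions, not the original example). The first statement uses aliases of different lengths; the second uses three-character aliases so the dots line up:

```sql
-- Aliases of different lengths (p, sod, head): the dots do not line up.
SELECT  p.Name
      , sod.OrderQty
      , head.OrderDate
FROM    Production.Product p
        INNER JOIN Sales.SalesOrderDetail sod
            ON p.ProductID = sod.ProductID
        INNER JOIN Sales.SalesOrderHeader head
            ON sod.SalesOrderID = head.SalesOrderID;

-- Aliases of the same length (prd, sod, soh): the dots line up.
SELECT  prd.Name
      , sod.OrderQty
      , soh.OrderDate
FROM    Production.Product prd
        INNER JOIN Sales.SalesOrderDetail sod
            ON prd.ProductID = sod.ProductID
        INNER JOIN Sales.SalesOrderHeader soh
            ON sod.SalesOrderID = soh.SalesOrderID;
```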
Alias All Columns
When there is a table alias, we should use it for all columns. When an unqualified column name exists in both tables in the join, you will get an error like the one below.
Msg 209, Level 16, State 1, Line 2
Ambiguous column name 'ProductID'.
By using an alias on all columns, you can prevent this error. You can also make it easier to figure out which table each column comes from. Using the example below, it can be a bit challenging to figure out which table the ListPrice column comes from. Looking at the tables involved in the query, it could logically come from two of them, Production.Product and Sales.SalesOrderDetail. Because of the lack of an alias, this task becomes more difficult.
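A short sketch of both cases, again assuming the AdventureWorks sample database (the column selection is an illustrative assumption). The first batch should fail with Msg 209 because ProductID exists in both tables; the second qualifies every column:

```sql
-- ProductID is not qualified and exists in both tables: Msg 209.
SELECT  ProductID
      , Name
      , ListPrice
      , OrderQty
FROM    Production.Product prd
        INNER JOIN Sales.SalesOrderDetail sod
            ON prd.ProductID = sod.ProductID;
GO

-- Every column carries its alias, so the source table is obvious.
SELECT  prd.ProductID
      , prd.Name
      , prd.ListPrice
      , sod.OrderQty
FROM    Production.Product prd
        INNER JOIN Sales.SalesOrderDetail sod
            ON prd.ProductID = sod.ProductID;
GO
```

With the aliases in place, it is clear at a glance that ListPrice comes from Production.Product.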
Have Vertical Lines
I like to vertically line up elements of the statements. If you look below you will see an example of what my code will typically look like.
By doing this, I find it easier to identify which SELECT, FROM and WHERE are part of the same statement, as well as which ON goes with which JOIN. I also feel that the code is cleaner and easier to read.
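One possible layout, sketched against the AdventureWorks sample database (the exact column positions are an assumption about the style described, not a copy of the original example). SELECT, FROM and WHERE start in the same column, and the ON sits under its JOIN:

```sql
SELECT  prd.ProductID
      , prd.Name
      , sod.OrderQty
FROM    Production.Product prd
        INNER JOIN Sales.SalesOrderDetail sod
            ON prd.ProductID = sod.ProductID
WHERE   prd.ProductID > 350;
```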
Place Column Names Vertically
Over the years I have had to review code, written by others and honestly sometimes by myself, that placed the columns horizontally, similar to the first image below. I have found that placing the columns in this manner makes the code more difficult to review, especially if there is a function or a CASE statement.
Placing each column on its own line instead makes it easier to add or remove columns from the result set.
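To illustrate, here is a hedged sketch using Production.Product from the AdventureWorks sample database (the columns and the UPPER call are illustrative assumptions). The same column list appears first horizontally and then vertically:

```sql
-- Columns placed horizontally: the function call is easy to miss.
SELECT  prd.ProductID, prd.Name, UPPER(prd.Color) AS Color, prd.ListPrice
FROM    Production.Product prd;

-- One column per line: each expression stands on its own.
SELECT  prd.ProductID
      , prd.Name
      , UPPER(prd.Color) AS Color
      , prd.ListPrice
FROM    Production.Product prd;
```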
Place the Comma before the Column Name
If you can place the comma before, of course you can also place it after the column name. While some believe placing it after is the way to go, I have found that placing it before the column name works better for me. While you may develop your own preference, I think the most important thing here is that you have a standard and be consistent following it.
Looking at the example below, you can see that having the commas at the front makes it a bit easier to comment out a column. The only exception is the first column. If you comment out the first column, the comma at the front of the second line is still there and will cause an error.
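A small sketch of the idea, assuming the AdventureWorks sample database (the column choice is illustrative). Commenting out a middle column only requires touching that one line:

```sql
SELECT  prd.ProductID
      , prd.Name
   -- , prd.Color
      , prd.ListPrice
FROM    Production.Product prd;

-- The exception: if prd.ProductID were commented out instead,
-- ", prd.Name" would follow SELECT directly and raise a syntax error.
```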
Steps in the FROM and WHERE
When working in the FROM or WHERE clauses, I like to format the code in a way that resembles steps. I like to line up the INNER or OUTER keywords in the joins, but not the ON keyword, which I offset beneath its JOIN.
By doing this, I have found it easier to pair up the ON and the JOINs.
I also like to do something similar in the WHERE clause. By placing the AND keywords on different lines and offsetting them, again similar to stairs, I think it is easier to read what the criteria for the query are.
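The stair-step layout could look something like the sketch below. It assumes the AdventureWorks sample database, and the filter values and exact indent widths are illustrative assumptions:

```sql
SELECT  prd.Name
      , sod.OrderQty
      , soh.OrderDate
FROM    Production.Product prd
        INNER JOIN Sales.SalesOrderDetail sod
            ON prd.ProductID = sod.ProductID
        INNER JOIN Sales.SalesOrderHeader soh
            ON sod.SalesOrderID = soh.SalesOrderID
WHERE   prd.ProductID > 350
    AND sod.OrderQty > 1
        AND soh.OrderDate >= '20130101';
```

The JOIN keywords share one column and each ON steps in beneath its JOIN, while each AND in the WHERE clause steps in a little further than the one before.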
Sub-queries are not usually at the top of my list of potential solutions; however, I look at derived tables a bit differently. Because of this, I have used them from time to time in production code. Just like any other code, they should be formatted in a way that allows them to be easily read.
Kathi Kellenberger defines a derived table in her post at Apress.com in this way:
“Derived tables are subqueries that are used in the FROM clause instead of named tables”
When writing the code for a derived table, I still try to follow all the same rules, especially since it is still a SELECT statement.
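As a hedged sketch (AdventureWorks sample database assumed; the aggregate and the alias tot are illustrative), a derived table can follow the same rules as any other SELECT, with the inner statement indented as a unit:

```sql
SELECT  prd.Name
      , tot.TotalQty
FROM    Production.Product prd
        INNER JOIN (
                    SELECT  sod.ProductID
                          , SUM(sod.OrderQty) AS TotalQty
                    FROM    Sales.SalesOrderDetail sod
                    GROUP BY sod.ProductID
                   ) tot
            ON prd.ProductID = tot.ProductID;
```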
When using a CASE statement, I like to line up the WHEN keywords as well as the CASE and END, as seen below. I feel this just makes it easier to see your options in the CASE.
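A minimal sketch of that layout, assuming the AdventureWorks sample database (the price bands and the PriceBand column name are made up for illustration):

```sql
SELECT  prd.Name
      , prd.ListPrice
      , CASE
            WHEN prd.ListPrice = 0    THEN 'Not for sale'
            WHEN prd.ListPrice < 100  THEN 'Budget'
            WHEN prd.ListPrice < 1000 THEN 'Mid-range'
            ELSE 'Premium'
        END AS PriceBand
FROM    Production.Product prd;
```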
I think indenting is very important for any well formatted T-SQL code. This can easily be done by using the TAB key. There is a setting in both SSMS and Azure Data Studio that will define the number of spaces the cursor will move each time you press TAB. The default is 4. I find this to be the ideal number. I have seen code that has used 2 and for me, 4 makes it easier to see the indent. Therefore, in my opinion the code is easier to read.
In SQL Server Management Studio, you can set the number of spaces in the Options, which can be found under the Tools menu.
In Azure Data Studio, this same setting can be found in the Preferences, which can be found under the File menu.
With UNION and UNION ALL, I usually follow all the formatting guidelines for a SELECT statement. I do like to place a blank line before and after the UNION or UNION ALL. I feel this makes it easier to identify the two statements.
, 'Customer' AS 'Source'
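A complete statement in this style might look like the sketch below. The tables (Sales.Customer and Purchasing.Vendor from the AdventureWorks sample database) and the 'Vendor' label are assumptions chosen for illustration, not the original example:

```sql
SELECT  cst.AccountNumber
      , 'Customer' AS 'Source'
FROM    Sales.Customer cst

UNION ALL

SELECT  vnd.AccountNumber
      , 'Vendor' AS 'Source'
FROM    Purchasing.Vendor vnd;
```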
Comments are critical to any well written code. T-SQL must be self-documented. Comments are how this can be accomplished.
There are two methods you can use to comment your code. The first is by using two dashes. This will comment out any code that is to the right of the two dashes. In the image below, there are two examples of comments using the dashes.
The second method is to use a set of characters, /* and */. Any text between these will be commented out, as you can see below.
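Both styles in one short sketch, assuming the AdventureWorks sample database (the comment text itself is illustrative):

```sql
-- Two dashes comment out everything to their right on this line.
SELECT  prd.ProductID
      , prd.Name        -- a trailing comment after a column
FROM    Production.Product prd
/*
    Everything between the opening and closing markers is a comment,
    even when it spans several lines.
*/
WHERE   prd.ProductID > 350;
```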
When I really want a comment to pop out, I like to use the * to define a start and end of the comment. I like to do this more so when the code or procedure is very long. I think this is a great way to break a larger block of code into more readable sections.
Insert your comment code here.
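One way to build such a banner is sketched below. The width of the star rows and the step descriptions are assumptions, and the queries assume the AdventureWorks sample database:

```sql
/************************************************************
    Step 1: List the products to report on.
************************************************************/
SELECT  prd.ProductID
      , prd.Name
FROM    Production.Product prd
WHERE   prd.ProductID > 350;

/************************************************************
    Step 2: Total the quantity ordered for each product.
************************************************************/
SELECT  sod.ProductID
      , SUM(sod.OrderQty) AS TotalQty
FROM    Sales.SalesOrderDetail sod
GROUP BY sod.ProductID;
```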
If you are creating a stored procedure, it should include a “flower box”. This is a part of the code that provides critical information about the stored procedure.
I like the flower box to include the following information:
- Procedure Name
- Date created
- Who created it
- List of Parameters
- Sample code
- Historical modifications
Below is an example:
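A hedged sketch of a flower box covering the items listed above. The procedure name and parameter names come from the call shown later in this post; the parameter types, the procedure body, and the bracketed placeholders are assumptions for illustration only, written against the AdventureWorks sample database:

```sql
/*******************************************************************
Procedure Name : GetSalesByCustAndDate
Date Created   : <creation date>
Created By     : <author name>
Parameters     : @CustID     INT   Customer to report on
                 @StartDate  DATE  First order date to include
                 @EndDate    DATE  Last order date to include
Sample Code    : EXEC GetSalesByCustAndDate
                     @CustID = 34,
                     @StartDate = '02/01/2020',
                     @EndDate = '02/29/2020'
Modifications  :
    Date            Who             Description
    <date>          <name>          <what changed and why>
*******************************************************************/
CREATE PROCEDURE GetSalesByCustAndDate
    @CustID     INT,    -- type is an assumption
    @StartDate  DATE,   -- type is an assumption
    @EndDate    DATE    -- type is an assumption
AS
BEGIN
    SET NOCOUNT ON;

    SELECT  soh.SalesOrderID
          , soh.OrderDate
          , soh.TotalDue
    FROM    Sales.SalesOrderHeader soh
    WHERE   soh.CustomerID = @CustID
        AND soh.OrderDate >= @StartDate
            AND soh.OrderDate <= @EndDate;
END
```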
Calling a Stored Procedure
Calling a stored procedure is usually a relatively simple piece of code to write, especially if there are not any parameters involved.
If no parameters are involved, this code will just be a single line.
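For instance, with a hypothetical procedure named dbo.GetAllCustomers (the name is made up for illustration and does not ship with any sample database), the call is just:

```sql
-- dbo.GetAllCustomers is a hypothetical procedure that takes no parameters.
EXEC dbo.GetAllCustomers;
```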
However, if there are parameters involved, you have a few options to consider. Parameters can be called either by name or by position. My preference here is to call by name.
When calling by position, this is how the code would look. By looking at it, you can probably assume that the 34 is the customerID and the dates are the start and end dates for the range. I have found that assuming something gets me “unexpected results” sometimes, so I don’t like to assume.
EXEC GetSalesByCustAndDate 34, '02/01/2020', '02/29/2020'
I find that calling the parameters by name works better for me. I also like to place each parameter on a separate line with the @ lined up. While this is in no way a requirement, it just works for me.
EXEC GetSalesByCustAndDate
    @CustID = 34,
    @StartDate = '02/01/2020',
    @EndDate = '02/29/2020'
Tools to help format
PoorSQL Formatter – This is an extension for Azure Data Studio. I really like this extension and it is very easy to use. To use it, you will need Azure Data Studio and the extension itself. Azure Data Studio has a number of settings that define the rules PoorSQL Formatter will follow. Here is a link to my blog post on the topic.
In addition to being an extension for ADS, there is also a website that you can use. Like the extension, you can define some of the rules it will follow when formatting code. Here is the link.
I think it is important to mention that both the extension and the website have a great price…..they are FREE!!!
Redgate SQL Prompt – This tool is very nice for formatting. It allows the developer to use pre-defined styles or create their own. SQL Prompt is installed right into SQL Server Management Studio. After the installation is complete, you will see a new menu item, SQL Prompt. When you open it, you will see a number of menu items that give you access to the functionality.
This is a great tool!!! While there is a cost with this tool, it is definitely worth it.
SQL Prompt can be downloaded here, https://www.red-gate.com.
Code Beautifier – This is a nice online tool. Here is the link to this tool, https://codebeautify.org/sqlformatter. This tool is also available as an extension in Visual Studio Code.
As with many of the online formatters, there are options. In this case, you can Beautify your code as in the image below.
Or you can “minify” your code as in the image below.
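To give a rough idea of the difference, here is the same statement in a "beautified" and a "minified" form. This is a hand-written illustration assuming the AdventureWorks sample database, not the actual output of the tool, which will depend on the options you choose:

```sql
-- "Beautified": one clause and one column per line.
SELECT
    prd.ProductID,
    prd.Name
FROM Production.Product prd
WHERE prd.ProductID > 350;

-- "Minified": the same statement squeezed onto a single line.
SELECT prd.ProductID,prd.Name FROM Production.Product prd WHERE prd.ProductID>350;
```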
Additional Websites for formatting – these are just a few sites I found online. Please review them carefully and use at your own risk.
While following industry standards is important, it is also important just to have a standard, any standard. Hopefully, the standard you follow will have its roots in what is considered best practice.
This is in no way an all-inclusive list. There are other guidelines as well. These are just the basic ones I have followed for a number of years. I in no way came up with these; they are guidelines that I have learned over the years from many different sources. Again, these are just my thoughts, and they have worked for me for quite a few years.
Thanks for visiting my blog!!!