- Published on
SQL's Order of Execution
- Authors
- Name
- Curtis Mitchell
- @curt_mitch
If you started learning SQL after having learned a language like Java or Python, it takes getting used to different concepts inherent in database querying, let alone the varied differences in the way SQL runs. Once you've started writing anything more complicated than the “Hello, World!” of SQL queries, meaning SELECT * FROM TABLE;
, then you've probably encountered errors that hint at one of the more unintuitive aspects of the language: the order of execution in a SQL statement.
As an example, say we're creating a derived column and then we want to group the results by that new column. In this code example we'll create a new column summing the value of multiple customer contracts from a hypothetical database:
SELECT
customer_id,
contract_1,
contract_2,
contract_3,
SUM(contract_1, contract_2, contract_3) AS contract_totals
FROM contract_table;
In the case where you want to order by the size of the contract totals, you would likely encounter an error stating that the column contract_totals
does not exist:
SELECT
customer_id,
contract_1,
contract_2,
contract_3,
SUM(contract_1, contract_2, contract_3) AS contract_totals
FROM contract_table
ORDER BY contract_totals;
It turns out that standard SQL runtimes will attempt to run a “group by” statement before the “select” statement, and in fact the “select” statement is one of the last parts of a SQL query that gets executed. The following is the order that a typical SQL statement is executed in with a brief description of that command:
FROM
: pick the table(s) to be queriedWHERE
: filter the rowsGROUP BY
: aggregate the rowsHAVING
: filter the aggregated rowsSELECT
: select the columns that appear in the outputORDER BY
: sort rows by value(s)LIMIT
: restrict the maximum number of returned rows
Unfortunately there's not a great way to work around this process and your best bet is to have this order of execution memorized. For instance, returning to our previous example with the contract amount summation, it's easiest to repeat the sum expression within the ORDER BY
statement:
SELECT
customer_id,
contract_1,
contract_2,
contract_3,
SUM(contract_1, contract_2, contract_3) AS contract_totals
FROM contract_table
ORDER BY sum(contract_1, contract_2, contract_3);
Even though this should set off every programmer's DRY alarm (DRY=“don't repeat yourself”), writing your SQL queries with the execution order in mind will reduce the number of errors you encounter, especially as you transition from a SQL newbie to a seasoned user!