作者:mobiledu2502915233 | 来源:互联网 | 2023-07-09 19:24
本文由编程笔记#小编为大家整理,主要介绍了sql 查询性能优化相关的知识,希望对你有一定的参考价值。
/**************************
SEARCH ARGUMENT ABLE QUERIES (SARGABLE)
***************************/
/*
SARGable is defined as “In relational databases, a condition (or predicate) in a query is said to
be sargable if the DBMS engine can take advantage of an index to speed up the execution of the query.
The term is derived from a contraction of Search ARGument ABLE”
Sargable queries are those which are able to use created indexes on them for faster searches and
execution of a query. A faster search means making an effective index seek out large numbers of
rows and avoiding costly index scans.
*Index Seek* – queries are able to use indexes effectively and locate rows with less effort from the
query optimizer.
*Index Scan* – scanning the entire table to locate rows to satisfy the search
criteria.
What Makes a Query Non-Sargable (not able to use created indexes effectively)?
1. using functions in Where clause conditions (because a function is evaluated against each row which
forces the query optimizer not to use the index)
2. using LIKE ‘%Proposal%’ in Wild card search queries
3. performing arithmetic calculation on an index column in a Where clause
Here are some main situations that lead to non-sargable queries, and how you can avoid them:
*/
-- 1. USING A FUNCITON IN WHERE CLAUSE TO SEARCH A STRING
--Non-sargable query because of function Used in WHERE clause
SELECT EmployeeName
FROM EmployeeTest
WHERE LEFT( EmployeeName,1 ) = 'A';
--Better sargable query that instead uses LIKE with a prefix (ie no wildcard at beginning of LIKE - '%A%')
SELECT EmployeeName
FROM EmployeeTest
WHERE EmployeeName LIKE 'A%';
-- 2. USING A FUNCITON IN WHERE CLAUSE TO SEARCH A DATE COLUMN
--Non-sargable query because of function used in WHERE clause
SELECT DateOfBirth
FROM EmployeeTest
WHERE YEAR (DateOfBirth) = '1952';
--Better sargable query that doesn't use function and instead searches within a consecutive range
SELECT DateOfBirth
FROM EmployeeTest
WHERE DateOfBirth >= &#39;19520101&#39; AND DateOfBirth <&#39;19530101&#39;;
-- 3. USING CALCULATIONS IN THE WHERE CLAUSE
--Non-sargable query because of calculation done on index column in WHERE clause
SELECT Salary
FROM EmployeeTest
WHERE Salary / 2 = 50147 ;
--Better, sargable query where calculation is done separately and then evaluated against index column
--(instead of actually calculating against the index column, which would manipulate that column
--and make it so SQL Server can&#39;t trust the order of the index)
SELECT Salary
FROM EmployeeTest
WHERE Salary = (50147 * 2);
-- 4. USING ISNULL() IN THE WHERE CLAUSE
--Non-sargable query because of ISNULL function on Index Column in WHERE Clause
SELECT EmployeeName
FROM EmployeeTest
where ISNULL (EmployeeName, &#39;Vru&#39;) = &#39;Vru&#39; ;
--Better, sargable query that doesn&#39;t use funciton and instead uses OR to test if row IS NULL
SELECT EmployeeName
FROM EmployeeTest
where EmployeeName = &#39;Vru&#39; OR EmployeeName IS NULL;
/**************************
SUBQUERY VS JOIN PERFORMANCE
***************************/
/*
Correlated subqueries and JOINs can return the same results. Sometimes subqueries will work better
than JOINs in terms of performance, and sometimes the other way around. You can figure out which works
better in certain instances by trying both methods in the same batch, and looking at the execution plan
to see what percentage of the work was consumed by each method. The method that used less of a percentage
will be the more efficient method.
As an example as to situations where JOINs perform better, for subqueries that need to apply computations
such as aggregates based on teh same set of rows, SQL Server will performan a separate access to the data
for each subquery. With a JOIN, you can apply multiple aggregate calucaltions based on the same access to
the data, and therefore the JOIN will be more efficient.
*/
-- Less efficient SUBQUERY example of multiple aggregate calculations being applied to the same rows
SELECT appeal_no
, campaign_no
, description
, total_tck_income
, total_tck_income / (SELECT sum(b.total_tck_income) FROM T_APPEAL as b WHERE b.campaign_no = a.campaign_no) AS PercentIncome -- correlated subquery, acting on same rows
, total_tck_income - (SELECT avg(c.total_tck_income) FROM T_APPEAL as c WHERE c.campaign_no = a.campaign_no) AS DiffFromAverage -- second correlated subquery, acting on same rows
FROM T_APPEAL AS a
WHERE total_tck_income <> 0
ORDER by campaign_no
-- More efficient JOIN example of the same query where multiple aggregate calculations are applied to the same rows
SELECT a.appeal_no
, a.campaign_no
, a.description
, a.total_tck_income
, total_tck_income / totalticket AS PercentIncome
, total_tck_income - averageticket AS DiffFromAverage
FROM T_APPEAL a
JOIN (SELECT campaign_no, SUM(total_tck_income) AS totalticket -- JOIN using subquery
, AVG(total_tck_income) as averageticket
FROM T_APPEAL GROUP by campaign_no) as b
ON a.campaign_no = b.campaign_no
WHERE total_tck_income <> 0
ORDER by campaign_no
/*
As an example of when a subquery will be more optimal than a JOIN, consider an example where you want
to pull all customers who do not have a contribution. The important index is the customer_no index in T_CUSTOMER.
The reason the subquery below is more efficient is due to the nuance of the optimzier. The bottom line
is that you should test the subquery method vs. the JOIN method to see which functions more efficiently
before putting into production.
*/
-- More efficient subquery method
SELECT * from T_CUSTOMER a
WHERE NOT EXISTS -- where NOT EXISTS the customer_no being in T_CONTRIBUTION
(SELECT * from T_CONTRIBUTION b WHERE a.customer_no = b.customer_no)
-- Less efficient JOIN method that returns the same results
SELECT a.customer_no from T_CUSTOMER a
LEFT JOIN T_CONTRIBUTION b on a.customer_no = b.customer_no
WHERE b.ref_no IS NULL -- similar to the NOT EXISTS from subquery example above
/**************************
TABLE EXPRESSIONS VS. TEMPORARY TABLE VS. TABLE VARIABLE PERFORMANCE
***************************/
/*
There are cases where a table expression is more optimal than a temporary table. This is the case
when you need to query the results only once - it doesn&#39;t make sense to waste the resources to persist
the results to a table when you&#39;re only using it once.
Temporary tables often have better performance when you have a starting table result that you need
to access repeated times, especially if it&#39;s a large or expensive result. If you use a derived
table and reference it multiple times in the outer query, the work is repeated each time. Therefore,
if you have an expensive inner query, it&#39;s best to instead persist this result to a temporary table,
and then reference the temporary table as many times as you need. Each time you reference the
temporary table, you will not be regenerating the result each time as you would be if you
used a dereived table.
The difference between a temporary table and a table variable is that SQL SERVER maintains full
statistics on temporary tables, but does not on table variables. Therefore, cardinality
estimates (estimates for row counts) tend to be more accurate with temporary tables. Therfore the
recommendation is that with small amounts of data with just a few rows, use table variables. With
larger tables sizes, the recommendation is to use temporary tables.
*/