What is the difference between UNION and UNION ALL?


This link may help you: codefari.com/2015/10/what-is-difference-between-union-and.html

2019年05月27日02分39秒

w3schools.com/sql/sql_union.asp

2019年05月27日02分39秒

The implication of this is that UNION is much less performant, as it must scan the result for duplicates.

2019年05月27日02分39秒
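To make the duplicate handling concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables t1 and t2 and their contents are made up for illustration, and SQLite is only one engine; other databases should return the same rows for these two queries, but their execution plans and performance will differ.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (val INTEGER);
    CREATE TABLE t2 (val INTEGER);
    INSERT INTO t1 VALUES (1), (2), (2);
    INSERT INTO t2 VALUES (2), (3);
""")

# UNION removes duplicate rows from the combined result.
union_rows = conn.execute(
    "SELECT val FROM t1 UNION SELECT val FROM t2"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT val FROM t1 UNION ALL SELECT val FROM t2"
).fetchall()

# Compare as sorted lists, since neither query has an ORDER BY.
print(sorted(union_rows))      # [(1,), (2,), (3,)]
print(sorted(union_all_rows))  # [(1,), (2,), (2,), (2,), (3,)]
```

Note that UNION also collapses the two 2s that come from t1 alone, not just the overlap between t1 and t2.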

UNION ALL will indeed be more performant, specifically due to the lack of the distinct sort. My general practice is to use UNION ALL unless I specifically want duplicates removed.

2019年05月27日02分39秒

Just noticed that there are a lot of good comments/answers here, so I turned on the wiki flag and added a note about performance...

2019年05月27日02分39秒

UNION ALL can be slower than UNION in real-world cases where the network, such as the internet, is a bottleneck. The cost of transferring many duplicate rows can exceed the query execution time benefit. This has to be analyzed on a case-by-case basis.

2019年05月27日02分39秒

AdamCaviness: Your comment doesn't quite make sense.

2019年05月27日02分39秒

"all selected columns need to be of the same data type" -- actually, things aren't that strict (not a good thing from a relational model point of view!). The SQL standard says their respective column descriptor must be the same except in name.

2019年05月27日02分39秒

"In order to remove duplicates the result set must be sorted" -- maybe you have a particular vendor in mind but there are no vendor-specific tags on the question. Even if there were, could you prove that duplicates cannot be removed without sorting?

2019年05月27日02分39秒

DISTINCT will "implicitly" sort the results, because removing duplicates is quicker on a sorted set. This does not mean the returned result set is actually sorted that way, but in most cases DISTINCT (and therefore UNION) will internally sort the set of results.

2019年05月27日02分39秒

This will omit rows and therefore fail to produce the expected result if a contains NULL values. Also, it still does not return the same result as a UNION - UNION also removes duplicates that are returned by the subqueries, whereas your approach won't.

2019年05月27日02分39秒
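The answer being criticized is not shown here, so the following is only a guess at its shape: a sketch that assumes the emulation was UNION ALL with a NOT IN filter on the second branch. The tables t1 and t2 and the column a are hypothetical, and the run uses SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a INTEGER);
    CREATE TABLE t2 (a INTEGER);
    INSERT INTO t1 VALUES (1), (1), (NULL);
    INSERT INTO t2 VALUES (1), (2);
""")

# A real UNION: duplicates are removed across and within both branches.
true_union = conn.execute(
    "SELECT a FROM t1 UNION SELECT a FROM t2"
).fetchall()

# Assumed emulation (hypothetical): UNION ALL plus a NOT IN filter.
emulated = conn.execute("""
    SELECT a FROM t1
    UNION ALL
    SELECT a FROM t2 WHERE a NOT IN (SELECT a FROM t1)
""").fetchall()

# Because t1.a contains NULL, "2 NOT IN (1, 1, NULL)" evaluates to NULL,
# so the row with a = 2 is filtered out of the emulated result.
print((2,) in true_union)   # True
print((2,) in emulated)     # False

# The duplicate 1 inside t1 survives the emulation but not the UNION.
print(true_union.count((1,)))  # 1
print(emulated.count((1,)))    # 2
```

Both problems named in the comment show up: the NULL in a makes the NOT IN filter drop a row it should keep, and duplicates that originate inside a single branch are never removed.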

FrankSchmitt - thanks for this answer; this bit about subqueries is exactly what I wanted to know!

2019年05月27日02分39秒

Nitpick: UNION ALL isn't "offered" by T-SQL. UNION ALL is part of the ANSI SQL standard and not specific to MS SQL Server.

2019年05月27日02分39秒

The 'Nitpick' comment could imply that you can't use "Union All" in TSQL, but you can. Of course, the comment doesn't say that, but someone reading it may infer it.

2019年05月27日02分39秒

There is no SQL Server tag on this question. I think recommending the option that returns duplicates just because it usually performs best is the wrong advice.

2019年05月27日02分39秒

onedaywhen: I guess the OP used the phrase "SQL Servers" as a synonym for all RDBMSs (e.g. MySQL, PostgreSQL, Oracle, SQL Server). The wording is unfortunate, though (and of course, I might be mistaken).

2019年05月27日02分39秒

FrankSchmitt: none of the products you listed are truly RDBMSs :)

2019年05月27日02分39秒

onedaywhen: care to elaborate? At least en.wikipedia.org/wiki/Relational_database_management_system seems to agree with me: it explicitly mentions Microsoft SQL Server, Oracle Database and MySQL. Or are you being nitpicky about the difference between, e.g., Oracle and Oracle Database?

2019年05月27日02分39秒

Everything in this answer has been said already, is too confusing to be useful (suggesting joins over unions when they do different things, giving "blocking" as a reason without explaining what you mean by that or which database servers it applies to), or is highly misleading (your percentages in your screenshot are not applicable to real actual use of UNION/UNION ALL).

2019年05月27日02分39秒

Blocking Operators are well known operators in TSQL. Everything that blocking operators do can be achieved by Joins but not vice versa. Distinct Sort operation is circled in the picture to show why union all performs better than union and also to show exactly where it exists in the execution plan. Feel free to add more data to the tables T1 and T2 to play around with the percentages!

2019年05月27日02分39秒

You technically CAN produce the results of a union using a combination of joins and some really nasty CASE expressions, but it makes the query darn-near impossible to read and maintain, and in my experience it is also terrible for performance. Compare: select foo.bar from foo union select fizz.buzz from fizz against select case when foo.bar is null then fizz.buzz else foo.bar end from foo join fizz where foo.bar is null or fizz.buzz is null

2019年05月27日02分39秒

DBA: Your answer is only relevant for users of MS SQL Server. The OP never mentioned the RDBMS they're using - they might be using MySQL, PostgreSQL, Oracle, SQLite, ...

2019年05月27日02分39秒

Your second picture suggests the two are mutually exclusive when they are not. The picture should rather show the same as the first but with the 'intersection ellipse' shown a second time. Actually, on second thoughts, because the union all result is not a set, you should make no attempt to draw it using a Venn diagram!

2019年05月27日02分39秒

"Union Result set is sorted in ascending order" -- Unless there is an ORDER BY, sorted results are not guaranteed. Maybe you have a particular SQL vendor in mind (even then, ascending order of what exactly...?) but this question has no vendor-specific tags.

2019年05月27日02分39秒

"merges the contents of two structurally-compatible tables" -- I think you've stated this part really well :)

2019年05月27日02分39秒

Really? For a four-row result?! I would think this is a scenario where you would want to use UNION to convey intent (i.e. no duplicates) because UNION ALL is unlikely to give any real-life performance gain in absolute terms.

2019年05月27日02分39秒

True! UNION might change the order of the two sub-results.

2019年05月27日02分39秒

This is wrong. A UNION will NOT sort the result in ascending order. Any ordering you see in a result without using ORDER BY is pure coincidence. The DBMS is free to use any strategy it thinks is efficient to remove the duplicates. This might be sorting, but it could also be a hashing algorithm or something entirely different - and the strategy may change with the number of rows. A union that appears sorted with 100 rows might not be with 100,000 rows.

2019年05月27日02分39秒

Without an ORDER BY clause on the query, the RDBMS is free to return the rows in any sequence. The observation that the result set from a UNION operation is returned "in ascending order" is only a byproduct of a "sort unique" operation performed by the database. The behavior observed is not guaranteed. So don't rely on it. If the specification is to return rows in a particular order, then add an appropriate ORDER BY clause.

2019年05月27日02分39秒
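As a small illustration of the advice to add an explicit ORDER BY, here is a sketch using Python's sqlite3 module with hypothetical tables t1 and t2. The ORDER BY is written once, after the last SELECT, and applies to the combined result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (val INTEGER);
    CREATE TABLE t2 (val INTEGER);
    INSERT INTO t1 VALUES (3), (1);
    INSERT INTO t2 VALUES (2), (3);
""")

# The trailing ORDER BY sorts the whole UNION result, not just the
# second SELECT. Only this clause guarantees the row order.
ordered = conn.execute("""
    SELECT val FROM t1
    UNION
    SELECT val FROM t2
    ORDER BY val
""").fetchall()

print(ordered)  # [(1,), (2,), (3,)]
```

Without the ORDER BY, the same query may happen to come back sorted on a given engine and data size, but nothing in the query requires it.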

But your first chunk of results could be one row duplicated many times: how useful is that?!

2019年05月27日02分39秒

How does this add any value compared to the accepted answer?

2019年05月27日02分39秒

OK, no problem.

2019年05月27日02分39秒