LEFT JOIN performance for WHERE on secondary table
I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.
However recently I ran across a performance thought:
Say I have table user with millions of rows and table gender with only 2 rows
user gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records
If I'm doing
select * from user u left join gender g on u.genderId = g.id where g.show = 1;
Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?
Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?
If the behavior is dependent on engine type, let's talk about MySQL InnoDB.
mysql
bumped to the homepage by Community♦ 8 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.
However recently I ran across a performance thought:
Say I have table user with millions of rows and table gender with only 2 rows
user gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records
If I'm doing
select * from user u left join gender g on u.genderId = g.id where g.show = 1;
Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?
Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?
If the behavior is dependent on engine type, let's talk about MySQL InnoDB.
mysql
bumped to the homepage by Community♦ 8 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
2
Show theEXPLAIN query
output. Even if you have a gender.id pinned at 2,genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used.LEFT
vsRIGHT
won't matter at all.LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basicINNER JOIN
does many to many.
– danblack
Oct 19 '18 at 1:59
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
2
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operatesets
at all. All we can do is to emulate thesets
bylists
and perform thesearch
instead offetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"
– Kondybas
Oct 19 '18 at 12:01
What do you want to happen whenuser.gender_id = 3
and there is no corresponding row in the tablegender
.
– Rick James
Oct 22 '18 at 4:16
add a comment |
I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.
However recently I ran across a performance thought:
Say I have table user with millions of rows and table gender with only 2 rows
user gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records
If I'm doing
select * from user u left join gender g on u.genderId = g.id where g.show = 1;
Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?
Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?
If the behavior is dependent on engine type, let's talk about MySQL InnoDB.
mysql
I've been developing for years and through out the carrier I've been using only LEFT JOIN as it provides a more logical sense of mapping my one to many data.
However recently I ran across a performance thought:
Say I have table user with millions of rows and table gender with only 2 rows
user gender
===================== ================
id name genderId id gender show
===================== ================
1 Sam 1 1 Male 0
2 Samantha 2 2 Female 1
3 Another millions of records
If I'm doing
select * from user u left join gender g on u.genderId = g.id where g.show = 1;
Does SQL engine joins the whole million user records first, then scan the million rows of records for join projected value?
Or is the engine smart enough to query and join only the rows with correct gender? Or is RIGHT JOIN the correct method to provide best performance?
If the behavior is dependent on engine type, let's talk about MySQL InnoDB.
mysql
mysql
edited Oct 19 '18 at 6:40
a_horse_with_no_name
40.2k777112
40.2k777112
asked Oct 19 '18 at 1:27
Chor Wai ChunChor Wai Chun
1012
1012
bumped to the homepage by Community♦ 8 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 8 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
2
Show theEXPLAIN query
output. Even if you have a gender.id pinned at 2,genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used.LEFT
vsRIGHT
won't matter at all.LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basicINNER JOIN
does many to many.
– danblack
Oct 19 '18 at 1:59
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
2
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operatesets
at all. All we can do is to emulate thesets
bylists
and perform thesearch
instead offetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"
– Kondybas
Oct 19 '18 at 12:01
What do you want to happen whenuser.gender_id = 3
and there is no corresponding row in the tablegender
.
– Rick James
Oct 22 '18 at 4:16
add a comment |
2
Show theEXPLAIN query
output. Even if you have a gender.id pinned at 2,genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used.LEFT
vsRIGHT
won't matter at all.LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basicINNER JOIN
does many to many.
– danblack
Oct 19 '18 at 1:59
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
2
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operatesets
at all. All we can do is to emulate thesets
bylists
and perform thesearch
instead offetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"
– Kondybas
Oct 19 '18 at 12:01
What do you want to happen whenuser.gender_id = 3
and there is no corresponding row in the tablegender
.
– Rick James
Oct 22 '18 at 4:16
2
2
Show the
EXPLAIN query
output. Even if you have a gender.id pinned at 2, genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT
vs RIGHT
won't matter at all. LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN
does many to many.– danblack
Oct 19 '18 at 1:59
Show the
EXPLAIN query
output. Even if you have a gender.id pinned at 2, genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used. LEFT
vs RIGHT
won't matter at all. LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basic INNER JOIN
does many to many.– danblack
Oct 19 '18 at 1:59
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
2
2
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate
sets
at all. All we can do is to emulate the sets
by lists
and perform the search
instead of fetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"– Kondybas
Oct 19 '18 at 12:01
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate
sets
at all. All we can do is to emulate the sets
by lists
and perform the search
instead of fetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"– Kondybas
Oct 19 '18 at 12:01
What do you want to happen when
user.gender_id = 3
and there is no corresponding row in the table gender
.– Rick James
Oct 22 '18 at 4:16
What do you want to happen when
user.gender_id = 3
and there is no corresponding row in the table gender
.– Rick James
Oct 22 '18 at 4:16
add a comment |
1 Answer
1
active
oldest
votes
Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.
To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:
select *
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f220533%2fleft-join-performance-for-where-on-secondary-table%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.
To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:
select *
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
add a comment |
Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.
To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:
select *
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
add a comment |
Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.
To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:
select *
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]
Remember how SQL works: construct a set of tuples (rows) based on the given tables and joins, then filter those based on any where clause.
Because you have a where condition on the "right" table, your outer join is effectively converted into an inner join.
To retain the "outer-ness" of the join, put the filter condition into the join construct, rather than in the where clause:
select *
from user u
left join gender g
on u.genderId = g.id
and g.show = 1
[where ...]
answered Oct 19 '18 at 11:18
Phill W.Phill W.
80633
80633
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
add a comment |
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
that;s no a requirement of how it works, the planner is free to reorganise the query so long as the same set of results come out (modulo extensions like incomplete group bys)
– Jasen
Oct 19 '18 at 11:26
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
@Jasen By index hints we can force the planner to do exactly we want.
– Kondybas
Oct 19 '18 at 11:53
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f220533%2fleft-join-performance-for-where-on-secondary-table%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
2
Show the
EXPLAIN query
output. Even if you have a gender.id pinned at 2,genderId
in user may not be indexed, and even if it was it would be a large proportion of a secondary index, and you still need to look up name and id, so it might not be used.LEFT
vsRIGHT
won't matter at all.LEFT JOIN
should be used when there isn't always RHS match, rather than just one to many, even a basicINNER JOIN
does many to many.– danblack
Oct 19 '18 at 1:59
Your join is INNER JOIN in practice, because a condition by right table presents in WHERE clause.
– Akina
Oct 19 '18 at 5:27
2
Your where condition on the outer joined table turns the outer join back into an inner join
– a_horse_with_no_name
Oct 19 '18 at 6:40
DBMS engines are not smart enough to fetch only required entries. Bad news is that nobody is smart enough. Mankind is not able to operate
sets
at all. All we can do is to emulate thesets
bylists
and perform thesearch
instead offetch
. Therefore the correct question should sound like that: "Why this query over that tables with that indexing and that cardinalities is slow?"– Kondybas
Oct 19 '18 at 12:01
What do you want to happen when
user.gender_id = 3
and there is no corresponding row in the tablegender
.– Rick James
Oct 22 '18 at 4:16