Index for column with two possible value
I use PostgreSQL 9.4.
I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?
If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?
SQL for index:
CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");
My query:
SELECT
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';
Rows in the table: 64982
Thanks.
postgresql index
bumped to the homepage by Community♦ 20 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
I use PostgreSQL 9.4.
I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?
If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?
SQL for index:
CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");
My query:
SELECT
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';
Rows in the table: 64982
Thanks.
postgresql index
bumped to the homepage by Community♦ 20 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
Updated my post.
– kunashir
May 4 '16 at 11:31
Is your data skewed?
– Vérace
May 4 '16 at 11:52
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58
add a comment |
I use PostgreSQL 9.4.
I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?
If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?
SQL for index:
CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");
My query:
SELECT
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';
Rows in the table: 64982
Thanks.
postgresql index
I use PostgreSQL 9.4.
I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?
If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?
SQL for index:
CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");
My query:
SELECT
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';
Rows in the table: 64982
Thanks.
postgresql index
postgresql index
edited May 4 '16 at 11:27
kunashir
asked May 4 '16 at 10:58
kunashirkunashir
1033
1033
bumped to the homepage by Community♦ 20 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 20 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
Updated my post.
– kunashir
May 4 '16 at 11:31
Is your data skewed?
– Vérace
May 4 '16 at 11:52
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58
add a comment |
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
Updated my post.
– kunashir
May 4 '16 at 11:31
Is your data skewed?
– Vérace
May 4 '16 at 11:52
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
Updated my post.
– kunashir
May 4 '16 at 11:31
Updated my post.
– kunashir
May 4 '16 at 11:31
Is your data skewed?
– Vérace
May 4 '16 at 11:52
Is your data skewed?
– Vérace
May 4 '16 at 11:52
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58
add a comment |
2 Answers
2
active
oldest
votes
The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)
The index is still getting updated regardless of its usefulness in queries.
add a comment |
The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,
Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.
On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.
Thank you for your explanation. I try next querySelect participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.
– kunashir
May 4 '16 at 11:41
Since the index is purely on the.gender
then it will need to do a key lookup to the another index(or data) to pull out the.number
value, at this point it would possibly easier to look for the.number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
– Ste Bov
May 4 '16 at 11:47
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f137490%2findex-for-column-with-two-possible-value%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)
The index is still getting updated regardless of its usefulness in queries.
add a comment |
The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)
The index is still getting updated regardless of its usefulness in queries.
add a comment |
The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)
The index is still getting updated regardless of its usefulness in queries.
The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)
The index is still getting updated regardless of its usefulness in queries.
answered May 4 '16 at 11:26
Károly NagyKároly Nagy
2,5201611
2,5201611
add a comment |
add a comment |
The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,
Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.
On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.
Thank you for your explanation. I try next querySelect participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.
– kunashir
May 4 '16 at 11:41
Since the index is purely on the.gender
then it will need to do a key lookup to the another index(or data) to pull out the.number
value, at this point it would possibly easier to look for the.number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
– Ste Bov
May 4 '16 at 11:47
add a comment |
The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,
Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.
On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.
Thank you for your explanation. I try next querySelect participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.
– kunashir
May 4 '16 at 11:41
Since the index is purely on the.gender
then it will need to do a key lookup to the another index(or data) to pull out the.number
value, at this point it would possibly easier to look for the.number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
– Ste Bov
May 4 '16 at 11:47
add a comment |
The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,
Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.
On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.
The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,
Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.
On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.
answered May 4 '16 at 11:26
Ste BovSte Bov
1,548411
1,548411
Thank you for your explanation. I try next querySelect participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.
– kunashir
May 4 '16 at 11:41
Since the index is purely on the.gender
then it will need to do a key lookup to the another index(or data) to pull out the.number
value, at this point it would possibly easier to look for the.number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
– Ste Bov
May 4 '16 at 11:47
add a comment |
Thank you for your explanation. I try next querySelect participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.
– kunashir
May 4 '16 at 11:41
Since the index is purely on the.gender
then it will need to do a key lookup to the another index(or data) to pull out the.number
value, at this point it would possibly easier to look for the.number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
– Ste Bov
May 4 '16 at 11:47
Thank you for your explanation. I try next query
Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.– kunashir
May 4 '16 at 11:41
Thank you for your explanation. I try next query
Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10
, index still not use.– kunashir
May 4 '16 at 11:41
Since the index is purely on the
.gender
then it will need to do a key lookup to the another index(or data) to pull out the .number
value, at this point it would possibly easier to look for the .number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds– Ste Bov
May 4 '16 at 11:47
Since the index is purely on the
.gender
then it will need to do a key lookup to the another index(or data) to pull out the .number
value, at this point it would possibly easier to look for the .number
then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds– Ste Bov
May 4 '16 at 11:47
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f137490%2findex-for-column-with-two-possible-value%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17
For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19
Updated my post.
– kunashir
May 4 '16 at 11:31
Is your data skewed?
– Vérace
May 4 '16 at 11:52
@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58