Index for column with two possible value












0














I use PostgreSQL 9.4.



I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?



If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?



SQL for index:



CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");


My query:



SELECT 
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';


Rows in the table: 64982



Thanks.










share|improve this question
















bumped to the homepage by Community 20 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
    – Vérace
    May 4 '16 at 11:17












  • For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
    – a_horse_with_no_name
    May 4 '16 at 11:19










  • Updated my post.
    – kunashir
    May 4 '16 at 11:31










  • Is your data skewed?
    – Vérace
    May 4 '16 at 11:52










  • @Vérace, not very much: male - 39973, female - 24346.
    – kunashir
    May 4 '16 at 11:58


















0














I use PostgreSQL 9.4.



I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?



If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?



SQL for index:



CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");


My query:



SELECT 
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';


Rows in the table: 64982



Thanks.










share|improve this question
















bumped to the homepage by Community 20 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.















  • Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
    – Vérace
    May 4 '16 at 11:17












  • For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
    – a_horse_with_no_name
    May 4 '16 at 11:19










  • Updated my post.
    – kunashir
    May 4 '16 at 11:31










  • Is your data skewed?
    – Vérace
    May 4 '16 at 11:52










  • @Vérace, not very much: male - 39973, female - 24346.
    – kunashir
    May 4 '16 at 11:58
















0












0








0







I use PostgreSQL 9.4.



I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?



If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?



SQL for index:



CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");


My query:



SELECT 
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';


Rows in the table: 64982



Thanks.










share|improve this question















I use PostgreSQL 9.4.



I have a database where a table have index in column which can contains two possible value only(gender: male/female). When I execute a query with condition on this column index was not use.
It's bad idea use index in such column?



If index useless in this case, will I have update for this index when add new record. In other words: Will I have increase insert speed if I remove this index?



SQL for index:



CREATE INDEX index_participations_on_gender
ON participations
USING btree
(gender COLLATE pg_catalog."default");


My query:



SELECT 
participations.last_name,
participations.first_name,
participations.birthday,
participations."number",
participations.chip_id,
participations.gender
FROM
public.participations
WHERE
participations.gender = 'male';


Rows in the table: 64982



Thanks.







postgresql index






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited May 4 '16 at 11:27







kunashir

















asked May 4 '16 at 10:58









kunashirkunashir

1033




1033





bumped to the homepage by Community 20 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.







bumped to the homepage by Community 20 mins ago


This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.














  • Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
    – Vérace
    May 4 '16 at 11:17












  • For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
    – a_horse_with_no_name
    May 4 '16 at 11:19










  • Updated my post.
    – kunashir
    May 4 '16 at 11:31










  • Is your data skewed?
    – Vérace
    May 4 '16 at 11:52










  • @Vérace, not very much: male - 39973, female - 24346.
    – kunashir
    May 4 '16 at 11:58




















  • Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
    – Vérace
    May 4 '16 at 11:17












  • For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
    – a_horse_with_no_name
    May 4 '16 at 11:19










  • Updated my post.
    – kunashir
    May 4 '16 at 11:31










  • Is your data skewed?
    – Vérace
    May 4 '16 at 11:52










  • @Vérace, not very much: male - 39973, female - 24346.
    – kunashir
    May 4 '16 at 11:58


















Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17






Not enough info. What sort of index? How many rcords? Skew - i.e. how many males/females? Your record structre DML. Your query.
– Vérace
May 4 '16 at 11:17














For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19




For a condition that would select half of the rows an index lookup won't make things faster. And please read: wiki.postgresql.org/wiki/SlowQueryQuestions then edit your question and add the missing information
– a_horse_with_no_name
May 4 '16 at 11:19












Updated my post.
– kunashir
May 4 '16 at 11:31




Updated my post.
– kunashir
May 4 '16 at 11:31












Is your data skewed?
– Vérace
May 4 '16 at 11:52




Is your data skewed?
– Vérace
May 4 '16 at 11:52












@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58






@Vérace, not very much: male - 39973, female - 24346.
– kunashir
May 4 '16 at 11:58












2 Answers
2






active

oldest

votes


















0














The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)



The index is still getting updated regardless of its usefulness in queries.






share|improve this answer





























    0














    The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,



    Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.



    On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.






    share|improve this answer





















    • Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
      – kunashir
      May 4 '16 at 11:41










    • Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
      – Ste Bov
      May 4 '16 at 11:47











    Your Answer








    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "182"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f137490%2findex-for-column-with-two-possible-value%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)



    The index is still getting updated regardless of its usefulness in queries.






    share|improve this answer


























      0














      The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)



      The index is still getting updated regardless of its usefulness in queries.






      share|improve this answer
























        0












        0








        0






        The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)



        The index is still getting updated regardless of its usefulness in queries.






        share|improve this answer












        The optimizer evaluates the possible gains of using indexes vs doing a full scan (filtering unwanted rows on the fly). As the ratio of filtered and total rows gets closer to 1 the benefit of using index decreases. The exact tipping point is dependent on the actual data, query, etc. so it's hard to say an exact number when it becomes useless. Generally speaking a 50/50 split is not sufficient for a B+TREE index to work efficiently and most cases full scan will be preferred over using indexes. (Unless it can be used for index-only scans)



        The index is still getting updated regardless of its usefulness in queries.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered May 4 '16 at 11:26









        Károly NagyKároly Nagy

        2,5201611




        2,5201611

























            0














            The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,



            Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.



            On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.






            share|improve this answer





















            • Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
              – kunashir
              May 4 '16 at 11:41










            • Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
              – Ste Bov
              May 4 '16 at 11:47
















            0














            The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,



            Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.



            On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.






            share|improve this answer





















            • Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
              – kunashir
              May 4 '16 at 11:41










            • Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
              – Ste Bov
              May 4 '16 at 11:47














            0












            0








            0






            The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,



            Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.



            On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.






            share|improve this answer












            The only query this would really benefit would be counting the male / female ratio. examples of queries your trying to run would be useful to give examples of what would help. the issue you have is an index that is basically a bit is that you put half of your data in one half, and half in the other half, there isn't actually any organisation to that index, it is barely any better than a heap,



            Consider your table as a multiple deck of cards(distinguishable by different backs), and you're looking for the 6 of hearts, the fact that its red(female) only helps you find all the red cards in those decks, from there, you still need hearts, and the fact that its a six, and which colour back it is (address = colour of back, age = number) both those other factors will narrow down your search far more than is it red or black.



            On the note of speed increase, technically yes your speed for inserting / deleting will increase (update shouldn't as you should very very rarely change someone's gender) the increase in speed will be minimal however since there isn't really any sorting going on to the index. I imagine the index is quite small since if it is M/F then every record is one character and the lookup to the main data.







            share|improve this answer












            share|improve this answer



            share|improve this answer










            answered May 4 '16 at 11:26









            Ste BovSte Bov

            1,548411




            1,548411












            • Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
              – kunashir
              May 4 '16 at 11:41










            • Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
              – Ste Bov
              May 4 '16 at 11:47


















            • Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
              – kunashir
              May 4 '16 at 11:41










            • Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
              – Ste Bov
              May 4 '16 at 11:47
















            Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
            – kunashir
            May 4 '16 at 11:41




            Thank you for your explanation. I try next query Select participants.* FROM public.participations WHERE participations.gender = 'male' AND participations.number = 10, index still not use.
            – kunashir
            May 4 '16 at 11:41












            Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
            – Ste Bov
            May 4 '16 at 11:47




            Since the index is purely on the .gender then it will need to do a key lookup to the another index(or data) to pull out the .number value, at this point it would possibly easier to look for the .number then work out which of those remaining are male, even if that means looking over a full table scan it will more than likely be more efficient than finding half the values in your table and then running a keylookup to each of them, (using a deck of cards again) take all the red cards and you have 26 cards to find the 10s, if you take the 10s out, then you only have 4 cards to find the reds
            – Ste Bov
            May 4 '16 at 11:47


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Database Administrators Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f137490%2findex-for-column-with-two-possible-value%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Liste der Baudenkmale in Friedland (Mecklenburg)

            Single-Malt-Whisky

            Czorneboh