Optimizing “random” choice in PostgreSQL

I have the following table

CREATE TABLE article (

    id bigserial PRIMARY KEY,

    text TEXT NOT NULL,

    category_id bigint REFERENCES categories (id),

    status TEXT NOT NULL,

    locale TEXT NOT NULL,

    source TEXT NOT NULL

)

from where I want to select, say, 10 random records from as many distinct categories as possible (hence the ORDER BY rn ASC clause in the CTE). I'm using the following CTE to achieve this:

WITH "partitioned_articles" AS

(

    SELECT

        *,

        ROW_NUMBER() OVER (PARTITION BY "articles"."category_id" 

        ORDER BY RANDOM() ASC) AS "rn"

    FROM

        "articles"

    WHERE

        "articles"."status" = 'PUBLISHED' AND 

        LOWER("articles"."locale") = LOWER('en_US') AND 

        "articles"."source" = 'WIKIPEDIA'

    ORDER BY

        rn ASC

)

SELECT

    *

FROM

    "partitioned_articles"

LIMIT

    10

EXPLAIN ANALYZE originally gave me the following output:

Limit  (cost=45587.17..45587.37 rows=10 width=473) (actual time=795.552..795.557 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=45586.04..45587.17 rows=452 width=306) (actual time=795.545..795.547 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21984kB

          ->  WindowAgg  (cost=45555.93..45566.10 rows=452 width=306) (actual time=574.822..673.524 rows=89190 loops=1)

                ->  Sort  (cost=45555.93..45557.06 rows=452 width=306) (actual time=574.808..623.847 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Seq Scan on articles  (cost=0.00..45536.00 rows=452 width=306) (actual time=0.190..364.287 rows=89190 loops=1)

                            Filter: ((status = 'PUBLISHED'::text) AND (source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 310810

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=795.550..795.554 rows=10 loops=1)

Planning time: 0.462 ms

Execution time: 805.519 ms

I tried adding

CREATE INDEX articles__locale ON articles (locale);

CREATE INDEX articles__source ON articles (source);

CREATE INDEX articles__status ON articles (status);

and ended up with the following execution plan:

Limit  (cost=43393.70..43393.90 rows=10 width=473) (actual time=686.982..686.991 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=43392.57..43393.70 rows=452 width=306) (actual time=686.976..686.978 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21976kB

          ->  WindowAgg  (cost=43362.47..43372.64 rows=452 width=306) (actual time=462.080..562.038 rows=89190 loops=1)

                ->  Sort  (cost=43362.47..43363.60 rows=452 width=306) (actual time=462.064..511.409 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Bitmap Heap Scan on articles  (cost=3102.54..43342.54 rows=452 width=306) (actual time=35.149..216.145 rows=89190 loops=1)

                            Recheck Cond: (status = 'PUBLISHED'::text)

                            Filter: ((source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 44388

                            Heap Blocks: exact=12897

                            ->  Bitmap Index Scan on articles__status  (cost=0.00..3102.42 rows=135200 width=0) (actual time=30.421..30.421 rows=133578 loops=1)

                                  Index Cond: (status = 'PUBLISHED'::text)

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=686.981..686.988 rows=10 loops=1)

Planning time: 0.621 ms

Execution time: 696.945 ms

which, from what I understand, is only marginally better. Is there anything else I can do to further optimize this query? I'm using PostgreSQL 9.4 (planning on migrating to 9.5 as soon as possible, but unfortunately it is still not an option) and the articles table has currently about 400,000 records.

PS - These names have been made up for the example, I'm not really scrapping anything from Wikipedia or from anywhere else ;-)

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

bumped to the homepage by Community♦ 4 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

What does "fictional" mean?

– ypercubeᵀᴹ
Mar 7 '16 at 12:36

1

I think using 9.5's tablesample could speed this up. Especially together with the tsm_system_rows extension: postgresql.org/docs/current/static/tsm-system-rows.html

– a_horse_with_no_name
Mar 7 '16 at 12:36

"Fictional" means that the names are fictional (made up for this example).

– Tony E. Stark
Mar 7 '16 at 12:46

@dezso I've edited the question to include the output of EXPLAIN ANALYZE.

– Tony E. Stark
Mar 7 '16 at 12:52

2

See here and here and here

– a_horse_with_no_name
Mar 7 '16 at 12:56

|
show 5 more comments

I have the following table

CREATE TABLE article (

    id bigserial PRIMARY KEY,

    text TEXT NOT NULL,

    category_id bigint REFERENCES categories (id),

    status TEXT NOT NULL,

    locale TEXT NOT NULL,

    source TEXT NOT NULL

)

from where I want to select, say, 10 random records from as many distinct categories as possible (hence the ORDER BY rn ASC clause in the CTE). I'm using the following CTE to achieve this:

WITH "partitioned_articles" AS

(

    SELECT

        *,

        ROW_NUMBER() OVER (PARTITION BY "articles"."category_id" 

        ORDER BY RANDOM() ASC) AS "rn"

    FROM

        "articles"

    WHERE

        "articles"."status" = 'PUBLISHED' AND 

        LOWER("articles"."locale") = LOWER('en_US') AND 

        "articles"."source" = 'WIKIPEDIA'

    ORDER BY

        rn ASC

)

SELECT

    *

FROM

    "partitioned_articles"

LIMIT

    10

EXPLAIN ANALYZE originally gave me the following output:

Limit  (cost=45587.17..45587.37 rows=10 width=473) (actual time=795.552..795.557 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=45586.04..45587.17 rows=452 width=306) (actual time=795.545..795.547 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21984kB

          ->  WindowAgg  (cost=45555.93..45566.10 rows=452 width=306) (actual time=574.822..673.524 rows=89190 loops=1)

                ->  Sort  (cost=45555.93..45557.06 rows=452 width=306) (actual time=574.808..623.847 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Seq Scan on articles  (cost=0.00..45536.00 rows=452 width=306) (actual time=0.190..364.287 rows=89190 loops=1)

                            Filter: ((status = 'PUBLISHED'::text) AND (source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 310810

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=795.550..795.554 rows=10 loops=1)

Planning time: 0.462 ms

Execution time: 805.519 ms

I tried adding

CREATE INDEX articles__locale ON articles (locale);

CREATE INDEX articles__source ON articles (source);

CREATE INDEX articles__status ON articles (status);

and ended up with the following execution plan:

Limit  (cost=43393.70..43393.90 rows=10 width=473) (actual time=686.982..686.991 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=43392.57..43393.70 rows=452 width=306) (actual time=686.976..686.978 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21976kB

          ->  WindowAgg  (cost=43362.47..43372.64 rows=452 width=306) (actual time=462.080..562.038 rows=89190 loops=1)

                ->  Sort  (cost=43362.47..43363.60 rows=452 width=306) (actual time=462.064..511.409 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Bitmap Heap Scan on articles  (cost=3102.54..43342.54 rows=452 width=306) (actual time=35.149..216.145 rows=89190 loops=1)

                            Recheck Cond: (status = 'PUBLISHED'::text)

                            Filter: ((source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 44388

                            Heap Blocks: exact=12897

                            ->  Bitmap Index Scan on articles__status  (cost=0.00..3102.42 rows=135200 width=0) (actual time=30.421..30.421 rows=133578 loops=1)

                                  Index Cond: (status = 'PUBLISHED'::text)

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=686.981..686.988 rows=10 loops=1)

Planning time: 0.621 ms

Execution time: 696.945 ms

PS - These names have been made up for the example, I'm not really scrapping anything from Wikipedia or from anywhere else ;-)

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

bumped to the homepage by Community♦ 4 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

What does "fictional" mean?

– ypercubeᵀᴹ
Mar 7 '16 at 12:36

1

I think using 9.5's tablesample could speed this up. Especially together with the tsm_system_rows extension: postgresql.org/docs/current/static/tsm-system-rows.html

– a_horse_with_no_name
Mar 7 '16 at 12:36

"Fictional" means that the names are fictional (made up for this example).

– Tony E. Stark
Mar 7 '16 at 12:46

@dezso I've edited the question to include the output of EXPLAIN ANALYZE.

– Tony E. Stark
Mar 7 '16 at 12:52

2

See here and here and here

– a_horse_with_no_name
Mar 7 '16 at 12:56

|
show 5 more comments

I have the following table

CREATE TABLE article (

    id bigserial PRIMARY KEY,

    text TEXT NOT NULL,

    category_id bigint REFERENCES categories (id),

    status TEXT NOT NULL,

    locale TEXT NOT NULL,

    source TEXT NOT NULL

)

from where I want to select, say, 10 random records from as many distinct categories as possible (hence the ORDER BY rn ASC clause in the CTE). I'm using the following CTE to achieve this:

WITH "partitioned_articles" AS

(

    SELECT

        *,

        ROW_NUMBER() OVER (PARTITION BY "articles"."category_id" 

        ORDER BY RANDOM() ASC) AS "rn"

    FROM

        "articles"

    WHERE

        "articles"."status" = 'PUBLISHED' AND 

        LOWER("articles"."locale") = LOWER('en_US') AND 

        "articles"."source" = 'WIKIPEDIA'

    ORDER BY

        rn ASC

)

SELECT

    *

FROM

    "partitioned_articles"

LIMIT

    10

EXPLAIN ANALYZE originally gave me the following output:

Limit  (cost=45587.17..45587.37 rows=10 width=473) (actual time=795.552..795.557 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=45586.04..45587.17 rows=452 width=306) (actual time=795.545..795.547 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21984kB

          ->  WindowAgg  (cost=45555.93..45566.10 rows=452 width=306) (actual time=574.822..673.524 rows=89190 loops=1)

                ->  Sort  (cost=45555.93..45557.06 rows=452 width=306) (actual time=574.808..623.847 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Seq Scan on articles  (cost=0.00..45536.00 rows=452 width=306) (actual time=0.190..364.287 rows=89190 loops=1)

                            Filter: ((status = 'PUBLISHED'::text) AND (source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 310810

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=795.550..795.554 rows=10 loops=1)

Planning time: 0.462 ms

Execution time: 805.519 ms

I tried adding

CREATE INDEX articles__locale ON articles (locale);

CREATE INDEX articles__source ON articles (source);

CREATE INDEX articles__status ON articles (status);

and ended up with the following execution plan:

Limit  (cost=43393.70..43393.90 rows=10 width=473) (actual time=686.982..686.991 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=43392.57..43393.70 rows=452 width=306) (actual time=686.976..686.978 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21976kB

          ->  WindowAgg  (cost=43362.47..43372.64 rows=452 width=306) (actual time=462.080..562.038 rows=89190 loops=1)

                ->  Sort  (cost=43362.47..43363.60 rows=452 width=306) (actual time=462.064..511.409 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Bitmap Heap Scan on articles  (cost=3102.54..43342.54 rows=452 width=306) (actual time=35.149..216.145 rows=89190 loops=1)

                            Recheck Cond: (status = 'PUBLISHED'::text)

                            Filter: ((source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 44388

                            Heap Blocks: exact=12897

                            ->  Bitmap Index Scan on articles__status  (cost=0.00..3102.42 rows=135200 width=0) (actual time=30.421..30.421 rows=133578 loops=1)

                                  Index Cond: (status = 'PUBLISHED'::text)

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=686.981..686.988 rows=10 loops=1)

Planning time: 0.621 ms

Execution time: 696.945 ms

PS - These names have been made up for the example, I'm not really scrapping anything from Wikipedia or from anywhere else ;-)

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

I have the following table

CREATE TABLE article (

    id bigserial PRIMARY KEY,

    text TEXT NOT NULL,

    category_id bigint REFERENCES categories (id),

    status TEXT NOT NULL,

    locale TEXT NOT NULL,

    source TEXT NOT NULL

)

from where I want to select, say, 10 random records from as many distinct categories as possible (hence the ORDER BY rn ASC clause in the CTE). I'm using the following CTE to achieve this:

WITH "partitioned_articles" AS

(

    SELECT

        *,

        ROW_NUMBER() OVER (PARTITION BY "articles"."category_id" 

        ORDER BY RANDOM() ASC) AS "rn"

    FROM

        "articles"

    WHERE

        "articles"."status" = 'PUBLISHED' AND 

        LOWER("articles"."locale") = LOWER('en_US') AND 

        "articles"."source" = 'WIKIPEDIA'

    ORDER BY

        rn ASC

)

SELECT

    *

FROM

    "partitioned_articles"

LIMIT

    10

EXPLAIN ANALYZE originally gave me the following output:

Limit  (cost=45587.17..45587.37 rows=10 width=473) (actual time=795.552..795.557 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=45586.04..45587.17 rows=452 width=306) (actual time=795.545..795.547 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21984kB

          ->  WindowAgg  (cost=45555.93..45566.10 rows=452 width=306) (actual time=574.822..673.524 rows=89190 loops=1)

                ->  Sort  (cost=45555.93..45557.06 rows=452 width=306) (actual time=574.808..623.847 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Seq Scan on articles  (cost=0.00..45536.00 rows=452 width=306) (actual time=0.190..364.287 rows=89190 loops=1)

                            Filter: ((status = 'PUBLISHED'::text) AND (source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 310810

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=795.550..795.554 rows=10 loops=1)

Planning time: 0.462 ms

Execution time: 805.519 ms

I tried adding

CREATE INDEX articles__locale ON articles (locale);

CREATE INDEX articles__source ON articles (source);

CREATE INDEX articles__status ON articles (status);

and ended up with the following execution plan:

Limit  (cost=43393.70..43393.90 rows=10 width=473) (actual time=686.982..686.991 rows=10 loops=1)

  CTE partitioned_articles

    ->  Sort  (cost=43392.57..43393.70 rows=452 width=306) (actual time=686.976..686.978 rows=10 loops=1)

          Sort Key: (row_number() OVER (?))

          Sort Method: external merge  Disk: 21976kB

          ->  WindowAgg  (cost=43362.47..43372.64 rows=452 width=306) (actual time=462.080..562.038 rows=89190 loops=1)

                ->  Sort  (cost=43362.47..43363.60 rows=452 width=306) (actual time=462.064..511.409 rows=89190 loops=1)

                      Sort Key: articles.category_id, (random())

                      Sort Method: external merge  Disk: 20368kB

                      ->  Bitmap Heap Scan on articles  (cost=3102.54..43342.54 rows=452 width=306) (actual time=35.149..216.145 rows=89190 loops=1)

                            Recheck Cond: (status = 'PUBLISHED'::text)

                            Filter: ((source = 'WIKIPEDIA'::text) AND (lower(locale) = 'en_us'::text))

                            Rows Removed by Filter: 44388

                            Heap Blocks: exact=12897

                            ->  Bitmap Index Scan on articles__status  (cost=0.00..3102.42 rows=135200 width=0) (actual time=30.421..30.421 rows=133578 loops=1)

                                  Index Cond: (status = 'PUBLISHED'::text)

  ->  CTE Scan on partitioned_articles  (cost=0.00..9.04 rows=452 width=473) (actual time=686.981..686.988 rows=10 loops=1)

Planning time: 0.621 ms

Execution time: 696.945 ms

PS - These names have been made up for the example, I'm not really scrapping anything from Wikipedia or from anywhere else ;-)

postgresql postgresql-9.4 execution-plan cte random

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

edited Mar 7 '16 at 15:30

Vérace

16k33350

edited Mar 7 '16 at 15:30

Vérace

16k33350

edited Mar 7 '16 at 15:30

Vérace

16k33350

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

asked Mar 7 '16 at 12:25

Tony E. Stark

1115

bumped to the homepage by Community♦ 4 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

bumped to the homepage by Community♦ 4 mins ago

This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.

What does "fictional" mean?

– ypercubeᵀᴹ
Mar 7 '16 at 12:36

1

I think using 9.5's tablesample could speed this up. Especially together with the tsm_system_rows extension: postgresql.org/docs/current/static/tsm-system-rows.html

– a_horse_with_no_name
Mar 7 '16 at 12:36

"Fictional" means that the names are fictional (made up for this example).

– Tony E. Stark
Mar 7 '16 at 12:46

@dezso I've edited the question to include the output of EXPLAIN ANALYZE.

– Tony E. Stark
Mar 7 '16 at 12:52

2

See here and here and here

– a_horse_with_no_name
Mar 7 '16 at 12:56

|
show 5 more comments

What does "fictional" mean?

– ypercubeᵀᴹ
Mar 7 '16 at 12:36

1

I think using 9.5's tablesample could speed this up. Especially together with the tsm_system_rows extension: postgresql.org/docs/current/static/tsm-system-rows.html

– a_horse_with_no_name
Mar 7 '16 at 12:36

"Fictional" means that the names are fictional (made up for this example).

– Tony E. Stark
Mar 7 '16 at 12:46

@dezso I've edited the question to include the output of EXPLAIN ANALYZE.

– Tony E. Stark
Mar 7 '16 at 12:52

2

See here and here and here

– a_horse_with_no_name
Mar 7 '16 at 12:56

What does "fictional" mean?

– ypercubeᵀᴹ
Mar 7 '16 at 12:36

I think using 9.5's tablesample could speed this up. Especially together with the tsm_system_rows extension: postgresql.org/docs/current/static/tsm-system-rows.html

– a_horse_with_no_name
Mar 7 '16 at 12:36

"Fictional" means that the names are fictional (made up for this example).

– Tony E. Stark
Mar 7 '16 at 12:46

@dezso I've edited the question to include the output of EXPLAIN ANALYZE.

– Tony E. Stark
Mar 7 '16 at 12:52

See here and here and here

– a_horse_with_no_name
Mar 7 '16 at 12:56

|
show 5 more comments

1 Answer
1

active

oldest

votes

Adding indexes isn't going to help much: ordering by a random value give the engine no choice but to scan the whole table and sort on disk^[1], whether the ordering is done directly in an ORDER BY clause or via a window function.

^{[1] though for small data sets it might not actually touch disk}

Without sampling functions being available there is very little you can do in a simple view. It might be possible for tables within integer keys to hack something efficient using a recursive CTE in MS SQL Server or similar, but CTEs are optimisation fences in Postgres so I doubt this would help here (without actually trying it, my gut reaction would be to assume it would perform far worse).

If you can use a stored procedure or business logic in another layer of your application it would be more efficient to read the highest and lowest IDs, then loop picking a random value until you have 10 that exist and select out those rows. This would not be suitable if your data has many large gaps in the IDs (i.e. if the sequence is more gap than used number).

A final option that may reduce the amount of IO is to select just that in the CTE ordering by random, then join to the base table to get the rest of the row. This way you will still scan the index and potentially write it to disk for sorting, but you wan't scan and write the whole table.

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

add a comment |

Your Answer

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f131493%2foptimizing-random-choice-in-postgresql%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

^{[1] though for small data sets it might not actually touch disk}

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

add a comment |

^{[1] though for small data sets it might not actually touch disk}

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

add a comment |

^{[1] though for small data sets it might not actually touch disk}

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

^{[1] though for small data sets it might not actually touch disk}

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

edited Mar 7 '16 at 17:25

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

answered Mar 7 '16 at 13:47

David Spillett

22.2k23267

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

add a comment |

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

Good catch, I had my SQL Server head on. Postgres' use of the word cluster relating to indexes is quite different. I've edited the response to remove the irrelevant detail.

– David Spillett
Mar 7 '16 at 17:28

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Database Administrators Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Vryjdfkk