PostgreSQL: How to store high-dimensional (N > 100) vectors and index for fast lookup by cosine...

I am trying to store word/document embedding vectors in a PostgreSQL table, and I want to be able to quickly retrieve the N rows with the highest cosine similarity to a given query vector. The vectors I'm working with are NumPy arrays of floats with length L, where 100 <= L <= 1000.

I looked into the cube module for similarity search, but it is limited to vectors with at most 100 dimensions. The embeddings I am using have at least 100 dimensions and often many more (depending on the settings used when training the word2vec/doc2vec models).

What is the most efficient way to store high-dimensional vectors (NumPy float arrays) in PostgreSQL and perform fast lookups based on cosine similarity (or other vector similarity metrics)?
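For concreteness, the lookup described above amounts to the brute-force baseline below. This is only a minimal sketch in plain Python, not an answer to the indexing question: the names `cosine_similarity` and `top_n` are hypothetical, the rows are assumed to be already loaded as `(row_id, vector)` pairs, and returning 0.0 for a zero vector is a convention chosen here. It scans every row, which is exactly the cost an index would need to avoid.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_n(query, rows, n):
    """Return the n (row_id, similarity) pairs most similar to query.

    rows is an iterable of (row_id, vector) pairs; every row is scored,
    so the cost grows linearly with the number of rows.
    """
    scored = ((row_id, cosine_similarity(query, vec)) for row_id, vec in rows)
    return heapq.nlargest(n, scored, key=lambda pair: pair[1])


# Tiny illustrative data set (3 dimensions instead of 100+).
rows = [
    ("a", [1.0, 0.0, 0.0]),
    ("b", [0.9, 0.1, 0.0]),
    ("c", [0.0, 1.0, 0.0]),
]
result = top_n([1.0, 0.0, 0.0], rows, 2)
```

With real embeddings the same computation would normally be vectorized with NumPy, but the result to reproduce from the database is the same: the N row ids with the highest similarity scores.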

edited 2 mins ago

asked 4 hours ago

J. Taylor

postgresql index array dimension
