Optimizing table/indices for getting the latest row (filtering by one additional column)
I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:
[ConfigurationID] INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL
The TimestampUtc
will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationID
s there will be lots of rows, for some very few, and a new row for any OrganizationID
may be INSERTed at any time.
If needed, I can guarantee uniqueness of TimestampUtc
(but it would be great to have a solution that didn't need that).
INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).
My goals are:
- Getting the
ConfigurationData
with the latestTimestampUtc
for a givenOrganizationID
should be extremely fast regardless of the size of the table - INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on
OrganizationID ASC, TimestampUtc DESC
is probably not a great idea).
Questions
I know I can denormalize and just store the latest ConfigurationData
in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)
sql-server database-design sql-server-2012 performance index
bumped to the homepage by Community♦ 26 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
add a comment |
I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:
[ConfigurationID] INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL
The TimestampUtc
will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationID
s there will be lots of rows, for some very few, and a new row for any OrganizationID
may be INSERTed at any time.
If needed, I can guarantee uniqueness of TimestampUtc
(but it would be great to have a solution that didn't need that).
INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).
My goals are:
- Getting the
ConfigurationData
with the latestTimestampUtc
for a givenOrganizationID
should be extremely fast regardless of the size of the table - INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on
OrganizationID ASC, TimestampUtc DESC
is probably not a great idea).
Questions
I know I can denormalize and just store the latest ConfigurationData
in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)
sql-server database-design sql-server-2012 performance index
bumped to the homepage by Community♦ 26 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
4
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05
add a comment |
I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:
[ConfigurationID] INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL
The TimestampUtc
will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationID
s there will be lots of rows, for some very few, and a new row for any OrganizationID
may be INSERTed at any time.
If needed, I can guarantee uniqueness of TimestampUtc
(but it would be great to have a solution that didn't need that).
INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).
My goals are:
- Getting the
ConfigurationData
with the latestTimestampUtc
for a givenOrganizationID
should be extremely fast regardless of the size of the table - INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on
OrganizationID ASC, TimestampUtc DESC
is probably not a great idea).
Questions
I know I can denormalize and just store the latest ConfigurationData
in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)
sql-server database-design sql-server-2012 performance index
I have a SQL Server table (SQL Server 2012 SP3 Standard edition) that stores a bunch of configuration information (basically text blobs) for different organizations. The schema is something like this:
[ConfigurationID] INT IDENTITY (1,1) NOT NULL,
[OrganizationID] INT NOT NULL,
[TimestampUtc] DATETIME NOT NULL,
[ConfigurationData] NVARCHAR (MAX) NOT NULL,
[ChangedBy] NVARCHAR (256) NOT NULL,
[Comment] NVARCHAR (MAX) NOT NULL,
[ChangeType] INT NOT NULL
The TimestampUtc
will always be increasing (I won't ever be INSERTing "back-dated" entries into the table), and the rows won't ever be UPDATEd (I'm only INSERTing new rows). For some OrganizationID
s there will be lots of rows, for some very few, and a new row for any OrganizationID
may be INSERTed at any time.
If needed, I can guarantee uniqueness of TimestampUtc
(but it would be great to have a solution that didn't need that).
INSERTs are relatively rare (at most dozens of times per day, but typically much less than that), reads are very frequent (essentially on every web request to my application).
My goals are:
- Getting the
ConfigurationData
with the latestTimestampUtc
for a givenOrganizationID
should be extremely fast regardless of the size of the table - INSERT performance doesn't matter too much, but I'd like to avoid horrible index fragmentation if at all possible (so my first idea of a unique clustered index on
OrganizationID ASC, TimestampUtc DESC
is probably not a great idea).
Questions
I know I can denormalize and just store the latest ConfigurationData
in one table and the historical log of previous values in another table, but is it possible to meet my goals with just one table? What's the best way to do it? (I.e. what's the best index structure? do I need to change anything about the table schema, etc.?)
sql-server database-design sql-server-2012 performance index
sql-server database-design sql-server-2012 performance index
edited Feb 16 '17 at 18:31
MDCCL
6,85331745
6,85331745
asked Feb 16 '17 at 16:54
Eugene OEugene O
1263
1263
bumped to the homepage by Community♦ 26 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
bumped to the homepage by Community♦ 26 mins ago
This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
4
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05
add a comment |
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
4
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
4
4
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05
add a comment |
2 Answers
2
active
oldest
votes
Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.
Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.
Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.
Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.
add a comment |
If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.
CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1
add a comment |
Your Answer
StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "182"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});
function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});
}
});
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f164568%2foptimizing-table-indices-for-getting-the-latest-row-filtering-by-one-additional%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.
Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.
Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.
Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.
add a comment |
Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.
Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.
Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.
Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.
add a comment |
Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.
Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.
Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.
Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.
Given the low rate of inserts, your proposed index will be completely fine, and perfect for the usage goal.
Given a fresh index with fill factor 100%, and sufficient history for each organisation to fill a page, there will be a page split on the first subsequent insert for each organisation. But then there won't be another page split for that organisation until the new page is filled up.
Even these splits and fragmentation can be mitigated by starting with a <100 fill factor, and regular reorganise.
Aaron's thought regarding using non-clustered was probably because that would allow you to cluster on the always ascending identity column, ensuring the only splits and fragmentation you will get is in a very compact separate index. But I suspect he was only mentioning that in the context of you wanting very much to avoid fragmentation, and not as something really necessary in this scenario.
answered Feb 17 '17 at 15:18
T.H.T.H.
81536
81536
add a comment |
add a comment |
If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.
CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1
add a comment |
If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.
CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1
add a comment |
If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.
CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1
If you alter your table and add a flag for the active records you can use a filtered index, in this case a unique on OrganizationID where IsActive=1. Of course you had to update the currently active record for an Organization before you can insert the new active entry.
CREATE UNIQUE INDEX UI_test ON dbo.test (OrganizationID) WHERE IsActive=1
answered Jun 16 '17 at 11:03
Stefan WilmsStefan Wilms
954
954
add a comment |
add a comment |
Thanks for contributing an answer to Database Administrators Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdba.stackexchange.com%2fquestions%2f164568%2foptimizing-table-indices-for-getting-the-latest-row-filtering-by-one-additional%23new-answer', 'question_page');
}
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Which version and edition of sql server are you using?
– SqlZim
Feb 16 '17 at 17:06
SQL Server 2012 SP3 Standard, I've added this to the question
– Eugene O
Feb 16 '17 at 17:13
4
Why do you place index fragmentation in such high regard? The index you describe sounds perfect (though if you are going after a specific OrgID, I wonder why that wouldn't be a non-clustered index, or why it has to be unique, since timestamps aren't guaranteed to be unique). Worry about fragmentation when it actually proves to be an obvious performance problem. Until then, design for workload performance, not to avoid potential fragmentation goblins.
– Aaron Bertrand♦
Feb 16 '17 at 17:42
With my proposed index, wouldn't it add a whole new page on literally every insert? Is this not something I should be worried about? Also, why would you recommend a non-clustered index?
– Eugene O
Feb 16 '17 at 18:47
Will your two NVARCHAR(MAX) columns potentially be very long? I assume that if so, you will not be indexing those columns. Check the discussion on NVARCHAR(MAX) at: stackoverflow.com/questions/148398/…
– RLF
Feb 16 '17 at 22:05