Can performance be improved by breaking databases up?



























Performance is my overriding concern: a heavily recursive CTE is the engine room of my application. This CTE may have to work on 4 columns of up to 7 million rows, and queries can take over 30 minutes with just 1.2 million rows.



To reduce the waiting time for most users, I save minimalist two-column tables that capture the results of the long-running recursive CTE queries into a separate "cache" database. Thousands of these two-column tables, some containing nearly 7 million rows, accumulate in this database.
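For concreteness, each cached table has the same minimal shape as the p000001 table shown at the end of this question. A sketch of one such table's definition (the cache database name and dbo schema here are placeholders, not my real names):

-- Sketch of one cached result table; "cache" and "dbo" are placeholder names.
CREATE TABLE cache.dbo.p000001
(
    i int NOT NULL,                         -- individual id
    p decimal(28,25) NOT NULL,              -- accumulated percentage
    CONSTRAINT PK_p000001 PRIMARY KEY NONCLUSTERED (i)
);

CREATE CLUSTERED INDEX IX_p000001 ON cache.dbo.p000001 (p);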



Since both databases deal with subsets of the same basic records, I am wondering how best to store all 8-10 columns for each record so that the data can be joined efficiently with results from either database. I also have to keep the base data in the CTE-specific database consistent as new records are added. At the moment I keep and maintain the same data in both databases.



Is there a better way to do all of this? Should I just keep all the data in one database? If so, how will my long-running CTEs behave when interleaved with many short user queries, and won't all users then have to wait longer, as I suspect?



Recursive CTE:



USE [relationship]
GO
/****** Object: StoredProcedure [home].[proportion] Script Date: 2/24/2016 9:51:22 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

-- =============================================
-- Author:      Me
-- Create date: today
-- Description: CTE with normalized tables
-- =============================================
ALTER PROCEDURE [home].[proportion]
    @id int
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Clear any previous result for this originator.
    DELETE FROM relationship.home.persistedTemp WHERE originator = @id;

    WITH PctCTE(id, [percent], kind)
    AS
    (
        -- Anchor: the requested individual contributes 100%.
        SELECT individual, CONVERT(decimal(28,25), 100.0), model
        FROM relationship.home.relationshipRecords AS constituent
        WHERE constituent.individual = @id

        UNION ALL

        -- Each derivation step halves the contribution; which input
        -- column to join on depends on whether the parent's kind is 'M'.
        SELECT derivation.individual,
               CONVERT(decimal(28,25), constituent.[percent] / 2.0),
               derivation.model
        FROM relationship.home.relationshipRecords AS derivation
        INNER JOIN PctCTE AS constituent
            ON (constituent.kind = 'M'  AND derivation.inputKindB = constituent.id)
            OR (constituent.kind <> 'M' AND derivation.inputKindA = constituent.id)
    ),
    mergeCTE(i, p)
    AS
    (
        -- Sum the contributions reaching each individual over all paths.
        SELECT id, SUM([percent])
        FROM PctCTE
        GROUP BY id
    )
    INSERT INTO relationship.home.persistedTemp (originator, i, p, y)
    SELECT @id, tmp.i, tmp.p, w.yr
    FROM mergeCTE AS tmp
    LEFT JOIN relationship.home.beginning AS w
        ON tmp.i = w.individual;

    -- The originator itself is not part of the result.
    DELETE FROM relationship.home.persistedTemp WHERE originator = @id AND i = @id;
END


Note: inputKindA may be used to create many individuals (thousands), inputKindB may be used to create fewer individuals (up to 20), and both often create no individuals at all.
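To make the arithmetic concrete, here is a minimal, self-contained sketch of what the recursion computes, using a table variable with invented sample rows (the shape mirrors relationshipRecords; the percent column is renamed pct here only because PERCENT is a reserved word): an individual reached through N derivation steps contributes 100 / 2^N percent, and mergeCTE then sums the contributions arriving at each individual over all paths.

-- Invented sample data; the shape mirrors relationshipRecords.
DECLARE @rr TABLE (individual int, inputKindA int, inputKindB int, model char(1));
INSERT INTO @rr VALUES
    (1, NULL, NULL, 'M'),   -- the seed individual, model 'M'
    (2, NULL, 1,    'F'),   -- derived from 1 via inputKindB (1 is 'M')
    (3, 2,    NULL, 'F');   -- derived from 2 via inputKindA (2 is not 'M')

WITH PctCTE(id, pct, kind) AS
(
    SELECT individual, CONVERT(decimal(28,25), 100.0), model
    FROM @rr WHERE individual = 1
    UNION ALL
    SELECT d.individual, CONVERT(decimal(28,25), c.pct / 2.0), d.model
    FROM @rr AS d
    INNER JOIN PctCTE AS c
        ON (c.kind = 'M'  AND d.inputKindB = c.id)
        OR (c.kind <> 'M' AND d.inputKindA = c.id)
)
SELECT id, SUM(pct) AS pct
FROM PctCTE
GROUP BY id;
-- Returns 1 -> 100, 2 -> 50, 3 -> 25: each derivation step halves the contribution.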



The database associated with the CTE is called relationship and contains the following tables:



myids



record_id                   char(6)                     PK FK
reg                         nchar(10)                   PK
name                        nvarchar(60)
individual                  int                         FK U1
main                        int

FK_myids_names
IX_myids_1                  individual (ASC)            Unique
PK_myids                    record_id (ASC), reg (ASC)  Unique Clustered


names



individual                  int                         PK {I}
name                        nvarchar(60)

PK_names                    individual (ASC)            Unique Clustered


relationshipRecords



individual                  int                         FK
inputKindA                  int                         {I}
inputKindB                  int                         I2
model                       char(1)                     I3

FK_relationshipRecords_names
IX_relationshipRecords_B    inputKindB (ASC)
IX_relationshipRecords_mdl  model (ASC)
IX_relationshipRecords_id   individual (ASC)            Unique
IX_relationshipRecords_A    inputKindA (ASC)            Clustered


beginning



individual                  int                         FK
yr                          smallint                    {I}
color                       nvarchar(50)

FK_beginning_names
IX_beginning_culr           color (ASC)
IX_beginning_id             individual (ASC)            Unique
IX_beginning_yr             yr (ASC)                    Clustered


persistedTemp



originator                  int
i                           int
p                           decimal(28,25)
y                           smallint

IX_persistedTemp            originator (ASC)            Clustered


and record_log



record_id                   char(6)                     PK
version_id                  char(3)
cntry_id                    char(2)
record_nm                   nvarchar(255)
notes                       nvarchar(max)
updtd                       int
base_yr                     smallint
user_id                     nvarchar(255)
dflt_id                     char(15)
dflt_nm                     nvarchar(255)
css_ref                     nvarchar(255)
url_ref                     nvarchar(255)
bt_link                     nvarchar(150)

PK_record_log               record_id (ASC)             Unique Clustered


Finally, the database that stores the cached long-query results has the following tables:



hinfo, which holds redundant data from the names, relationshipRecords, and beginning tables above:



individual                  int                         PK U1
reg                         nvarchar(255)
name                        nvarchar(60)                U1
kinds                       nvarchar(121)
year                        smallint                    {I}
color                       nvarchar(50)
model                       char(1)

PK_hinfo                    individual (ASC)            Unique
IX_hinfo                    year (ASC)                  Clustered
IX_nameorder                individual (ASC), name (ASC) Unique


srch_log



individual                  int                         PK FK
last_id                     int
last_dt                     datetime2(7)
hit_cnt                     int
lapse                       int
saved                       bit

PK_srch_log                 individual (ASC)            Unique Clustered


Mostly, though, this database stores thousands of result tables of the following shape:



p000001



i                           int                         PK
p                           decimal(28,25)              I1

PK_p000001                  i (ASC)                     Unique
IX_p000001                  p (ASC)                     Clustered


Clearly, the caching is a major pain, especially since the cached tables must often be updated when new individuals are added to the records. Ideally, the CTE would simply run a whole lot faster.


sql-server performance-tuning recursive


asked Feb 24 '16 at 22:13, last edited Feb 29 '16 at 21:53 – user88097

  • Separating the tables into different databases will have minimal benefit, especially if they end up storing almost identical data. The number of rows you're referencing is substantial enough to start seeing issues, but not enough to be a definitive problem, so there is some potential that your CTE is poorly optimized. I assume you're using the recursive aspect of it? Also, creating thousands of narrow but very deep tables may be causing issues too. It's possible some query, table, and index redesign may fix your problems, but we would need more info.

    – Duffy, Feb 24 '16 at 22:31

  • I guess I am applying some of my queueing theory background in thinking that I can in essence "add an express line" and thus reduce average wait times overall. I hope that adding the CTE code, and some of the database structure, to this question can help your analysis of my situation.

    – user88097, Feb 29 '16 at 22:04

1 Answer








































This is not an answer but this part:



WITH PctCTE(id, percent, kind, count)
AS
( ... ),
mergeCTE(dups, i, p)
AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY count) AS dups,
           id, SUM(percent) OVER (PARTITION BY id)
    FROM PctCTE
)
INSERT INTO relationship.home.persistedTemp
SELECT @id, tmp.i, tmp.p, when.yr
FROM mergeCTE AS tmp
LEFT JOIN relationship.home.beginning when
    ON tmp.i = when.individual
WHERE dups = 1;


can be simplified to:



WITH PctCTE (id, percent, kind, count)
AS
( ... ),
mergeCTE (i, p)
AS
(
    SELECT id, SUM(percent)
    FROM PctCTE
    GROUP BY id
)
INSERT INTO relationship.home.persistedTemp
SELECT @id, tmp.i, tmp.p, w.yr
FROM mergeCTE AS tmp
LEFT JOIN relationship.home.beginning AS w
    ON tmp.i = w.individual;


If that's correct and I'm not missing something, then the recursive PctCTE can also be simplified: the count column doesn't seem to be used anywhere at all.
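A quick way to convince yourself the two forms agree, using a couple of invented rows (a sketch, not the real data):

DECLARE @t TABLE (id int, pct decimal(28,25));
INSERT INTO @t VALUES (7, 50), (7, 25), (8, 100);

-- ROW_NUMBER plus a windowed SUM, keeping only dups = 1 ...
SELECT id, p
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY id ORDER BY id) AS dups,
           id, SUM(pct) OVER (PARTITION BY id) AS p
    FROM @t
) AS x
WHERE dups = 1;

-- ... returns the same rows as the plain GROUP BY:
SELECT id, SUM(pct) AS p FROM @t GROUP BY id;
-- Both return (7, 75) and (8, 100).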






edited Feb 25 '16 at 10:09 – ypercubeᵀᴹ (community wiki, 2 revs)


























  • Good advice! Thank you. I implemented your suggested simplification and took about 20% off of the time it took to run an iteration against this stored procedure that returned 124389 rows, going from 2 min 27 sec to 1 min 57 sec. The only difference in the output was that it wasn't sorted by i. The sorting is much quicker to do elsewhere in the process.

    – user88097, Feb 29 '16 at 21:59












