Showing posts with label schema. Show all posts
Showing posts with label schema. Show all posts

Friday, February 24, 2012

Design to accomodate entity based schema versioning

in simple words it's about versioning at record level.

Example

TableEmployee - EmployeeId, EmployeeName,
EmployeeAddress, DepartmentId,
TableDesignationMap - EmployeeId, DesignationId, EffectiveDate,
validity
TableDepartment - DepartmentId, Department
TableDesignation - DesignationId, designation

Via Modify-Employee-Details screen following are editable
EmoyeeName
EmployeeAddress
Department
Designation

this screen should allow user to navigate through changes history.
Example :

Version -1
EmoyeeName John Smith
EmployeeAddress 60 NewYork
Department Accounts
Designation Accountant

Version -2
EmoyeeName John Smith
EmployeeAddress 60 NewYork
Department Accounts
Designation Chief Accountant - changed

Version -3
EmoyeeName John Smith
EmployeeAddress 60 NewYork
Department Sales - changed
Designation Marketing Manager - changed

Question :
What is the best proposed database design for maintaining history
records bound with version and retrieval technique

Best Regards
SasankaSasanka Pinidiya (pinidiya@.yahoo.com) writes:
> in simple words it's about versioning at record level.
> Example
> TableEmployee - EmployeeId, EmployeeName,
> EmployeeAddress, DepartmentId,
> TableDesignationMap - EmployeeId, DesignationId, EffectiveDate,
> validity
> TableDepartment - DepartmentId, Department
> TableDesignation - DesignationId, designation
> Via Modify-Employee-Details screen following are editable
> EmoyeeName
> EmployeeAddress
> Department
> Designation
>...
> Question :
> What is the best proposed database design for maintaining history
> records bound with version and retrieval technique

There are several techniques, and which is the best depends on the actual
business case.

One techniques which makes it very simple to retrieve the data at a
certain date, is store one version each day. Clearly, this technique
is not appropriate here. This technique is good for things that changes
about daily as for instance currency prices.

Another technique is store key, date and then what changed. But this
makes is difficult to retrieve the complete information at a given point
in time.

So the best technique seems to be to have a table with the same columns
as the base table, and with a datetime column added to hold the time
of the change.

--
Erland Sommarskog, SQL Server MVP, sommar@.algonet.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

Sunday, February 19, 2012

design question

Hi,
Please see the three table below, I'm designing the db schema so that it
maintains each Client has exactly 1 default service.
table ServiceType
(
ServiceTypeId,
...
...
)
table Client
(
ClientId,
...
...
)
table ClientAvailableService
(
ClientId foreign key
ServiceTypeId foreign key
IsDefaultService bit
...
...
)
The first possible way is to create IsDefaultService column in the
ClientAvailableService and use trigger to ensure that there is exactly one
row for each ClientId has the bit turns on, but I'm not sure if that i can
handle scenario when my application user wants to switch default service as
it would either need to turn the existing off first (which would immediately
violate the trigger checking) then turn on the new one OR turn on the new
one (which violates the trigger checking) before turning off the existing
one. Overall this strategy, if it works, would still be messy and
inefficient as the trigger needs to go through many rows in the table to
maintain it.
The second way is to do add a ranking column in decimal type or int type,
the lowest number would mean that it's the default, this would definitely
work but retrieving the default one would require reading and sorting many
rows.
The third way is to add another table
table ClientDefaultAvailableService
(
ClientId
ServiceTypeId
primarykey( ClientId, ServiceTypeId )
)
This would require joining to find out which one is default even when I need
to retrieve the entire list of ClientAvailableService for a particular
client.
Would someone out there have or know a better way to implement this or just
have some comment about this issue? I assume experienced db designers must
have run into this issue before. Thank you very much.
"Zeng" <Zeng5000@.hotmail.com> wrote in message
news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> Hi,
> Please see the three table below, I'm designing the db schema so that it
> maintains each Client has exactly 1 default service.
> table ServiceType
> (
> ServiceTypeId,
> ...
> ...
> )
> table Client
> (
> ClientId,
> ...
> ...
> )
> table ClientAvailableService
> (
> ClientId foreign key
> ServiceTypeId foreign key
> IsDefaultService bit
> ...
> ...
> )
>
....
>
> Would someone out there have or know a better way to implement this or
> just
> have some comment about this issue? I assume experienced db designers
> must
> have run into this issue before. Thank you very much.
>
>
The correct way to model this is to add a DefaultService column to Client.
There's really two distinct relationships beteen client and ServiceType.
Each client has one to many ServiceTypes, and Each client has at most one
Default ServiceType.
If you model both relationships then there's no need for triggers. You do
have to do some things in a certian order. For instance you have to add a
Client, then Add services then choose one of those services as the Default
service. But those operational constraints make good sense in the model.
create table ServiceType
(
ServiceTypeID int primary key
)
create table Client
(
ClientID int primary key,
DefaultService int null references ServiceType
)
create table ClientServiceType
(
ClientID int references Client on delete cascade,
ServiceType int references ServiceType,
constraint pk_ClientServiceType
primary key(ClientID,ServiceType)
)
--and this guarantees that the client's DefaultService
--is in that client's list of ServiceTypes.
alter table client
add constraint fk_DefaultServise
foreign key (ClientID,DefaultService) references ClientServiceType
insert into ServiceType values (1)
insert into ServiceType values (2)
insert into ServiceType values (3)
insert into Client(ClientID) values (1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,2)
update client set DefaultService = 2 where ClientID =1
David
|||David, thank you so much for your help!
"David Browne" <davidbaxterbrowne no potted meat@.hotmail.com> wrote in
message news:O29kge8kFHA.3436@.tk2msftngp13.phx.gbl...
> "Zeng" <Zeng5000@.hotmail.com> wrote in message
> news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> ...
> The correct way to model this is to add a DefaultService column to Client.
> There's really two distinct relationships beteen client and ServiceType.
> Each client has one to many ServiceTypes, and Each client has at most one
> Default ServiceType.
> If you model both relationships then there's no need for triggers. You do
> have to do some things in a certian order. For instance you have to add a
> Client, then Add services then choose one of those services as the Default
> service. But those operational constraints make good sense in the model.
>
> create table ServiceType
> (
> ServiceTypeID int primary key
> )
> create table Client
> (
> ClientID int primary key,
> DefaultService int null references ServiceType
> )
> create table ClientServiceType
> (
> ClientID int references Client on delete cascade,
> ServiceType int references ServiceType,
> constraint pk_ClientServiceType
> primary key(ClientID,ServiceType)
> )
> --and this guarantees that the client's DefaultService
> --is in that client's list of ServiceTypes.
> alter table client
> add constraint fk_DefaultServise
> foreign key (ClientID,DefaultService) references ClientServiceType
>
> insert into ServiceType values (1)
> insert into ServiceType values (2)
> insert into ServiceType values (3)
> insert into Client(ClientID) values (1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,2)
> update client set DefaultService = 2 where ClientID =1
> David
>
|||On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
(snip)
>The second way is to do add a ranking column in decimal type or int type,
>the lowest number would mean that it's the default, this would definitely
>work but retrieving the default one would require reading and sorting many
>rows.
Hi Zeng,
Though David's suggestion is very good, a ranking column is a very good
idea as well. Finding the default would actually be fast if there is an
index (and there will be one, since the combination of ClientID / Rank
should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
index is automatically added.
Best, Hugo
(Remove _NO_ and _SPAM_ to get my e-mail address)
|||Thanks!
"Hugo Kornelis" <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message
news:espie1djg0dtjsf5c5chp8u3bca6frl3oa@.4ax.com... [vbcol=seagreen]
> On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
> (snip)
many
> Hi Zeng,
> Though David's suggestion is very good, a ranking column is a very good
> idea as well. Finding the default would actually be fast if there is an
> index (and there will be one, since the combination of ClientID / Rank
> should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
> index is automatically added.
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)

Friday, February 17, 2012

design question

Hi,
Please see the three table below, I'm designing the db schema so that it
maintains each Client has exactly 1 default service.
table ServiceType
(
ServiceTypeId,
...
...
)
table Client
(
ClientId,
...
...
)
table ClientAvailableService
(
ClientId foreign key
ServiceTypeId foreign key
IsDefaultService bit
...
...
)
The first possible way is to create IsDefaultService column in the
ClientAvailableService and use trigger to ensure that there is exactly one
row for each ClientId has the bit turns on, but I'm not sure if that i can
handle scenario when my application user wants to switch default service as
it would either need to turn the existing off first (which would immediately
violate the trigger checking) then turn on the new one OR turn on the new
one (which violates the trigger checking) before turning off the existing
one. Overall this strategy, if it works, would still be messy and
inefficient as the trigger needs to go through many rows in the table to
maintain it.
The second way is to do add a ranking column in decimal type or int type,
the lowest number would mean that it's the default, this would definitely
work but retrieving the default one would require reading and sorting many
rows.
The third way is to add another table
table ClientDefaultAvailableService
(
ClientId
ServiceTypeId
primarykey( ClientId, ServiceTypeId )
)
This would require joining to find out which one is default even when I need
to retrieve the entire list of ClientAvailableService for a particular
client.
Would someone out there have or know a better way to implement this or just
have some comment about this issue? I assume experienced db designers must
have run into this issue before. Thank you very much."Zeng" <Zeng5000@.hotmail.com> wrote in message
news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> Hi,
> Please see the three table below, I'm designing the db schema so that it
> maintains each Client has exactly 1 default service.
> table ServiceType
> (
> ServiceTypeId,
> ...
> ...
> )
> table Client
> (
> ClientId,
> ...
> ...
> )
> table ClientAvailableService
> (
> ClientId foreign key
> ServiceTypeId foreign key
> IsDefaultService bit
> ...
> ...
> )
>
...
>
> Would someone out there have or know a better way to implement this or
> just
> have some comment about this issue? I assume experienced db designers
> must
> have run into this issue before. Thank you very much.
>
>
The correct way to model this is to add a DefaultService column to Client.
There's really two distinct relationships beteen client and ServiceType.
Each client has one to many ServiceTypes, and Each client has at most one
Default ServiceType.
If you model both relationships then there's no need for triggers. You do
have to do some things in a certian order. For instance you have to add a
Client, then Add services then choose one of those services as the Default
service. But those operational constraints make good sense in the model.
create table ServiceType
(
ServiceTypeID int primary key
)
create table Client
(
ClientID int primary key,
DefaultService int null references ServiceType
)
create table ClientServiceType
(
ClientID int references Client on delete cascade,
ServiceType int references ServiceType,
constraint pk_ClientServiceType
primary key(ClientID,ServiceType)
)
--and this guarantees that the client's DefaultService
--is in that client's list of ServiceTypes.
alter table client
add constraint fk_DefaultServise
foreign key (ClientID,DefaultService) references ClientServiceType
insert into ServiceType values (1)
insert into ServiceType values (2)
insert into ServiceType values (3)
insert into Client(ClientID) values (1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,2)
update client set DefaultService = 2 where ClientID =1
David|||On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
(snip)
>The second way is to do add a ranking column in decimal type or int type,
>the lowest number would mean that it's the default, this would definitely
>work but retrieving the default one would require reading and sorting many
>rows.
Hi Zeng,
Though David's suggestion is very good, a ranking column is a very good
idea as well. Finding the default would actually be fast if there is an
index (and there will be one, since the combination of ClientID / Rank
should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
index is automatically added.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||David, thank you so much for your help!
"David Browne" <davidbaxterbrowne no potted meat@.hotmail.com> wrote in
message news:O29kge8kFHA.3436@.tk2msftngp13.phx.gbl...
> "Zeng" <Zeng5000@.hotmail.com> wrote in message
> news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> > Hi,
> >
> > Please see the three table below, I'm designing the db schema so that it
> > maintains each Client has exactly 1 default service.
> >
> > table ServiceType
> > (
> > ServiceTypeId,
> > ...
> > ...
> > )
> >
> > table Client
> > (
> > ClientId,
> > ...
> > ...
> > )
> >
> > table ClientAvailableService
> > (
> > ClientId foreign key
> > ServiceTypeId foreign key
> > IsDefaultService bit
> > ...
> > ...
> >
> > )
> >
> ...
> >
> >
> > Would someone out there have or know a better way to implement this or
> > just
> > have some comment about this issue? I assume experienced db designers
> > must
> > have run into this issue before. Thank you very much.
> >
> >
> >
> The correct way to model this is to add a DefaultService column to Client.
> There's really two distinct relationships beteen client and ServiceType.
> Each client has one to many ServiceTypes, and Each client has at most one
> Default ServiceType.
> If you model both relationships then there's no need for triggers. You do
> have to do some things in a certian order. For instance you have to add a
> Client, then Add services then choose one of those services as the Default
> service. But those operational constraints make good sense in the model.
>
> create table ServiceType
> (
> ServiceTypeID int primary key
> )
> create table Client
> (
> ClientID int primary key,
> DefaultService int null references ServiceType
> )
> create table ClientServiceType
> (
> ClientID int references Client on delete cascade,
> ServiceType int references ServiceType,
> constraint pk_ClientServiceType
> primary key(ClientID,ServiceType)
> )
> --and this guarantees that the client's DefaultService
> --is in that client's list of ServiceTypes.
> alter table client
> add constraint fk_DefaultServise
> foreign key (ClientID,DefaultService) references ClientServiceType
>
> insert into ServiceType values (1)
> insert into ServiceType values (2)
> insert into ServiceType values (3)
> insert into Client(ClientID) values (1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,2)
> update client set DefaultService = 2 where ClientID =1
> David
>|||Thanks!
"Hugo Kornelis" <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message
news:espie1djg0dtjsf5c5chp8u3bca6frl3oa@.4ax.com...
> On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
> (snip)
> >The second way is to do add a ranking column in decimal type or int type,
> >the lowest number would mean that it's the default, this would definitely
> >work but retrieving the default one would require reading and sorting
many
> >rows.
> Hi Zeng,
> Though David's suggestion is very good, a ranking column is a very good
> idea as well. Finding the default would actually be fast if there is an
> index (and there will be one, since the combination of ClientID / Rank
> should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
> index is automatically added.
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)

design question

Hi,
Please see the three table below, I'm designing the db schema so that it
maintains each Client has exactly 1 default service.
table ServiceType
(
ServiceTypeId,
..
..
)
table Client
(
ClientId,
...
...
)
table ClientAvailableService
(
ClientId foreign key
ServiceTypeId foreign key
IsDefaultService bit
..
..
)
The first possible way is to create IsDefaultService column in the
ClientAvailableService and use trigger to ensure that there is exactly one
row for each ClientId has the bit turns on, but I'm not sure if that i can
handle scenario when my application user wants to switch default service as
it would either need to turn the existing off first (which would immediately
violate the trigger checking) then turn on the new one OR turn on the new
one (which violates the trigger checking) before turning off the existing
one. Overall this strategy, if it works, would still be messy and
inefficient as the trigger needs to go through many rows in the table to
maintain it.
The second way is to do add a ranking column in decimal type or int type,
the lowest number would mean that it's the default, this would definitely
work but retrieving the default one would require reading and sorting many
rows.
The third way is to add another table
table ClientDefaultAvailableService
(
ClientId
ServiceTypeId
primarykey( ClientId, ServiceTypeId )
)
This would require joining to find out which one is default even when I need
to retrieve the entire list of ClientAvailableService for a particular
client.
Would someone out there have or know a better way to implement this or just
have some comment about this issue? I assume experienced db designers must
have run into this issue before. Thank you very much."Zeng" <Zeng5000@.hotmail.com> wrote in message
news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> Hi,
> Please see the three table below, I'm designing the db schema so that it
> maintains each Client has exactly 1 default service.
> table ServiceType
> (
> ServiceTypeId,
> ...
> ...
> )
> table Client
> (
> ClientId,
> ...
> ...
> )
> table ClientAvailableService
> (
> ClientId foreign key
> ServiceTypeId foreign key
> IsDefaultService bit
> ...
> ...
> )
>
...
>
> Would someone out there have or know a better way to implement this or
> just
> have some comment about this issue? I assume experienced db designers
> must
> have run into this issue before. Thank you very much.
>
>
The correct way to model this is to add a DefaultService column to Client.
There's really two distinct relationships beteen client and ServiceType.
Each client has one to many ServiceTypes, and Each client has at most one
Default ServiceType.
If you model both relationships then there's no need for triggers. You do
have to do some things in a certian order. For instance you have to add a
Client, then Add services then choose one of those services as the Default
service. But those operational constraints make good sense in the model.
create table ServiceType
(
ServiceTypeID int primary key
)
create table Client
(
ClientID int primary key,
DefaultService int null references ServiceType
)
create table ClientServiceType
(
ClientID int references Client on delete cascade,
ServiceType int references ServiceType,
constraint pk_ClientServiceType
primary key(ClientID,ServiceType)
)
--and this guarantees that the client's DefaultService
--is in that client's list of ServiceTypes.
alter table client
add constraint fk_DefaultServise
foreign key (ClientID,DefaultService) references ClientServiceType
insert into ServiceType values (1)
insert into ServiceType values (2)
insert into ServiceType values (3)
insert into Client(ClientID) values (1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,1)
insert into ClientServiceType(ClientID,ServiceType)
values(1,2)
update client set DefaultService = 2 where ClientID =1
David|||David, thank you so much for your help!
"David Browne" <davidbaxterbrowne no potted meat@.hotmail.com> wrote in
message news:O29kge8kFHA.3436@.tk2msftngp13.phx.gbl...
> "Zeng" <Zeng5000@.hotmail.com> wrote in message
> news:uuFBoM8kFHA.2484@.TK2MSFTNGP15.phx.gbl...
> ...
> The correct way to model this is to add a DefaultService column to Client.
> There's really two distinct relationships beteen client and ServiceType.
> Each client has one to many ServiceTypes, and Each client has at most one
> Default ServiceType.
> If you model both relationships then there's no need for triggers. You do
> have to do some things in a certian order. For instance you have to add a
> Client, then Add services then choose one of those services as the Default
> service. But those operational constraints make good sense in the model.
>
> create table ServiceType
> (
> ServiceTypeID int primary key
> )
> create table Client
> (
> ClientID int primary key,
> DefaultService int null references ServiceType
> )
> create table ClientServiceType
> (
> ClientID int references Client on delete cascade,
> ServiceType int references ServiceType,
> constraint pk_ClientServiceType
> primary key(ClientID,ServiceType)
> )
> --and this guarantees that the client's DefaultService
> --is in that client's list of ServiceTypes.
> alter table client
> add constraint fk_DefaultServise
> foreign key (ClientID,DefaultService) references ClientServiceType
>
> insert into ServiceType values (1)
> insert into ServiceType values (2)
> insert into ServiceType values (3)
> insert into Client(ClientID) values (1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,1)
> insert into ClientServiceType(ClientID,ServiceType)
> values(1,2)
> update client set DefaultService = 2 where ClientID =1
> David
>|||On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
(snip)
>The second way is to do add a ranking column in decimal type or int type,
>the lowest number would mean that it's the default, this would definitely
>work but retrieving the default one would require reading and sorting many
>rows.
Hi Zeng,
Though David's suggestion is very good, a ranking column is a very good
idea as well. Finding the default would actually be fast if there is an
index (and there will be one, since the combination of ClientID / Rank
should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
index is automatically added.
Best, Hugo
--
(Remove _NO_ and _SPAM_ to get my e-mail address)|||Thanks!
"Hugo Kornelis" <hugo@.pe_NO_rFact.in_SPAM_fo> wrote in message
news:espie1djg0dtjsf5c5chp8u3bca6frl3oa@.
4ax.com...
> On Thu, 28 Jul 2005 15:19:59 -0700, Zeng wrote:
> (snip)
many[vbcol=seagreen]
> Hi Zeng,
> Though David's suggestion is very good, a ranking column is a very good
> idea as well. Finding the default would actually be fast if there is an
> index (and there will be one, since the combination of ClientID / Rank
> should be declared as either PRIMARY KEY or UNIQUE, and in both cases an
> index is automatically added.
> Best, Hugo
> --
> (Remove _NO_ and _SPAM_ to get my e-mail address)

Design opinion?

Hi
I am starting the task of creating a data warehouse for my company's OLTP
system and we would like to offload reporting to a reporting schema at a
data warehouse. We have 2 kinds of reports: operational (used daily and
need data just about real time) and analytical (used to determine sales
trends, etc).
For the analytical reports, I think we can just do a nightly ETL load since
the data doesn't need to be too recent. However, I was thinking of using
replication to replicate the table data used in the operational reports,
capturing the data in stored procedures as they come across the pipe and
then performing ETL within the stored procs. That way I think we can have
real time data available for reporting.
Does anyone have an opinion on this scheme?
Thank youI've done this kind of configuration before on SQL 2K. Transactional
replication is about the best you can do since log shipping takes the DB
offline. We preferred to pause replication during the staging portion of
the ETL cycle to avoid blocking and dead locking. This also gave us a
consistent point in time snap shot of the source to work from. Using TSQL
as an ETL tool is fast but has limitations around data transformation and no
access to the bulk load api.
Good luck,
Danny
"Dodo Lurker" <none@.noemailplease> wrote in message
news:BIKdneofYum6-2reRVn-pQ@.comcast.com...
> Hi
> I am starting the task of creating a data warehouse for my company's OLTP
> system and we would like to offload reporting to a reporting schema at a
> data warehouse. We have 2 kinds of reports: operational (used daily and
> need data just about real time) and analytical (used to determine sales
> trends, etc).
> For the analytical reports, I think we can just do a nightly ETL load
> since
> the data doesn't need to be too recent. However, I was thinking of
> using
> replication to replicate the table data used in the operational reports,
> capturing the data in stored procedures as they come across the pipe and
> then performing ETL within the stored procs. That way I think we can have
> real time data available for reporting.
> Does anyone have an opinion on this scheme?
> Thank you
>|||Hello,
Timing is everything.
I would work on getting your data warehouse overnight processes
completed.
In a short period of time new replication options will be available in
SQL Server 2005.
Even the existing options may meet your needs, but with the right
timing you may
have better options available to you.|||Depending on what method you use to migrate data to the warehouse (and also
the volitility of the OLTP data), this may impact the performance of the
OLTP system. Also, once the user expects analysis in real time, any
interruption will be perceived as a system malfunction, and thus your
maintenance requirements and responsilbility have increased. Do they really
need real time analysis? Also, what type of analysis; OLAP?
"Dodo Lurker" <none@.noemailplease> wrote in message
news:BIKdneofYum6-2reRVn-pQ@.comcast.com...
> Hi
> I am starting the task of creating a data warehouse for my company's OLTP
> system and we would like to offload reporting to a reporting schema at a
> data warehouse. We have 2 kinds of reports: operational (used daily and
> need data just about real time) and analytical (used to determine sales
> trends, etc).
> For the analytical reports, I think we can just do a nightly ETL load
> since
> the data doesn't need to be too recent. However, I was thinking of
> using
> replication to replicate the table data used in the operational reports,
> capturing the data in stored procedures as they come across the pipe and
> then performing ETL within the stored procs. That way I think we can have
> real time data available for reporting.
> Does anyone have an opinion on this scheme?
> Thank you
>

Design opinion?

Hi
I am starting the task of creating a data warehouse for my company's OLTP
system and we would like to offload reporting to a reporting schema at a
data warehouse. We have 2 kinds of reports: operational (used daily and
need data just about real time) and analytical (used to determine sales
trends, etc).
For the analytical reports, I think we can just do a nightly ETL load since
the data doesn't need to be too recent. However, I was thinking of using
replication to replicate the table data used in the operational reports,
capturing the data in stored procedures as they come across the pipe and
then performing ETL within the stored procs. That way I think we can have
real time data available for reporting.
Does anyone have an opinion on this scheme?
Thank you
I've done this kind of configuration before on SQL 2K. Transactional
replication is about the best you can do since log shipping takes the DB
offline. We preferred to pause replication during the staging portion of
the ETL cycle to avoid blocking and dead locking. This also gave us a
consistent point in time snap shot of the source to work from. Using TSQL
as an ETL tool is fast but has limitations around data transformation and no
access to the bulk load api.
Good luck,
Danny
"Dodo Lurker" <none@.noemailplease> wrote in message
news:BIKdneofYum6-2reRVn-pQ@.comcast.com...
> Hi
> I am starting the task of creating a data warehouse for my company's OLTP
> system and we would like to offload reporting to a reporting schema at a
> data warehouse. We have 2 kinds of reports: operational (used daily and
> need data just about real time) and analytical (used to determine sales
> trends, etc).
> For the analytical reports, I think we can just do a nightly ETL load
> since
> the data doesn't need to be too recent. However, I was thinking of
> using
> replication to replicate the table data used in the operational reports,
> capturing the data in stored procedures as they come across the pipe and
> then performing ETL within the stored procs. That way I think we can have
> real time data available for reporting.
> Does anyone have an opinion on this scheme?
> Thank you
>
|||Hello,
Timing is everything.
I would work on getting your data warehouse overnight processes
completed.
In a short period of time new replication options will be available in
SQL Server 2005.
Even the existing options may meet your needs, but with the right
timing you may
have better options available to you.
|||Depending on what method you use to migrate data to the warehouse (and also
the volitility of the OLTP data), this may impact the performance of the
OLTP system. Also, once the user expects analysis in real time, any
interruption will be perceived as a system malfunction, and thus your
maintenance requirements and responsilbility have increased. Do they really
need real time analysis? Also, what type of analysis; OLAP?
"Dodo Lurker" <none@.noemailplease> wrote in message
news:BIKdneofYum6-2reRVn-pQ@.comcast.com...
> Hi
> I am starting the task of creating a data warehouse for my company's OLTP
> system and we would like to offload reporting to a reporting schema at a
> data warehouse. We have 2 kinds of reports: operational (used daily and
> need data just about real time) and analytical (used to determine sales
> trends, etc).
> For the analytical reports, I think we can just do a nightly ETL load
> since
> the data doesn't need to be too recent. However, I was thinking of
> using
> replication to replicate the table data used in the operational reports,
> capturing the data in stored procedures as they come across the pipe and
> then performing ETL within the stored procs. That way I think we can have
> real time data available for reporting.
> Does anyone have an opinion on this scheme?
> Thank you
>

Design of Scheduled Task data schema

I am designing a table structure for scheduled tasks. Tasks must be
scheduled either on a fixed interval or at specific times. Those specific
times can be either every day or specific days of every w.
I have come up with the plan I'll paste in here, but the NULL's and one
field telling how to interpret data from another table leave me suspecting
there's a better way. I'm designing this from scratch so am free to change
as needed.
Suggestions?
Use Pubs
Create Table SchedTask(
TaskCode nvarchar (6) not null Primary Key,
TaskDescription nvarchar (30) ,
ScheduleType char(1) not null Check ( ScheduleType in ('I', 'D',
'W')), --for Interval, Daily, Wly
ScheduleInterval SmallInt NULL
)
go
Create Table SchedDay(
TaskCode nvarchar (6) Foreign Key References SchedTask (TaskCode),
DayOfW TinyInt not null Check ( DayOfW Between 0 and 7), --0
indicates it doesn't matter, 1-7 map to Sunday-Saturday
TimeOfDay SmallDateTime not null
)
go
--Add a task to repeat every 10 minutes
Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
ScheduleInterval)
Values ('10M', 'Repeat every 10 minutes', 'I', 10)
--Add a task to occur at 9:00 AM every day
Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
ScheduleInterval)
Values ('9D', 'occur at 9:00 AM every day', 'D', NULL)
Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
Values ('9D', 0, '9:00')
--Add a task to occur at 9:00 AM every Monday and 11:45 every Friday
Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
ScheduleInterval)
Values ('2W', 'Do this twice a w', 'W', NULL)
Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
Values ('2W', 2, '9:00')
Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
Values ('2W', 6, '11:45')
--Show the data
Select * From SchedTask
Select * From SchedDay
--Now clean up the DB
drop table SchedDay
go
drop table SchedTask
go
Thanks.
Daniel Wilson
http://Embtrak.com Development TeamSee if you can get some inspiration from the sysjobschedules database in the
msdb database. It's the table that SQL Server Agent uses to schedule jobs
and it is documented in SQL Server Books Online.
Jacco Schalkwijk
SQL Server MVP
"Daniel Wilson" <d.wilson@.Embtrak.com> wrote in message
news:edmiKCcyFHA.1252@.TK2MSFTNGP09.phx.gbl...
>I am designing a table structure for scheduled tasks. Tasks must be
> scheduled either on a fixed interval or at specific times. Those specific
> times can be either every day or specific days of every w.
> I have come up with the plan I'll paste in here, but the NULL's and one
> field telling how to interpret data from another table leave me suspecting
> there's a better way. I'm designing this from scratch so am free to change
> as needed.
> Suggestions?
> Use Pubs
> Create Table SchedTask(
> TaskCode nvarchar (6) not null Primary Key,
> TaskDescription nvarchar (30) ,
> ScheduleType char(1) not null Check ( ScheduleType in ('I', 'D',
> 'W')), --for Interval, Daily, Wly
> ScheduleInterval SmallInt NULL
> )
> go
> Create Table SchedDay(
> TaskCode nvarchar (6) Foreign Key References SchedTask (TaskCode),
> DayOfW TinyInt not null Check ( DayOfW Between 0 and 7), --0
> indicates it doesn't matter, 1-7 map to Sunday-Saturday
> TimeOfDay SmallDateTime not null
> )
> go
> --Add a task to repeat every 10 minutes
> Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
> ScheduleInterval)
> Values ('10M', 'Repeat every 10 minutes', 'I', 10)
> --Add a task to occur at 9:00 AM every day
> Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
> ScheduleInterval)
> Values ('9D', 'occur at 9:00 AM every day', 'D', NULL)
> Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
> Values ('9D', 0, '9:00')
> --Add a task to occur at 9:00 AM every Monday and 11:45 every Friday
> Insert Into SchedTask (TaskCode, TaskDescription, ScheduleType,
> ScheduleInterval)
> Values ('2W', 'Do this twice a w', 'W', NULL)
> Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
> Values ('2W', 2, '9:00')
> Insert Into SchedDay (TaskCode, DayOfW, TimeOfDay)
> Values ('2W', 6, '11:45')
> --Show the data
> Select * From SchedTask
> Select * From SchedDay
> --Now clean up the DB
> drop table SchedDay
> go
> drop table SchedTask
> go
> Thanks.
> Daniel Wilson
> http://Embtrak.com Development Team
>
>|||But whatever you do, don't use ints for date and time values. :-)
A
"Jacco Schalkwijk" <jacco.please.reply@.to.newsgroups.mvps.org.invalid> wrote
in message news:%23gJJ2VcyFHA.2932@.TK2MSFTNGP10.phx.gbl...
> See if you can get some inspiration from the sysjobschedules database in
> the msdb database. It's the table that SQL Server Agent uses to schedule
> jobs and it is documented in SQL Server Books Online.
> --
> Jacco Schalkwijk
> SQL Server MVP
>
> "Daniel Wilson" <d.wilson@.Embtrak.com> wrote in message
> news:edmiKCcyFHA.1252@.TK2MSFTNGP09.phx.gbl...
>|||Thanks. I'll think that over.
And I'll definitely NOT use integers for times!
dwilson
"Jacco Schalkwijk" <jacco.please.reply@.to.newsgroups.mvps.org.invalid> wrote
in message news:%23gJJ2VcyFHA.2932@.TK2MSFTNGP10.phx.gbl...
> See if you can get some inspiration from the sysjobschedules database in
the
> msdb database. It's the table that SQL Server Agent uses to schedule jobs
> and it is documented in SQL Server Books Online.
> --
> Jacco Schalkwijk
> SQL Server MVP
>
> "Daniel Wilson" <d.wilson@.Embtrak.com> wrote in message
> news:edmiKCcyFHA.1252@.TK2MSFTNGP09.phx.gbl...
specific
suspecting
change
>

Tuesday, February 14, 2012

Design datawarehouse schema

Hi,
I need to design a datawarehouse for the manufacturing industry. I'm
sure that this has been done a dozen times, and I was wondering if
there are workgroups where I can find a sample schema, reports and
others?
Looking forward to your reply,
DirkIs there such a thing as a 'standard datawarehouse' for ANY industry'
Seems to me that too much is variable, unless you are only storing industry
standard EDI stuff like Purchase Orders, Healthcare Claims, etc.
TheSQLGuru
President
Indicium Resources, Inc.
"locusta74ster@.gmail.com" <locusta74@.gmail.com> wrote in message
news:1174924469.776834.285710@.d57g2000hsg.googlegroups.com...
> Hi,
> I need to design a datawarehouse for the manufacturing industry. I'm
> sure that this has been done a dozen times, and I was wondering if
> there are workgroups where I can find a sample schema, reports and
> others?
> Looking forward to your reply,
> Dirk
>|||On Mar 27, 4:40 am, "TheSQLGuru" <kgbo...@.earthlink.net> wrote:
> Is there such a thing as a 'standard datawarehouse' for ANY industry'
> Seems to me that too much is variable, unless you are only storing industr
y
> standard EDI stuff like Purchase Orders, Healthcare Claims, etc.
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "locusta74s...@.gmail.com" <locust...@.gmail.com> wrote in message
> news:1174924469.776834.285710@.d57g2000hsg.googlegroups.com...
>
>
>
>
> - Show quoted text -
Hi SQL Guru,
well, yes, there is....over the last few years we have developed
models across a number of industries and they are quite
standardised...we have also developed applications on top of
them...today this all sits on MSFT only.....
Currently they sit at 500 tables and 7,000 fields. We use them as the
base for when we start a project. They are still in 'early adopter'
phase but we will be bringing them to market soon enough...hence
commenting on them here...
What we have essentially invented is a way of being able to customise
analytical applications, from front to back, at the lowest possible
cost....and 'to back' I mean riht back to the extraction out of the
source system.
Of course, there are few good models out there for 'free'. And no-one
made any money out of just selling models which is why more do not
exist...the three companies that sell models, IBM, NCR and Sybase
have made 'rounding errors' on model revenues.....so IBM and NCR now
insist that if you want the model you have to buy the rest as well...
This is a reaction to the fact that the model is the 'heart and soul'
of the EDW and is the majority of the IP...and companies only wanted
to buy the model for peanuts and not take the rest of the
'solution'....and in the end, no-one made any money out of them. We
are only selling our models separately from the rest of the work we
have done in very specific cases. Mostly, we would like our clients to
buy the ETL and the apps as well as the models... :-)
Best Regards
Peter
www.peternolan.com

Design datawarehouse schema

Hi,
I need to design a datawarehouse for the manufacturing industry. I'm
sure that this has been done a dozen times, and I was wondering if
there are workgroups where I can find a sample schema, reports and
others?
Looking forward to your reply,
Dirk
Is there such a thing as a 'standard datawarehouse' for ANY industry?
Seems to me that too much is variable, unless you are only storing industry
standard EDI stuff like Purchase Orders, Healthcare Claims, etc.
TheSQLGuru
President
Indicium Resources, Inc.
"locusta74ster@.gmail.com" <locusta74@.gmail.com> wrote in message
news:1174924469.776834.285710@.d57g2000hsg.googlegr oups.com...
> Hi,
> I need to design a datawarehouse for the manufacturing industry. I'm
> sure that this has been done a dozen times, and I was wondering if
> there are workgroups where I can find a sample schema, reports and
> others?
> Looking forward to your reply,
> Dirk
>
|||On Mar 27, 4:40 am, "TheSQLGuru" <kgbo...@.earthlink.net> wrote:
> Is there such a thing as a 'standard datawarehouse' for ANY industry?
> Seems to me that too much is variable, unless you are only storing industry
> standard EDI stuff like Purchase Orders, Healthcare Claims, etc.
> --
> TheSQLGuru
> President
> Indicium Resources, Inc.
> "locusta74s...@.gmail.com" <locust...@.gmail.com> wrote in message
> news:1174924469.776834.285710@.d57g2000hsg.googlegr oups.com...
>
>
>
> - Show quoted text -
Hi SQL Guru,
well, yes, there is....over the last few years we have developed
models across a number of industries and they are quite
standardised...we have also developed applications on top of
them...today this all sits on MSFT only.....
Currently they sit at 500 tables and 7,000 fields. We use them as the
base for when we start a project. They are still in 'early adopter'
phase but we will be bringing them to market soon enough...hence
commenting on them here...
What we have essentially invented is a way of being able to customise
analytical applications, from front to back, at the lowest possible
cost....and 'to back' I mean riht back to the extraction out of the
source system.
Of course, there are few good models out there for 'free'. And no-one
made any money out of just selling models which is why more do not
exist...the three companies that sell models, IBM, NCR and Sybase
have made 'rounding errors' on model revenues.....so IBM and NCR now
insist that if you want the model you have to buy the rest as well...
This is a reaction to the fact that the model is the 'heart and soul'
of the EDW and is the majority of the IP...and companies only wanted
to buy the model for peanuts and not take the rest of the
'solution'....and in the end, no-one made any money out of them. We
are only selling our models separately from the rest of the work we
have done in very specific cases. Mostly, we would like our clients to
buy the ETL and the apps as well as the models... :-)
Best Regards
Peter
www.peternolan.com

Design advice please...

Without a more thorough understanding of your database schema and the
business model you're trying to implement, I can only offer suggestions.
First I think using NULL to handle these values can make sense. However in
this case, I expect that I would create a Member table that had a MemberID
and MemberType columns. I would also create StudentDetails, AffiliateDetails
and RegularDetails tables and cross-reference these on the MemberID PK/FK.
This way you could hold common columns in the Member table and type-specific
information in the FK tables.
hth
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
"Matt" <awpilotnospam@.aol.com> wrote in message
news:jz6yc.66258$3x.20120@.attbi_s54...
> I'm developing a database for someone that will involve user accounts.
The
> accounts will hold certain data. There are different types of users: e.g.
> regular members, affiliate members, and student members. The problem is
> that not all the accounts will hold the same type of data. The difference
> between a regular member and a student member is that the regular member
has
> a doctoral degree, a date at which that degree was received, and some
other
> information; the student member has info about their current institution,
> degree type, and a faculty sponsor name.
> I think it would be unwise to put all these fields into one table and then
> leave null whichever fields are unneeded.
> Would it be good practice to have a users table with user name, password,
> email, and member type (regular, student, etc.) and then have separate
> tables for each type of member? For example, separate tables for regular
> members, student members, affiliate members, etc.
> Thanks.
> Matt
>
> Without a more thorough understanding of your database schema and the
> business model you're trying to implement, I can only offer suggestions.
> First I think using NULL to handle these values can make sense. However in
> this case, I expect that I would create a Member table that had a MemberID
> and MemberType columns. I would also create StudentDetails,
AffiliateDetails
> and RegularDetails tables and cross-reference these on the MemberID PK/FK.
> This way you could hold common columns in the Member table and
type-specific
> information in the FK tables.
Thanks!
Matt

Deserialize xml data from sql 2005

We're having a hard time with SQL 05's xml.
In a nutshell, I have a schema. I have used .Net 2.0's xsd tool to create classes from this schema.

This schema is also tied to a XML column in my database server. So far so good.

However, when I go to deserialize the xml data stored in the database server I'm getting inconsistent results. Namely the differences in the following queries:
SELECT LoanXML.query('declare namespace MISMO="http://mrgdev.local/mismo/";//MISMO:LOAN')
FROM Loans
WHERE Loans.InternalID = @.LoanID

returns something completely different from:
SELECT LoanXML from Loans where InternalID = @.LoanID

The difference is that the first query throws xsi:nil=true into EVERY SINGLE NODE. Whereas the second one doesn't.

When we go to deserialize, then the nodes aren't converting into objects even though they are properly filled out...

I'd just chuck the .query syntax, except for a little problem where I'm trying to return just the node(s) I want instead of the whole thing.

Is there anyone else using xml deserialization with SQL 05?<xs:schema

xmlns:xs="http://www.w3.org/2001/XMLSchema"

xmlns="http://mrgdev.local/mismo/"

targetNamespace="http://mrgdev.local/mismo/"

elementFormDefault="qualified">
<xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation=".\xml.xsd"/>
<xs:element name="LOAN" nillable="false">
<xs:complexType>
<xs:sequence>
<xs:element name="_APPLICATION" nillable="true" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element name="_DATA_INFORMATION" nillable="true" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element ref="DATA_VERSION"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>

<xs:element name="DATA_VERSION" nillable="true">
<xs:complexType>
<xs:attribute name="_Name" type="xs:string" use="required"/>
<xs:attribute name="_Number" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>|||CREATE XML SCHEMA COLLECTION [dbo].[ChrisSchema]
AS N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://mrgdev.local/test/"
xmlns:dv="http://mrgdev.local/DV/"
targetNamespace="http://mrgdev.local/test/" elementFormDefault="qualified">
<xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation=".\xml.xsd"/>
<xs:element name="LOAN" nillable="false">
<xs:complexType>
<xs:sequence>
<xs:element name="_APPLICATION" nillable="true" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element name="_DATA_INFORMATION" nillable="true" minOccurs="0">
<xs:complexType>
<xs:sequence>
<xs:element ref="DATA_VERSION"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="DATA_VERSION" nillable="true">
<xs:complexType>
<xs:attribute name="_Name" type="xs:string" use="required"/>
<xs:attribute name="_Number" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>
'|||The following creates a table and inserts a row with xml data into it.

CREATE TABLE [dbo].[MyTest](
[id] [uniqueidentifier] NOT NULL CONSTRAINT [DF_mytest_id] DEFAULT (newid()),
[test] [xml](CONTENT [dbo].[ChrisSchema]) NULL,
CONSTRAINT [PK_chris] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]

insert into MyTest(test)
values('<LOAN xmlns="http://mrgdev.local/test/"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://mrgdev.local/test/">
<_APPLICATION>
<_DATA_INFORMATION>
<DATA_VERSION _Name="DVTESt" _Number="123123"/>
</_DATA_INFORMATION>
</_APPLICATION>
</LOAN>')|||Now, execute

SELECT test.query('declare namespace test="http://mrgdev.local/test/";//test:LOAN')
FROM MyTest

You're result will be:

<LOAN xmlns="http://mrgdev.local/test/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://mrgdev.local/test/">
<_APPLICATION>
<_DATA_INFORMATION>
<DATA_VERSION _Name="DVTESt" _Number="123123" xsi:nil="true" />
</_DATA_INFORMATION>
</_APPLICATION>
</LOAN>

Notice the xsi:nil=true... First off, that's wrong. It isn't nil. Second, this will not deserialize. WTF?|||If I just run

select test from MyTest

then the data looks like:

<LOAN xmlns="http://mrgdev.local/test/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://mrgdev.local/test/">
<_APPLICATION>
<_DATA_INFORMATION>
<DATA_VERSION _Name="DVTESt" _Number="123123" />
</_DATA_INFORMATION>
</_APPLICATION>
</LOAN>

In this case the data_version node correctly comes back WITHOUT the xsi:nil="true"
In addition, if we UNTYPE the xml column, then it will also come back without the xsi:nil="true"garbage; but that's not a real option.

How do we get the typed version to not do this? I even tried placing the definition for DATA_VERSION in a different schema, but this had zero effect on our output.|||After all of our testing, it appears that if an element only has attributes then it reports the element as xsi:nil="true" which in our case is wrong, because our elements only have attributes.

Is there some way to get around this?