I'm trying to normalize a table but I'm not sure what the proper way to
do it is:
Imagine you have these two tables:
Commodities
===========
CommodityID (primary key)
Description
UnitWeight
UnitValue
UnitOfMeasure
CountryOfManufacture
HTSCode
ExportLicenseRequiredFg
PackageContents
===============
ShipmentID (primary key)
PackageID (primary key)
CommodityID (primary key)
Description
UnitWeight
UnitValue
UnitOfMeasure
Quantity
LineIndex
Notice that Description/UnitWeight/UnitValue/UnitOfMeasure are in both
tables.
Since PackageContents can be thought of as "the current commodity table
values at the time of shipment", how do you normalize these tables since
the values in Commodity can change but you need to keep the values at
the time of shipment for reporting purposes? It seems like a waste of
space to store those columns in two tables.
Thanks,
StevePlease post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, data types, etc. in
your schema are. Sample data is also a good idea, along with clear
specifications. It is very hard to debug code when you do not let us
see it.|||Steve Beach wrote:
> I'm trying to normalize a table but I'm not sure what the proper way to
> do it is:
> Imagine you have these two tables:
> Commodities
> ===========
> CommodityID (primary key)
> Description
> UnitWeight
> UnitValue
> UnitOfMeasure
> CountryOfManufacture
> HTSCode
> ExportLicenseRequiredFg
> PackageContents
> ===============
> ShipmentID (primary key)
> PackageID (primary key)
> CommodityID (primary key)
> Description
> UnitWeight
> UnitValue
> UnitOfMeasure
> Quantity
> LineIndex
>
> Notice that Description/UnitWeight/UnitValue/UnitOfMeasure are in both
> tables.
> Since PackageContents can be thought of as "the current commodity table
> values at the time of shipment", how do you normalize these tables since
> the values in Commodity can change but you need to keep the values at
> the time of shipment for reporting purposes? It seems like a waste of
> space to store those columns in two tables.
> Thanks,
> Steve
For example you might put commodity shipments as transactions in a
stock control table. Movements in as positive values and movements out
as negatives. The present balance is the sum of the transactions. You
record the shipment number for each transaction so you can also derive
the shipment total from the same data.
> It seems like a waste of
> space to store those columns in two tables.
Probably, but the most important reason to eliminate redundancy is to
avoid the risk of inconsistent data and incorrect results. Reducing the
storage requirement is mostly only of secondary importance.
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Celko,
I know you hate "_PK" and whatnot on the constraint names. This is
generated code that I haven't massaged yet. Gimme some slack. :)
Thanks.
create table "PackageContents" (
"ShipmentID" varchar(24) not null,
"PackageID" int not null,
"CommodityID" varchar(40) not null,
"Description" varchar(80) null,
"UnitWeight" decimal(10,2) null,
"UnitValue" decimal(10,2) null,
"UnitOfMesaure" varchar(10) null,
"Quantity" decimal(10,2) null,
"LineIndex" varchar(5) null)
go
alter table "PackageContents"
add constraint "PackageContents_PK" primary key ("PackageID",
"ShipmentID", "CommodityID")
go
create table "Commmodities" (
"CommodityID" varchar(40) not null,
"Description" varchar(80) null,
"UnitWeight" decimal(10,2) null,
"UnitValue" decimal(10,2) null,
"UnitOfMeasure" varchar(10) null,
"CountryOfManufacture" varchar(12) null,
"HTSCode" varchar(10) null,
"ExportLicenseRequiredFg" varchar(1) null)
go
alter table "Commmodities"
add constraint "Commmodities_PK" primary key ("CommodityID")
go
--CELKO-- said the following on 1/23/2006 4:34 PM:
> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, data types, etc. in
> your schema are. Sample data is also a good idea, along with clear
> specifications. It is very hard to debug code when you do not let us
> see it.
>|||> Probably, but the most important reason to eliminate redundancy is to
> avoid the risk of inconsistent data and incorrect results. Reducing the
> storage requirement is mostly only of secondary importance.
I help development warehouse management software. Right now we store
BillTo and ShipTo information for every order in out SalesOrder table.
For the most part, the information is the same for every distinct
CustomerID, but occasionally the shipto address may change for one
order. Storing 99% of the same order information (for the same
customer) on every row has always bugged me.
I'm currently working on this new project which involves storing what
was shipped, and I wanted to try and avoid storing redundant data on
ever row in the table if that data mostly doesn't change.
So if there was a way to not store
Description/UnitWeight/UnitValue/UnitOfMeasure on every PackageContents
row, that would be nice. I was just seeing what other people might do
in this situation. Perhaps it isn't a big deal since it is historical
data at this point. I should just learn to live with the wasted space.
Thanks for the input.
Steve
David Portas said the following on 1/23/2006 4:43 PM:
> Steve Beach wrote:
> For example you might put commodity shipments as transactions in a
> stock control table. Movements in as positive values and movements out
> as negatives. The present balance is the sum of the transactions. You
> record the shipment number for each transaction so you can also derive
> the shipment total from the same data.
>
> Probably, but the most important reason to eliminate redundancy is to
> avoid the risk of inconsistent data and incorrect results. Reducing the
> storage requirement is mostly only of secondary importance.
> --
> David Portas, SQL Server MVP
> Whenever possible please post enough code to reproduce your problem.
> Including CREATE TABLE and INSERT statements usually helps.
> State what version of SQL Server you are using and specify the content
> of any error messages.
> SQL Server Books Online:
> http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
> --
>|||Steve Beach wrote:
> but occasionally the shipto address may change for one
> order. Storing 99% of the same order information (for the same
> customer) on every row has always bugged me.
>
So why not store that information again ONLY in the 1% of cases when it
changes? No reason why you have to do that all in one table. What stops
you putting the key and the changed info for the 1% of cases in its own
table?
David Portas, SQL Server MVP
Whenever possible please post enough code to reproduce your problem.
Including CREATE TABLE and INSERT statements usually helps.
State what version of SQL Server you are using and specify the content
of any error messages.
SQL Server Books Online:
http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
--|||Actually, that is ISO-11179 standards AND **basic data modeling** which
you probably never did. You have no keys, and more nulls than appear
in the **entire** RDBMS of a major autombile company. Gee, why would
common sense plus 35 years of expereince lend me to beleive that you
are screwed up'
For example, why does anyone have a shipment_id thatr can be CHAR(24)'
Don't we usually use numbers and a check digit? Why did you invent
that absurd CommodityID --there is no industry standard for that data
element? Did you look' Duh!
Why aren't some of these thiings in an inventory table instead of being
repeated over an dover is packages? Etc. ?
Most of what you have done is unresearched and/or flat out wrong and
needs to be done over from basics,
In the event that your failure to act on this leads to damages, this
will appear in court as evidence of prior knowledge and must be
disclosed in discovery.|||There is nothing stopping me from doing that. I wondered how other
people handled this situation.
I admit that I'm not strong at DB design but I'm not a rookie. I'm
asking this question because I'd like to learn how I should be doing it
(or ways that it can be done successfully).
Thanks for the input.
David Portas said the following on 1/23/2006 5:31 PM:
> Steve Beach wrote:
> So why not store that information again ONLY in the 1% of cases when it
> changes? No reason why you have to do that all in one table. What stops
> you putting the key and the changed info for the 1% of cases in its own
> table?
> --
> David Portas, SQL Server MVP
> Whenever possible please post enough code to reproduce your problem.
> Including CREATE TABLE and INSERT statements usually helps.
> State what version of SQL Server you are using and specify the content
> of any error messages.
> SQL Server Books Online:
> http://msdn2.microsoft.com/library/ms130214(en-US,SQL.90).aspx
> --
>|||Joe,
I appreciate the time you took to write that. I actually expected
something similar to your response.
Yes, there are nulls. I haven't taken them out yet. As I said, it is
generated code that I hadn't massaged yet. When all is said and done,
they will be no nulls.
shipment_id could be numbers and a check digit, but when humans can
enter their own shipment_id, they don't tend to think that way. That is
why it is alphanumeric. They might want to say "Joe's Shipment" or
something else for manual shipments. But typically it'll be numeric
because that is what the ERP system is.
As for the Commodities table, I chose that because it made more sense
then Items. (This is for a Manifesting System.) I've been researching
FedEx/UPS international shipping and they use Commodities everywhere. I
could have used item_id or item_code or something similar. But
CommodityID made more sense since the table name is Commodities. (For
some reason I like to include the table name in key if the column is an
"ID" column.) Using sku, vin, upc, etc are not available as anything can
be shipped.
Since this is a manifesting system, there isn't an inventory table. The
Commodities table is considered the "item master", but there is no
physical inventory.) That is why weight, cost, description , etc are in
that table. I'm trying to avoid putting item level information in the
package table over and over and over again. But I need to record the
current state of those values at the time of shipment. That is the
whole point of this thread.
I do have some of your books. Specifically, SQL Programming Style and
SQL For Smarties. You probably don't want to hear this, but I named my
table Commodities because I wanted to try and follow your style. I would
have named it "Item" before reading your book. I haven't gotten all the
way through your books yet, but I'm working on it.
Steve
--CELKO-- said the following on 1/23/2006 5:59 PM:
> Actually, that is ISO-11179 standards AND **basic data modeling** which
> you probably never did. You have no keys, and more nulls than appear
> in the **entire** RDBMS of a major autombile company. Gee, why would
> common sense plus 35 years of expereince lend me to beleive that you
> are screwed up'
> For example, why does anyone have a shipment_id thatr can be CHAR(24)'
> Don't we usually use numbers and a check digit? Why did you invent
> that absurd CommodityID --there is no industry standard for that data
> element? Did you look' Duh!
> Why aren't some of these thiings in an inventory table instead of being
> repeated over an dover is packages? Etc. ?
> Most of what you have done is unresearched and/or flat out wrong and
> needs to be done over from basics,
> In the event that your failure to act on this leads to damages, this
> will appear in court as evidence of prior knowledge and must be
> disclosed in discovery.
>
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment