Sunday, February 19, 2012

design question

I have a Customer dimension, that must be splitted in 3 dimensions: regular customers, special customers, and customers. This dimension is splited because the regular customers have other types of attributes then special customers. The common attributes are grouped in a customer dimension.

Also i have 2 fact tables: one contains new and closed customers, and the other one contains the new and closed customer accounts.

So, the scheme looks like that?

Regular customers->

Customers --> CustomersFactTable

Special customers > >AccountsFactTable

I want to know if this is the right design for an Analysis services 2005 cube?

If this is ok, i want to know how to handle the referential integrity between Regular Customer, Special Customers and the 2 fact tables? I tried to solve this with an unknown member, but even if the unknown member is invisible it's still rolled up to the total, which we don't want because the referential integrity is not a pure one, it's a build one.

Thank you

Assuming that Customers is a "superclass" table, with Regular and Special as "subclass" tables linked via the Customer key, you could try using the AS 2005 "many-many" dimension model as follows:

Customer dimension is configured with a regular relation to both Accounts and Customers measure groups. Regular and Special Customer measure groups are created for those tables, each with its own "row count" measure. Regular and Special Customer dimensions should now have a "fact" relation to their respective measure groups. Customer dimension has a regular relation to Regular and Special Customer measure groups via the Customer key. Regluar and Special dimensions are configured as "many-many" for both Accounts and Customers measure groups.|||

Thank you for your answer Deepak.

I test it your solution, and it solves the referential integrity problem, but it doesn't solve the grand total problem. If i select an attribute from Regular Customer dimension, and a measure from CustomerFactTable, the grand total is composed from the regular customers filtered by the attribute selected + the special customers that are in CustomerFactTable; which is wrong.

Can you tell me how i can solve this problem?

Thank you

|||

Project REAL had a similar question with regards to their Vendor dimension. You can read about their decision making process here:

http://www.microsoft.com/technet/prodtechnol/sql/2005/realastd.mspx

Search for "How vendors are represented" to find that section.

|||

Based on the Project REAL Vendor discussion, one approach would be to add a "Customer Type" dimension, with members like "Regular", "Special" (and "Other", if there are customers not in either subclass), and an "All" member. This dimension could be related to both the Regular and Special intermediate measure groups via a "Customer Type" named calculation attribute added to each table (= "Regular" for Regular table and = "Special" for Special table). Thus, for example, when "Customer Type" of "Regular" is selected, all special customers should get excluded from the CustomerFactTable.

Of course, this involves 1 extra dimension and user selection - if the user leaves "Customer Type" as "All", the behavior should be the same as before (without "Customer Type").

|||

Thank you Deepak for your answer.

This could be a solution, but i don't want that users select between regular or special customers. Of course i could create 3 types of calculated members: one for regular customers, one for special customers and one for the customers, and this way the user don't have to select between regular or special customers.

I have a last question: It is a good practice if i split the facts based on the regular and special customers? The design will look like this:

Regular Customers ->RegularCustomersFactTable

Customers ->RegularCustomerAccountsFactTable

->RegularCustomerTrransactionsFactTable

Special Customers ->SpecialCustomersFactTable

Customers ->SpecialCustomerAccountsFactTable

->SpecialCustomerTrransactionsFactTable

The facts split will be done in the DataSourceView through named query. This way i don't have the problem with the grand total and the overhead of the many to many dimension relationships.

In the datawarehouse i will keep the initial design ( based on this article of ralph kimball http://www.intelligententerprise.com/010629/warehouse1_1.jhtml ):

Regular Customer >

Customers -->CustomersFactTable

Special Customer > -->AccountsFactTable

-->TransactionsFactTable

So, Is this approach a good one?

Thank you

|||This approach seems fine, depending on what are the most important analytical scenarios. This makes it easier to analyze regular and special customers separately; but more difficult to analyze all customers, as a whole.

No comments:

Post a Comment