I’m not sure whether this ought to be in the Architecture or
SQL Analysis Services forum.I am after
some initial advice about a suitable structure for a data warehouse which will
be used to build OLAP cubes AS2005.
I work in a county education department.Each year we create/update a set of reports
and procedures which provide schools with online access to many of the
statistics for their school compared to district wide and to national targets,
etc.
After this year’s, which is my second at creating these
report/data, I can’t help but think that the whole thing is screaming out for a
Data Warehouse and Cube data to report on.
I may be wrong on that, as I’m still only doing the learning regarding
AS2005 and haven’t touched a previous version of it.I’m currently working through this book:
Microsoft SQL Server 2005 Analysis Services Step by Step - ISBN
0735621993.But I’m willing to bet that
this should be the way forward.Not only
for the actual schools viewing the data, but also for all those staff who work
at the county level and have to inspect/report on these schools and the
county’s performance.The only thing is
that most examples are stated within companies which have sales as a primary
measure, however I’m not sure education grades, targets, etc fall into the same
scheme.So I can’t use an example
structure to help get me started.
It isn’t something that I’ve got management backing for yet,
as it’s still pretty much just my own and a colleagues’ thoughts at the
moment.Also until I know more about it
myself then I wouldn’t feel comfortable taking my case to management.I feel I will be better to create a simple
version to demonstrate and show the benefit and power of such a system, to
those who will give the backing to the project.
The main central application, which is used in the county,
contains most of the information needed.
And several other systems also contain additional data which would be
useful.All based around a
Unique_Pupil_ID.
Current transactional database
The main transaction database I would be taking data from
has a main student table, then it has a student_results table.The student table also has many lookup type
tables related to ethnic origins, deprivation codes relating to income/student
address, etc.The kind of things that
would be useful when doing analysis on students.As well as at a more broad level, being able
to compare schools to each other at various subjects/overall/etc.
So the student table contains the information about the
student, but not results data.
The results table data is similar to below, but obviously
with more to it than just these fields:
Student_ID
Subject_ID
Grade
GradePointsEquivalent
ExamYear
Then there is obviously a lookup for the subject name.
There are also things like average/points/score per student/year
which aren’t a subject score, but more of a calculated field.So where ought that data to be located table-wise?
There would be a lot more to a total DW solution for this
environment, but I wanted to start with the basics so I have somethign to demonstrate.
Can anyone offer my any advice regarding this? and a possible design structure for the warehouse with regards to which ought to be fact or dimension tables.
Or do you think I'm barking up completely the wrong tree thinking that this would be a good solution?Thanks for taking the time to read this less than short post.Kind regards,
David
Hello. I have searched on the subject "education data warehouse" and found some suppliers and solutions in this area.
You design is a good start. A student table, with facts about the student, and a fact table with each students results, by each test or whatever. You will also need a tiem dimension, a test dimension(if these tests are general). Perhaps a school and a teacher dimension can enhance the analytic value of your solution. Students change schools and teachers and teachers change schools.
In the fact table you will normally only put base measures that can be used for calculations. You do the calculations in a cube.
Your problem have a lot in common with analysis when you try to see customer profiles and their buying patters, so you can have some help with sales examples.
HTH
Thomas Ivarsson
|||Thanks for your reply Thomas.This was my initial thought on the design (although there are a fair few columns that I will not need in the final version, which are included in that diagram at the moment). See link below.
>>Structure<<
What do you think? Anything that jumps out at you regarding it?|||
Hello. The structure is fine for a data warehouse. If you build a datamart on top of this, by using views or new tables I recommend to reduce the number of joins in your snowflake.
Have a look here for more design tips(www.kimballgroup.com)
Regards
Thomas Ivarsson
|||I have made them into more of a star schema that a Snowflake and that has certainly helped.However, I have a new issue. My understanding of OLAP & AS2005 is still growing, so it's possible I've got the wrong idea here, but...
The student grades are not always numeric, in many of the subjects they are graded A-G etc. They all have points equivalent, but it would be good to look at the grades too.
Common requests about the data are how many students achieved 5 A-C grades or 5 A-G etc. And other bits of analysis that are done around grade.
So some of the data needs to be analysed as stated here and some by numeric points. Can anyone help me as to how I could structure this part of it or achieve such a result?
I'm at a bit of a loss as to how.
Thanks,
David|||
You can used named members/sets for A-C and A-G type grades. For numeric data, you can either create a band dimension or use discretization methods in SSAS. You can search for these techniques in Books online.
One other point in your structure, for languages, you can create one physical dimension and use 'role playing dimensions'.
|||Hello again. Build a separate dimension table for the grades(A-G), make a foreign key to the fact table and add a measure, called NumberOfGrades to the fact tables.
In this way you should be able to use sum as the aggregation method. This measure can then tell you how many (A-G) grades a student, teacher, school (and so on) that exist in the fact table.
HTH
Thomas Ivarsson
No comments:
Post a Comment