Out of curosity I am asking this design type question because I am curious as to what thoughts you fine folks have on the subject matter.
I have seen many instances of a databse design where there are detail records that have a dollar value associated with them such as an invoice table which holds invoice records. I have noticed that no where is a total of the invoices stored, causing SQL to have to run through the records each time a total is needed.
In days gone by prior to SQL, data structures would be such that a total accumulator field in a data file somewhere would be updated as invoices were processed so that the system would not have to run through the individual records to derive the total of them.
I am familiar with normalization in terms of SQL as I have worked with SQL for years and of course I have work with non-SQL technologies before SQL came onto the scene. What I am curious to know is that is it such a bad thing in the SQL world to have something like an invoice total accumulator that constantly is updated as invoices are processed and queries can access this total accumulator directly rather than having to gather the individual invoice records and total them up first before presenting the total to the user?
If a total field is kept, you have the same data in two places. If you have the same data in two places, they can get out of sync.
Say I have three items at 5, 10 and 20 dollars and a total field of 35 dollars.
Sam, a student intern, edits an item and changes it from 5 to 7 dollars but doesn't recalculate the total field (because he's an intern and doesn't know).
Now the data is corrupt, the total field still has 35 dollars, but the real total is 37 dollars.
|||Actually I wasn't thinking in terms of a field that the user had access to thr the UI. Instead, a table for invoice id and invoice total, stored procedures would handle updating the invoice total in the table when the invoice itself from the invoice table was modified or new invoices added. Then when a total was needed for whatever reason the query could go right to the table and obtain the total rather than crunching the number.|||
It's a good question, but in the scenario you presented you are atucally causing much more harm than good. Here's why...
irasmith:
stored procedures would handle updating the invoice total in the table when the invoice itself from the invoice table was modified or new invoices added.
Now you are running that harsh code every time someone does an insert, or update... whereas if you just let SQL do the mathonly when it is being requested to be seen, it's much better. Also, as was mentioned above, it could get out of sync.
There is a place for storing, or "caching" the calculated value:
A friend of mine has just launched a HUGE project for the County, and in it there are certain bugeting calculations that take about 2 minutes to calculate. In this kind of invironment, caching (by storing the calculated value into a table) would not be so bad :P
That's my 2 cents :P
Peace,
|||I once worked on a project where once a day they ran a stored procedure that pulled data from another database and ran calculations on it, and then stored the information in a format that was easily read into reports, including precalculated totals. Of course the proc took about half an hour to run (don't ask) so there was no way you'd want to wait that long for a single invoice. As Nullable wrote, its pretty situational, but there are definately times you would want to 'cache' the data like this.|||Thanks to all who have posted to this thread. Naturally I am not one to run out and just do something out of the norm, however, as I work more in the design area I am finding that sometimes you have to weigh things and accept a trade off in order to accomplish the main goal. It is good to know others have had similar type issues arise before and while never an easy choice we just have to look at the specific situation and go from there.
No comments:
Post a Comment