2012年3月9日星期五

Fix mistake in fact table

Following is the business problem:

- say there are invoices coming from customers
- those invoices are assigned to accounts
- invoices are assigned to wrong accounts and this is not discovered for months
- users correct the mistakes and the way the source system is built - all invoices are reassigned to the NEW account - NO CHANGE DATE though

So, say they found that the account is wrong and fix mistake that updates invoice records all the way to 8 month ago.

What is the right thing to do?

Our DW already contains all the invoices assigned to wrong account. Create another record and time stamp it? Then add "as of" parameter to reports?

Also, there are aggregate tables created for history - those need to be recalculated.
I'm looking for best practice here, rather than details.

Thanks!According to Ralph Kimball there are two methods to handle correcting facts, physical and logical deletes.
With the physical delete the incorrect record is deleted and the new correct record is added. Any downstream application would then need to recalculated. With the logical delete, you add a flag field to all records, insert the correct records, and mark the incorrect flags as such (ie Deleted = 1 for deleted records, Deleted = 0 for correct records). You may also want to add a DeletedReason field to track the error reasons. Again any downstream application would then need to be modified to look only for correct records.
The logical delete method is the "safe" method, because the DW still contains all pertinent history. The choice between the two is really a function of whether the history needs to be maintained vs the effort to implement the Deleted flag in all downstream apps.
Another choice (although may not have one) is the mechanism to process deleted/corrected facts. You can have either a manual or automated process -- the automated process is only viable if your source system has some type of audit/journal/logging mechanism. Even if your system contains a record of the change, depending on the frequency of the errors you may not want an automated process.
In a perfect world, I would implement a logical delete with a manual correction methodology. If errors occur enough to warrant an automated process, perhaps you ought to focus on the process in the source system causing the errors. Even if I did implement an automated error correction method, I would certainly implement some kind of error correction KPI. That way management can track the number of errors and work to impement better controls.
Although I probably didn't answer your question, I hope I've given you some food for thought.
Larry
|||I think you did. Thank you.

I've been playing with similar ideas. I wanted to see what other people do..so keep posting everyone!

没有评论:

发表评论