One of the things I've gotten out of one of my business parters in Vsched (that online employee scheduling software I'm working on), is a bunch of best practices relating to managing lots of data. One of these is using start and end dates on critical data, rather than actually deleting this data.
What my partner suggests doing is to use
end_date fields containing effective-as-of and effective-until timestamps. Then, if you need to delete this data, you
end_date field with the current timestamp instead of using a DELETE query.
It does make queries over this data more complex: you need to add another
WHERE predicate (e.g.
start_date >= NOW() AND (end_date IS NULL OR end_date < NOW())). So that might be an issue, but hopefully you're looking at best practices before you implement, and you can always worry about perf later, right ;)
This will provide your system a better audit trail. You want to know who did what to your customer data and when they did it. If you rely on the method described in my 5Ws of database design post, you'll lose that trail when you do your
DELETE. But if you just
UPDATE that row appropriately, you're safe.
This will also allow an undo feature for those nasty delete features users claim to want. I say "claim", because my users claim to want to delete their data. But I know as soon as I add the feature, I'll get a bunch of them accidentally deleting data. And what will they do when they delete it? They'll curse their bad luck. Then they'll email my tech support list. Then they'll send me an email. Then they'll call me. Right when I'm about to sit down with friends and a drink. So much for that evening. But wouldn't it be better if they could undelete their own data? Yes. It would be better.
Well, as a matter of fact, I've got such a delete feature. But I didn't implement it with
DELETE. I used the above
end_date scheme. And it works great. Except I forgot the front end :). So when that accidental delete happened, I still got the call. But I could check the
last_updated_by fields. Long story short, I found those 300 deleted rows. And with a single
UPDATE query we were back in business.