Then came the inevitable weekend of trouble. The set of SQL Server 2005 databases, which provide the data for the reports, failed to load due to trouble with the source legacy systems. However, since the data in the SQL Server databases still contained yesterday's data, the report schedule had no problem completing the scheduled reporting.
Therein lay the problem. I did not want any reports to generate, or worse yet email to recipients, if the recent day of data was not included.
The solution is fairly simple. The SSIS packages that load the SQL Server databases were logging to a table, and included a special status column. When each database was finished loading and had completed any subsequent updates required, a log entry with a status of DONE and UPDATED were entered. I just created a stored procedure to scan the log for databases that were scheduled to be loaded, filtered by UPDATED, and with a log time that fit into the correct date/time range for the current day's data load.
If the stored procedure found any records that did not meet the above criteria, it returned a result reporting accordingly. Then I just added a call to this stored procedure in my primary scheduled reporting stored procedure, where the list of scheduled reports to process is generated. If the data load check reported problems, I prevent the scheduled reports from processing. Then once the problem with the data load is resolved and all data is available, all I have to do it run the SQL Agent job that runs the scheduled reports and this time it will succeed.
I'm always a fan of a little more work in the back end to make a little less work for the routine. Not to mention, you don't have to answer Fred's email asking, "Why did I get two emails of the same report, and which one is right?"
No comments:
Post a Comment