This post contains:
- Links To Other Alert Troubleshooting Articles
- Why Scheduled Alerts Mysteriously Stop Working
- What The Immediate Alerts Timer Job Does (Five Stored Procedures)
- How To Find Alerts That Didn't Fire
- The Moral Of The Story
The SharePoint scheduled alert design hasn't changed much from WSS 3.0. This analysis applies to WSS 3.0, MOSS 2007, SharePoint 2010 and I suspect SharePoint 2013 but haven't verified it. All my research and data analysis have lead to four words:
Synchronize your server times.
If your database server and WFEs are not sync'd to the time server, your scheduled alerts will fail. No setting changes, property changes, timer restarts, moving DB processing to another WFE... or whatever will fix your problem. And the next time you do a reboot, time will sync up and things will start working again. Maddening.
Links To Other Alert Troubleshooting Articles
I'm going to start with credit to some of the posts that helped me along before I get into my stuff.
- If Scheduled alerts fail after a migration.
- How to change the Scheduled alert time.
- The humorous truth about the depricated job-daily-alerts property. As well as which timer job runs scheduled alerts (Immediate Alerts), which WFE processes which DB (select * from timerlock), and some other issues I didn't encounter as well as customization.
- Some additional things to check when troubleshooting alerts. (I just discovered the server sync time issue in a footnote at the bottom of this article. I decided to make it the star of the show.)
- Customizing the alert message templates.
- This incredibly detailed post (or two) with troubleshooting steps and database tables.
- And another detailed post here.
- Scheduled alerts are run via the Immediate Alerts timer job.
- The Timerlock table in each database tells you which WFE runs timer jobs for it.
- The property "alerts-enabled" must be set to true for each URL.
- The property "job-immediate-alerts" tells you when the job runs, typically "every 5 minutes between 0 and 59".
- Daily and weekly alert subscriptions are both stored in the SchedSubscriptions table - NotifyFreq column value of 1 = daily, 2 = weekly.
- All activity on the site is stored in the EventCache table.
- When a subscription is set on an EventCache row, an EventLog entry is created the next time the "Immediate Alerts" job runs from the timer.
- Since immediate alerts fire immediately (hence the name) there is no need to store up alerts for a daily or weekly summary.
Why Scheduled Alerts Mysteriously Stop Working
This is where we kept getting lost. There was nothing wrong with any part of the email or event processing architecture. Immediate alerts worked fine, scheduled alerts didn't go out. We needed a deeper understanding of what was really happening.
- Scheduled alert subscriptions require an intersection entity (many to many relationship) and use the EventSubsMatches table to join on the EventCache Event ID and the SchedSubscriptions Subscription ID.
- When the "Immediate Alerts" timer runs and the SchedSubscription NotifyTime has been past by the SQL server current time, all alerts joined to the subscription EventsSubsMatches table are summarized and sent out to the subscriber.
- The processed rows are removed from EventSubsMatches.
- The subscriptions where the NotifyTime has been passed by the WFE time are incremented by one unit (day, week).
If you turn on verbose for the Timer ULS log, you will see that the process is running fine but not finding anything to send out. You watch your logs and you capture:
Begin invoke timer job Immediate Alerts, id {81E07A90-327B-471A-A3A6-7A7672F1D905}, DB {7ABFF5BD-7E23-43FE-B399-F6980AA3E954}It's working! You say to yourself...
AlertsJob loaded 0 of 0 event data recordsWait, I know I subscribed to alerts and I SHOULD have seen some! Why does it say "0" everywhere?
AlertsJob loaded 0 of 0 subscription records
Alertsjob results for immediate delivery: 0 prematches, 0 passed filtering, 0 of 0 passed security trimming, 0 final after rollup
Alertsjob results for scheduled delivery: 0 prematches, 0 passed filtering, 0 of 0 passed security trimming, 0 final after rollup
AlertsJob processed 0 daily notifications in 0 digests, sent 0 emails, failed to send 0 emails
AlertsJob processed 0 weekly notifications in 0 digests, sent 0 emails, failed to send 0 emails
What The Immediate Alerts Timer Job Does
Time to fire up the SQL Profiler and run a trace. I've captured the stored procedures that the timer job uses. Now we can really see what goes on.
proc_GetEventDataAndSubscriptionFilters
This takes entries from EventCache table, processes immediate alerts
Updates EventLog
Updates EventBatch with timestamp and last processed EventCache Item ID
Nulls out the EventData and ACL columns in EventCache
proc_EnumSubscribedSites
Pulls a list of Site GUIDs that have subscriptions
proc_MatchSchedSubscriptions
For each site with subscriptions, pull the list of scheduled subscriptions that should fire
This one has a where clause that blocks my failed subscriptions
EventLog.EventTime >= @EventTime - CASE
WHEN (SchedSubscriptions.NotifyFreq = 1)
THEN 1.0
ELSE 7.0
END
proc_DeleteEventLog
For each site collection, delete from EventSubsMatchs
Where EventLog.EventTime < current time
Delete from EventLog
proc_UpdateSchedSubscriptionTimes
Adds either a day or a week to NotifyTime and NotifyTimeUTC
As you can see, I tested all the where clauses for proc_MatchSchedSubscriptions and found that this particular one was selecting against my scheduled alerts.
Then, proc_UpdateSchedSubscriptionTimes procedure runs after everything - there is no two phased commit, no logic to ensure it didn't miss anything - just an increment by one to get ready for the next summary.
This is probably a good time to say:
"Microsoft doesn't support any manual manipulation of the SharePoint content databases and recommends only using the API."
So, proceed at your own risk :D.
How To Find Alerts That Didn't Fire
Open up your SQL Management Studio, connect to your data server, and run this query
Look at your results tab and you will see all the databases where scheduled alerts that should have been sent - weren't. You will also see another curiosity - alerts with no e-mail addresses. Why would those be there? You can't send an alert unless you have an e-mail address right? This cracks me up. You can create an alert in the GUI even if your site collection user profile doesn't have an e-mail address. However, the stored procedure where clause in proc_MatchSchedSubscriptions blocks those entries. The user THINKS they are going to get alerts, nothing tells them they won't on the front end, and SharePoint just ignores them on the back end. Isn't that nice?
Ok, you now have database names with failing alerts, now open a new query window and run all these queries at once against the DB in question. This is an older screen shot:
Hmmm, you say. What next? Well, I found out that if you push the Schedule back one unit (day, week), it will fire again the next time "Immediate Alerts" timer runs. Also, if there is no e-mail address, you can blow them away (UNSUPPORTED!!). If you want to try your luck, you can use these queries. I found this would fire off my alerts 90% of the time but they would fail again on the next rotation - because the server times were off.