Insights
Troubleshooting Sitecore Experience Analytics problems: solutions for common issues
Read on to find out solutions to the three common Sitecore Analytics module performance problems
Troubleshooting performance issues in Sitecore Experience Analytics
Over time, some of our clients have encountered performance problems with the Sitecore Experience Analytics module. In this article we will highlight three common issues they’ve faced and how we resolved them.
1. Analytics data not appearing in Experience Analytics reports and charts
One common issue is when analytics data stops displaying in Experience Analytics reports, even though no errors are present in the log files, and everything appears normal. Here’s how we approached and solved the problem.
Problem overview
Symptom: Analytics data stopped showing up in Experience Analytics charts.
Initial action: We attempted to rebuild the reporting database, but the process was slow and failed to resolve the issue.
Root cause
After further investigation, we concluded that the issue was related to the Processing role, which handles the data collection for Experience Analytics. Specifically, we found that the Device Detection database located at App_Data/DeviceDetection/* on the Processing role was corrupted.
Solution
- We deleted the corrupted database file, which triggered its automatic regeneration.
- After regenerating the file, we rebuilt the Reporting database, and data successfully reappeared in the Experience Analytics reports.
2. High load on Shard databases
Performance bottlenecks on Sitecore’s Shard databases can be challenging, especially for clients running minimal Sitecore cloud configurations as per their recommendations on Cloud Topologies and Tiers. High database load can occur suddenly, and in some cases, may be a result of a DDoS attack.
Problem overview
Symptom: High load on Shard databases, with database DTUs reaching 100%.
Possible cause: A DDoS attack, characterised by a surge in traffic or interactions from suspicious sources.
Diagnosis
To confirm if abnormal traffic is causing high load, you can run SQL scripts on each Shard database to identify suspicious contact interactions.
Useful SQL scripts:
- Identify contacts with the most interactions:
SELECT TOP (100) ContactId, COUNT(ContactId) as Count
FROM [xdb_collection].[Interactions]
GROUP BY ContactId
ORDER BY Count DESC
- Check if interactions are happening in real-time:
SELECT TOP (100) *
FROM [xdb_collection].[Interactions]
WHERE ContactId = [from above result]
- Get more details about the suspicious contact:
SELECT *, CONVERT(VARCHAR(MAX), Identifier) as IdentifierString
FROM [xdb_collection].[ContactIdentifiers]
WHERE ContactId = [from above result]
SELECT *
FROM [xdb_collection].[ContactFacets]
WHERE ContactId = [from above result]
SELECT TOP(100) *
FROM [xdb_collection].[InteractionFacets]
WHERE ContactId = [from above result] AND FacetKey = ‘IpInfo’
If the IP originates from an unusual region and sends excessive requests, it’s time to ban the IP and remove the contact and its data.
Solution
- Block suspicious traffics: Configure the ExcludeRobots file to filter out unwanted visitors based on user agent or IP address. Here is how you do it.
- Remove excessive contacts:
- For Sitecore XP 10.1 and later, use the xConnect purging tool.
- For Sitecore XP 9.2 and later, remove contacts via API as described here: Deleting contacts and interactions from the xDB.
- Consider using the ADM module as described here: How to purge historical analytics data.
3. CPU consuming queries on Reporting Database, in Azure
Another challenge involves high CPU and DTU usage on the Reporting Database, particularly in high-traffic environments. The following SQL query, automatically generated by MS Azure to update database statistics, was found to consume excessive resources:
(@samplePercent float)SELECT StatMan([SC0], [LC0], [SB0000]) FROM (SELECT TOP 100 PERCENT [SC0], [LC0], step_direction([SC0]) over (order by NULL) AS [SB0000] FROM (SELECT CONVERT([nvarchar](200), SUBSTRING ([FacetData], 1, 100)++substring([FacetData], case when LEN([FacetData])<=200 then 101 else LEN([FacetData])-99 end, 100)) AS [SC0], datalength([FacetData]) AS [LC0] FROM [xdb_collection].[ContactFacets] TABLESAMPLE SYSTEM (@samplePercent PERCENT) WITH (READUNCOMMITTED) ) AS _MS_UPDSTATS_TBL_HELPER ORDER BY [SC0], [SB0000] ) AS _MS_UPDSTATS_TBL OPTION (MAXDOP 8, RECOMPILE)
Problem overview
Symptom: High CPU usage on the Reporting Database, particularly during peak traffic times.
Cause: Update statistics are set to run automatically during high-traffics periods.
Solution
We disabled auto-update statistics and set up a SQL Server Maintenance Plan to run the Update Statistics task during off-peak hours.