I am writing this article because I have recently come across numerous blog posts and articles discussing how you can directly query SharePoint databases. I have also stumbled upon a web part available for free that shows the most popular content among the SharePoint farm. This web part directly queries the
SharedServiceProvider database to retrieve the information needed by the web part. Moreover, I have recently downloaded a tool that grabs deleted documents from the content database and saves them back to your hard drive.
Overview of SharePoint Content Databases
Before diving into the pros and cons of querying SharePoint databases, let’s explore them and see how we can write direct T-SQL queries against them.
Below is a very high-level database diagram. I would like to extend a special thank you to Ethan for coming up with this useful diagram.
DISCLAIMER: I highly recommend executing the following queries in a non-production environment.
dbo.AllUserData is a table that holds information about all the list items for each list. Here is a sample query to retrieve the contributions of each user in a certain site collection:
SELECT tp_author, count(*) as 'Number Of Documents & Items'
Group by tp_author
And here is another query to the
SharedservicesProvider database that retrieves the most popular documents:
SELECT ANLResource.DocName, COUNT_BIG(*) AS HitCount
ON ANLResourceHits.WebGuid = ANLWeb.WebGuid
ON ANLResourceHits.ResourceId = ANLResource.ResourceId
WHERE (CHARINDEX(’.aspx’, ANLResource.DocName) = 0)
AND (CHARINDEX(’.swf’, ANLResource.DocName) = 0)
GROUP BY ANLResource.DocName
ORDER BY HitCount DESC
Well, I know it’s incredibly easy, you can extract as much data as you want and consume this Data in your custom solutions (Reports, Web Parts,…) and the sky is the limit.
- This is completely unsupported by the EULA you agreed to when you installed SharePoint.
- Your queries are not guaranteed to work after applying any patches or service packs to SharePoint since Microsoft could change the database schema anytime.
- Directly querying the database can place extra load on a server and hence performance issues.
SELECT statements against the database take shared read locks at the default transaction level so your custom queries might cause deadlocks and hence stability issues.
- Your custom queries might lead to incorrect data being retrieved.
A Debate About the Stability Issues
I recently took part in a debate on one of the forums with a guy who claims that there are no stability issues when you directly query the database since you can use the With (NoLock) clause in your TSQL Queries as follows :
Select * From dbo.AllDocs With (NoLock)
This may seem better, you can avoid the deadlocks by using (NoLock) clauses BUT you will run into another problem which is retrieving incorrect data because you will be observing the database in an intermediate state (Data are not committed yet).
Insertions and Modifications of the SharePoint Databases
It is clearly unsupported to update, delete, or insert records. The risks are surely far more obvious.
Also be aware that any database changes would definitely break the supportability as stated by Microsoft. Examples of such database changes include, but are not limited to the following:
- Adding new indexes or changing existing indexes within tables
- Adding database triggers
- Adding, changing, or deleting any primary or foreign key relationships
- Changing or deleting existing stored procedures
- Adding new stored procedures
- Adding, changing, or deleting any data in any table of any of the databases
- Adding, changing, or deleting any columns in any table of any of the databases
- Making any modification to the database schema
- Adding tables to any of the databases
- Changing the database collation
The Proper and Supported Ways for Data Access in SharePoint
The process of data-retrieval in SharePoint should be executed via the SharePoint object model or the built-in SharePoint web services and this is attributed to the following:
- Microsoft has gone through a lot of work to ensure that using the object model or web services will lead to stable and efficient database interactions.
- The likelihood of Microsoft breaking their own object model is far less that the database schema changes.
- You are not going to lose Microsoft supportability.
Direct Queries to SharePoint databases is considered a no-no, you should use SharePoint object model or web services instead. Do not worry, I will show you in later posts at my blog how to do the data access through the object model, stay tuned!
- 1st October, 2009: Initial post