Run SQL Queries on Excel Sheets

author-image
PCQ Bureau
New Update

Data in an organization is typically stored on multiple platforms and formats. A lot of useful data, however, is stored in non-traditional databases, such as e-mail (plain text) and Excel worksheets; besides DBMS and RDBMS systems, like Access, DB2, SQL Server and Oracle. Analysis of such data becomes difficult in the absence of a method to collect them at one place without needing to first convert them. Here's where a feature like MS SQL Server's 'linked server' helps you perform centralized information analysis from disparately stored data.

Advertisment
Direct Hit!
Applies to: Database administrators
USP: Analyze data lying in multiple databases and Excel Sheets 
Primary Link:
http://msdn.microsoft.com/library/en-us/adminsql/ad_1_server_4uuq.asp
Google keywords:
mssql linked server 

True, there is another feature in MS SQL that can be used-Replication. But this is generally not advisable especially when the data sets are large. Replication is used where there are smaller data sets and the subset of the data to be used is well known. What Replication does is, it moves the data from one database to the other. Whereas in linked servers, the data is not moved anywhere else, instead it is directly accessed or modified from the original location itself. In this article, we'll discuss about linked server and how you can make use of it in a typical scenario. 

What is a linked server?

Linked server is an alias on your SQL Server instance that points to an external data source that can be anything from a plain text file to another RDBMS database. This link works over an OLE or ODBC transport layer. The database can be accessed directly from the SQL Server. You can code your applications to fetch data from an external database, using this link. This technique is similar to the 'linked table' feature in MS Access. 

Advertisment

Linking the Excel books

In the linked server's Property Sheet provide the Excel file's 

location and also set the provider as 'Excel 8.0', or it won't work

Let's take up a scenario where we have two Excel workbooks, with one sheet each. One sheet called 'vendor' contains vendor-contact information. The other called 'ven_pro' has a list of products from each vendor. Both the sheets have one common value-the Vendor ID. As a task, we create links to both these workbooks in SQL Server and then use the SQL Server's Query Analyzer program to fetch and mix data from the two. We need to make sure that the DTC (Distributed Transaction Coordinator) service is running on the SQL Server machine.

Open the 'SQL Server Enterprise Manager' console. Navigate to the Security folder and then to the 'Linked Server' sub-item. Right click on it and select 'New Linked Server', name it, say, 'EX1'. Select the 'Other Data Source' option and select 'Microsoft JET 4.0 OLE DB Provider' from the dropdown. Type in 'Excel' as the product name, key in C:\vendor.xls as the full path to the first Excel book and 'Excel 8.0' for the provider string. Click on OK to create the first link. Repeat the steps for the second book, providing 'EX2' as the name.

Advertisment

Now, each link will show two objects-Tables and Views-under them. 

Opening the Tables node will display the sheets in that book and opening Views will show you the data in those sheets. That done, it is time to see if the links actually work.

Simple SQL query 

Let us first do a select all on one book to check our settings. Fire up 'SQL Query Analyzer' and login to the SQL Server ('sa' or any login created with permissions to access EX1 and EX2 will do) and the master database. For our test query, type in (note the ellipses):

The result of an SQL query that analyzed the two Excel Sheets simultaneously

Advertisment

SELECT * FROM EXCEL_LINK1...Sheet1$

Press F5 to execute the statement. The result grid should display all the rows from the vendor contact sheet.

Complex SQL query

Let us now go one step further and query both the sheets and display the combined data. Doing so would be simple given that both sheets have a common field (VENDOR_ID). Our query will be as follows.

SELECT vendor, address, phone, city, country, zip FROM EXCEL_ LINK1...Sheet1$ a,EXCEL_LINK2...Sheet1$ b WHERE a.Vendor_ID= b.Vendor_ID 

Advertisment

Press F5 again to note the results. 

Similarly, you can use the SQL statements to quickly retrieve data when needed without having to import and convert them into SQL Server. 

Doing so will also let you access various external databases from a single point. 

Sanjay Majumder

Stay connected with us through our social media channels for the latest updates and news!

Follow us: