Blog Post

Writing high performance, security conscious SQL queries to support the demands of the user-interface

Ideas & Innovation
Writing high performance, security conscious SQL queries to support the demands of the user-interface
  • 28
    Apr
  • Image

Writing high performance, security conscious SQL queries to support the demands of the user-interface


How to avoid sql injection and building dynamic queries, achieve great performance and maintain good application design principles while supporting a highly demanding user-interface.

Application developers are often faced with slow performance plus a ton of security issues when writing sql queries. One of the reasons for this is the nature of the queries needed to support the application's user interface. Having a great user experience (UX) is paramount to having a successful product, whether internal to your company or on the open market; having an interactive and intuitive user interface plus a well thought out user experience with the queries to effectively power it is a must.

The scenario I will be addressing today is where the user makes a multi-valued selection along with other criteria and submits the request expecting a report to be returned before frustration kicks in.

Problem

The user has selected several countries and a date range using the very response and intuitive user interface and now clicks submit and expects a report showing all the orders from the countries selected within the specified period.

To solve this problem developers' needs to take the criteria and perform the necessary search(es) and then present the results to the user. Some ways developers have chosen to solve the problems are:

  1. Dynamically construct and sql query and execute this query using for example ADO / ADO.NET
    Poor performance and subject to sql injection
  2. Create a stored procedure that accepts a country, start and end date then execute this query for each country selected by the user
    Poor performance and application scale
  3. Create a stored procedure that has more than one country parameters and the query then ignores the ones that are blank
    Poor application design. Extendability & maintainance issues
  4. Create a stored procedure that takes a delimited string of country IDs and somehow incorporate the literal into the query(s) within the stored procedure.
    Possibly open to SQL injection through delimited string parameter and slight inferior performance

While these queries may work they would not provide the kick ass performance we need or in some cases abide by the security policies to prevent out application from being violated (those hackers) in ways we cannot imagine :).

Solution

We want to make one database request, take advantage of stored procedures, avoid sql injection and maintain good design practices

Here is our query:

CREATE PROCEDURE [dbo].[usp_Orders_ReadByCountryAndDate]
	@StartDate SMALLDATETIME,
	@EndDate SMALLDATETIME,
	
	/* The size can be changed or even 
		substituted with MAX, however this
		size should be sufficient. 300 countries
		X ISO Code 3 = 900 */
	@CountryIds VARCHAR(900),
	@Delimiter VARCHAR(20)
AS
BEGIN
	SET NOCOUNT ON;

	SELECT Orders.*, Country.Name
		FROM Orders
			INNER JOIN fn_SplitString(@CountryIds, @Delimiter) IdList
				ON Orders.CountryCode = IdList.Item
			INNER JOIN Country ON Orders.CountryCode = Country.ISOCode3
		WHERE Created BETWEEN @StartDate AND @EndDate
END
  1. To make one database request and maintain good design practices we pass our country IDs as a delimited string - using a variant of option 4 above but avoiding the pitfalls. We need to be wary of SQL injection when constructing our query since we are passing text acquired from user interface into our query (Your UI layer should have validated the data, but attacker may well bypass this somehow so as a second defense your business layer should have validated your input and as a third line of defense don't allow your database to succumb to such attacks).
  2. Sql injection survives on incorporating user input into your queries at a command definition level - this simply means the user input is injected into the command. To combat this we have transformed the user input into data and allow our command/query to act on the data.
  3. The clever user of an inline table-valued function and joining this table in our query leverages the power of sql server and avoids loops, table scans, scalar functions, etc that will deteriorate the performance of our query.

Here is our supporting inline-table function:

This takes a delimited string and splits it into its various parts, inserting them into a table and returns the table to the caller.

CREATE FUNCTION [dbo].[fn_SplitString]
(
    @InputList VARCHAR(8000),		-- List of delimited items
	@Delimiter VARCHAR(20) = '|'	-- delimiter that separates items
) 
RETURNS @List TABLE (item VARCHAR(8000))
BEGIN
	DECLARE @Item VARCHAR(8000)

	WHILE CHARINDEX(@Delimiter,@InputList,0) <> 0
	BEGIN
		SELECT
			@Item=RTRIM(LTRIM(SUBSTRING(@InputList,1,CHARINDEX(@Delimiter,@InputList,0)-1))),
			@InputList=RTRIM(LTRIM(SUBSTRING(@InputList,CHARINDEX(@Delimiter,@InputList,0)+LEN(@Delimiter),LEN(@InputList))))

		IF LEN(@Item) > 0
			INSERT INTO @List SELECT @Item
	END

	IF LEN(@InputList) > 0
		INSERT INTO @List SELECT @InputList -- Put the last item in
		
	RETURN
END

Results

Executing this query on an orders table with 1,000,000 records using a date range from 01/01/1991 to 01/01/2005, filtering on the following countries PRT = Portugal, SPM = Saint Pierre and Miquelon, ITA = Italy, ESP = Spain and USA = United States of America with the result and time statistics as follows:

SET STATISTICS TIME ON 

EXEC	@return_value = [dbo].[usp_Orders_ReadByCountryAndDate]
		@StartDate = '1991-01-01',
		@EndDate = '2005-01-01',
		@CountryIds = N'PRT|SPM|ITA|ESP|USA',
		@Delimiter = N'|'
		
SET STATISTICS TIME OFF

SELECT	'Return Value' = @return_value

Query Results

1st run

SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 92 ms.

SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:
CPU time = 140 ms, elapsed time = 633 ms.

SQL Server Execution Times:
CPU time = 171 ms, elapsed time = 725 ms.

2nd Run

SQL Server parse and compile time: CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times: CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times: CPU time = 219 ms, elapsed time = 466 ms.

SQL Server Execution Times: CPU time = 219 ms, elapsed time = 466 ms.

Here is our schema and create scripts (SQL Server 2008):

Table schema

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Country](
	[ISOCode3] [nchar](3) NOT NULL,
	[ISOCode2] [nchar](2) NOT NULL,
	[Name] [nvarchar](50) NOT NULL,
 CONSTRAINT [PK_Country] PRIMARY KEY CLUSTERED 
(
	[ISOCode3] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [Orders](
	[Id] [int] NOT NULL,
	[Created] [datetime] NOT NULL,
	[Product] [varchar](20) NOT NULL,
	[CountryCode] [nchar](3) NOT NULL,
 CONSTRAINT [PK_Order] PRIMARY KEY CLUSTERED 
(
	[Id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Orders] ON [Orders] 
(
	[Created] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_Orders_1] ON [Orders] 
(
	[CountryCode] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
ALTER TABLE [Orders]  WITH CHECK ADD  CONSTRAINT [FK_Order_Country] FOREIGN KEY([CountryCode])
REFERENCES [Country] ([ISOCode3])
GO
ALTER TABLE [Orders] CHECK CONSTRAINT [FK_Order_Country]
GO

Don't forget to leave your comments and updates.

Comments (2)

Dani 16 May 2012, 11:36 PM Website

Table 1 is a roster of Bowlers who can benolg to one or many bowling centers. Table 2 is a roster of bowlers signed up for an event today. I list my bowling center roster with all names for that bowling center who are elegible to signup for the event roster (table 2) It is possible to signup bowlers at different times. After I signup a bowler and close the signup page, when I go back to the signup page I do not want to see the bowlers who have already signed up.I have tried many sql select statements, none of which produce the desired result:1) SELECT DISTINCT BH.BIDFROM BOWLERS_HOUSE BHLEFT JOIN EVENT_ROSTER ERON ER.BID = BH.BIDWHERE ER.HOUSE_ID = 40AND ER.EVENT_ID = 201007073010100101010300AND ER.BID IS NULL; Note: returns empty result2) SELECT DISTINCT BH.BIDFROM BOWLERS_HOUSE BHWHERE BH.HOUSE_ID = 40AND (BH.BID) NOT IN( SELECT ER.BID FROM EVENT_ROSTER ER WHERE EVENT_ID = 201007073010100101010300 AND ER.HOUSE_ID = 40 );Note returns all BID's in Table 13) SELECT BOWLERS_HOUSE.BID FROM BOWLERS_HOUSE LEFT JOIN EVENT_ROSTER ON BOWLERS_HOUSE.BID=EVENT_ROSTER.BID WHERE EVENT_ROSTER.BID IS NULL; Note: returns the bowlers who have aready signed up and are on Table 2, I want only the bowlers who have not ALREADY signed up that are NOT ON table 2I can run the program the way it is or I can build a very convoluted compare routine to eliminate the Table2 records from my display of the House roster, but need something that works. Oh, I tried SQL MINUS, but of course mysql disregards this very useful tool.Any ideas?Thanks in advance! a0|a0

Enis 19 May 2012, 06:57 AM Website

it's very impotent to be aware of the reersved words, I built a cms that has a filed in each table called (CONDITION) and now its reersved in mysql 5.. it took me a lot of time to alter all the tables and the php code in all the sites that uses this cms Thanks Steve

New Comment

Notify me of follow up posts