Wednesday, August 27, 2008

Using PreparedStatement with a variable number of parameters / handling SQL queries with an "IN" clause

The following article discusses using PreparedStatement for SQL queries involving an "IN" clause.

Basic flow behind executing a SQL statement:

* First, the DBMS or RDBMS compiles the SQL statement
* Second, it executes the compiled statement

A PreparedStatement in Java is an object representation of a pre-compiled SQL statement, unlike a plain Statement. A PreparedStatement is usually faster than a Statement, but only after enough repeated executions to amortize the up-front compilation cost; for just a few executions, a plain Statement can actually perform better.

Effective usage of PreparedStatement:

* Use them for SQL statements that will be executed multiple times.
* Use them for SQL statements that take parameters (though they can also be used for statements without parameters). This way the same statement can be executed with different values.
* Use them for SQL statements whose parameters are likely to contain special characters, so that quotes and other special characters do not have to be escaped by hand.

Problem of SQL statement with variable number of parameters:

When a SQL statement takes a variable number of parameters whose values may contain special characters, the question is how to supply values for the PreparedStatement placeholders (question marks) when their count is not fixed. A Statement object might seem the better choice, since the SQL string can be built dynamically, but then special characters have to be escaped by hand.
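To make the special-character problem concrete, here is a minimal sketch of what naive string concatenation (the Statement route) produces when a value contains an apostrophe. The class name NaiveSqlDemo is illustrative only:

```java
public class NaiveSqlDemo {
    public static void main(String[] args) {
        String name = "Jame's Consulting";
        // Concatenating the raw value straight into the SQL text...
        String naive = "SELECT * FROM EMPLOYEE WHERE COMP_NAME IN ('" + name + "')";
        // ...yields a literal whose embedded apostrophe terminates the string
        // early, producing invalid (or worse, injectable) SQL:
        System.out.println(naive);
        // SELECT * FROM EMPLOYEE WHERE COMP_NAME IN ('Jame's Consulting')
    }
}
```

A PreparedStatement avoids this entirely because bound values are never spliced into the SQL text.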

Solution:

The solution is to build the placeholder list of the PreparedStatement dynamically in plain Java, and then bind the values in the same order in which the placeholders were built. Note that this approach may not deliver the full performance benefit of a PreparedStatement: the pre-compiled statement is reused only when subsequent executions have the same number of parameters. Whenever the parameter count changes, the SQL text changes, so the statement must be compiled again. It does, however, cleanly solve the special-character problem.

Example: Implementation for the above solution:

SQL statement: SELECT * FROM EMPLOYEE WHERE COMP_NAME IN ('Jame''s Consulting', 'D''Souze Inc.', 'Build-factory (P) Ltd')

The above SQL statement uses an "IN" clause whose company-name (COMP_NAME) parameters vary in number: depending on user input there may be one or many. A Statement could be used here, but then special characters such as single quotes would have to be escaped manually.

Approach 1: Construct the PreparedStatement for the above scenario assuming a single parameter

....
String compNames[] = {"Jame's Consulting", "D'Souze Inc.", "Build-factory (P) Ltd"};
String sqlQuery = "SELECT * FROM EMPLOYEE WHERE COMP_NAME IN (?)";
PreparedStatement pstmt = con.prepareStatement(sqlQuery);
for (int i = 0; i < compNames.length; i++)
{
    // JDBC parameter indexes start at 1, and there is only one placeholder
    pstmt.setString(1, compNames[i]);
    ResultSet rs = pstmt.executeQuery();
}
....

This approach executes the PreparedStatement once for each company name, which defeats the purpose of using a SQL statement with an "IN" clause. It also makes multiple network calls to execute multiple queries.

Approach 2 (Recommended): Construct the PreparedStatement for the above scenario by building the number of placeholders dynamically

....
String compNames[] = {"Jame's Consulting", "D'Souze Inc.", "Build-factory (P) Ltd"};
// Build a placeholder list of the form (?,?,?)
String compParams = "(";
for (int i = 0; i < compNames.length; i++)
{
    compParams = compParams + "?,";
}
compParams = compParams.substring(0, compParams.lastIndexOf(",")) + ")";
String sqlQuery = "SELECT * FROM EMPLOYEE WHERE COMP_NAME IN " + compParams;
PreparedStatement pstmt = con.prepareStatement(sqlQuery);
for (int i = 0; i < compNames.length; i++)
{
    // Parameter indexes are 1-based
    pstmt.setString(i + 1, compNames[i]);
}
ResultSet rs = pstmt.executeQuery();
....

This uses the "IN" clause effectively by executing the query only once. However, as discussed earlier, when the number of parameters varies between calls the statement is re-compiled each time the placeholder count changes, so the main benefit retained is the safe handling of special characters, plus the other usual advantages of PreparedStatement.
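The placeholder construction above can be factored into a small reusable helper. This is a minimal sketch; the class and method names (InClauseUtil, buildInClause) are hypothetical, not part of any library:

```java
public class InClauseUtil {
    // Returns a placeholder list such as "(?,?,?)" for use in an IN clause.
    // n is assumed to be at least 1.
    public static String buildInClause(int n) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < n; i++) {
            sb.append(i == 0 ? "?" : ",?");
        }
        return sb.append(")").toString();
    }

    public static void main(String[] args) {
        System.out.println("SELECT * FROM EMPLOYEE WHERE COMP_NAME IN "
                + buildInClause(3));
        // SELECT * FROM EMPLOYEE WHERE COMP_NAME IN (?,?,?)
    }
}
```

The SQL text built this way stays identical for a given parameter count, so the driver can reuse the compiled statement across calls with the same number of values.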

Monday, August 25, 2008

Web Site Optimization Techniques using HTTP Compression

The following article covers basic web site optimization techniques for better, faster web page loading that can save significant download time.

Using HTTP Compression:

This HTTP compression technique uses GZIP compression, which is simple and one of the most effective ways to improve web page loading performance.

Problem:   Whenever a web site is accessed from a browser, the page is loaded by making a request to the web server, and the response content takes time to transfer to the client browser over the network.

Following is a typical request/response sequence without compression.

•   Client Browser: Makes a request for a particular web page
•   Server: Processes the request and sends the HTML response to the browser over the network
•   Client Browser: Renders the HTML response, whose content is of some specific size.

Size matters for page loading time because the entire response content has to be transferred over the network (intranet/internet) before the browser can render it; the larger the response, the longer the page takes to load.

Solution:   An efficient way to overcome the above problem is to compress (zip) the response content instead of sending it uncompressed. This saves bandwidth and download time.

Following is a typical request/response sequence with compression.

•    Client Browser: Makes a request for a particular web page, indicating via a request header that it accepts compressed content. [Accept-Encoding: gzip, deflate (two compression schemes)]
•    Server: Processes the request and sends the compressed HTML response, indicating via a response header that the content is compressed. [Content-Encoding: gzip]
•    Client Browser: Downloads the compressed content, decompresses it, and then renders it, which takes relatively less time overall.



Usage: Enabling compression is one of the fastest ways to improve your site's performance, but apply it only to text content such as HTML, CSS, and JavaScript; other content such as images and media files is almost always already compressed, so compressing it again gains little.
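To see why text content benefits so much, here is a minimal, self-contained sketch using java.util.zip.GZIPOutputStream to compress a repetitive HTML page in memory, the same way a server compresses a response body. The class name GzipDemo is illustrative only:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipDemo {
    // Compresses a byte array with GZIP, as a web server would compress a response body.
    public static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        GZIPOutputStream gz = new GZIPOutputStream(bos);
        gz.write(input);
        gz.close(); // flushes remaining data and writes the GZIP trailer
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // A repetitive HTML page, typical of table-heavy markup
        StringBuilder html = new StringBuilder("<html><body><table>");
        for (int i = 0; i < 500; i++) {
            html.append("<tr><td>row data</td></tr>");
        }
        html.append("</table></body></html>");
        byte[] original = html.toString().getBytes("UTF-8");
        byte[] compressed = gzip(original);
        System.out.println("Original: " + original.length
                + " bytes, compressed: " + compressed.length + " bytes");
    }
}
```

HTML markup is full of repeated tags, so GZIP typically shrinks it dramatically; already-compressed images or media see little or no reduction.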

Note:
•    A few older browsers, such as MS IE 4.0 and Netscape 1.0 on MS Windows 95, have problems with this compression technique. However, since most traffic now comes from newer browsers, this technique is recommended and is widely used nowadays.
•    When you use the "Content-Encoding: gzip" HTTP header, the "Cache-Control: no-cache" header may not work on Internet Explorer 5.5 Service Pack 2 (SP2) and 6, meaning the content is always cached.

Reference Links:

http://developer.yahoo.net/blog/archives/2007/07/high_performanc_3.html
http://www.http-compression.com/

Friday, August 8, 2008

Best Practice: Basic performance tuning of SQL statements in Oracle

Though this topic may already be known to many developers (mainly Oracle developers), I decided to blog on it to give Java developers a quick start on basic tuning of SQL statements. Most Java developers tend to concentrate on their core areas and do not focus much on back-end tasks: they either write SQL statements at a high performance cost without knowing it, or depend heavily on Oracle experts to tune the SQL statements later.

Understanding of Oracle execution plan:

For every SQL statement executed to read or write data, Oracle does the following:

• Parses the statement
• Builds an execution plan for that particular statement
• Processes the statement

Using Explain Plan for logging execution plan:

The EXPLAIN PLAN command is used to determine the execution plan Oracle would follow to execute a specified SQL statement, without actually executing it. This means EXPLAIN PLAN runs very fast even if the statement being explained is a query that would take a long time to run; the actual statement is never executed by EXPLAIN PLAN.

This important diagnostic tool helps identify inefficient SQL statements so that appropriate tuning can be performed. Running the command inserts rows describing each step of the execution plan into a work table (or a user-specified plan table), giving the full sequence of operations Oracle would perform when the SQL statement is executed.

To log the execution plan, execute the below SQL statement on any SQL select statement.

Syntax (default table): explain plan for [sql_statement]

Example:
SQL> explain plan for select * from emp where ename='smith';

Syntax (user-defined table): explain plan into [user_defined_table] for [sql_statement]

Example:
SQL> explain plan into user_execute_plan for select * from emp where ename='smith';

One other way to get this execution information is SET AUTOTRACE ON, which, unlike EXPLAIN PLAN, actually runs the SQL statement and then produces its execution plan. It takes as long as executing the SQL statement, because it runs the statement and outputs the result set along with the execution plan and statistics.

Example:

SQL> set autotrace on;
SQL> select * from emp where ename='smith';

To view the execution plan, execute a simple SQL select statement on the plan table or use the script utlxpls.sql provided by Oracle.

Example:
SQL> select operation, option, object_name from plan_table;
OR
SQL> @$oracle_home/rdbms/admin/utlxpls.sql;

Output operations of the execution plan from the work table:

OPERATION          OPTION          OBJECT_NAME
SELECT STATEMENT
TABLE ACCESS       FULL            EMP

In this case, TABLE ACCESS is performed first with a full table scan and the results are fed to SELECT STATEMENT. Note that a full table scan means the entire table is read before the selection is made.

Basic tuning of a SQL statement using an index:

Based on the execution plan output, SQL statements can be tuned for better performance. The generated plan helps identify whether the TABLE ACCESS operation uses a full table scan, which may hurt performance. To prevent the full table scan, create an index on the column used in the WHERE clause.

The execution plan can also be used to determine the cost (CPU/IO) of executing the statement.

Now create an index and log the execution plan:

SQL> create index emp_index on emp(ename);
SQL> delete plan_table;
SQL> explain plan for select * from emp where ename='smith';
SQL> select operation, option, object_name from plan_table;

Output operations of the execution plan from the work table after creating the index:

OPERATION          OPTION             OBJECT_NAME
SELECT STATEMENT
TABLE ACCESS       BY INDEX ROWID     EMP
INDEX              RANGE SCAN         EMP_INDEX


In this case, the index (EMP_INDEX) is scanned first and the table is then accessed by row id rather than by a full table scan. A range scan means the index was used and may return more than one row.

Now use the indexed column and log the execution plan:

SQL> delete plan_table;
SQL> explain plan for select ename from emp where ename='smith';
SQL> select operation, option, object_name from plan_table;

Output operations of the execution plan for a select on the indexed column, from the work table after creating the index:

OPERATION          OPTION          OBJECT_NAME
SELECT STATEMENT
INDEX              RANGE SCAN      EMP_INDEX


If we only query fields of a table that are already in an index, Oracle doesn't have to read the data blocks because it can get the relevant data from the index without TABLE ACCESS thus improving performance of the SQL statement.

NOTE: If the column has a unique constraint, then instead of a range scan Oracle will use a unique scan, which means the index was used and returns exactly one row.

The EXPLAIN PLAN statement is a data manipulation language (DML) statement rather than a data definition language (DDL) statement. If you want to keep the rows generated by an EXPLAIN PLAN statement in the output table for future analysis, you must commit the transaction containing the statement. Otherwise, roll back the transaction or delete those rows from the plan table.

No special system privileges are required to use the EXPLAIN PLAN statement beyond insert privileges on the plan table. However, you must have sufficient privileges to execute the statement you are trying to explain.

Wednesday, August 6, 2008

Understanding the Non-Functional Requirements (NFR)

Requirements gathering is broadly classified into two categories: functional requirements and non-functional requirements. Functional requirements concentrate mainly on the business functionality of the proposed application, while non-functional requirements focus on service-level requirements. A Business Analyst usually gathers the functional requirements, while the Architect works on gathering the non-functional requirements, which guide the design of the system.

For developing an enterprise application, apart from gathering functional requirements we should also take care of various non-functional requirements (NFR). These are service-level requirements which should be discussed and worked out with stakeholders during the Inception and Elaboration phases of a project, to define measurable quality-of-service levels for each requirement. While designing the architecture, the following service-level requirements should be addressed:

• Performance
• Scalability
• Reliability
• Availability
• Extensibility
• Maintainability
• Manageability
• Security

Performance:

Performance requirements are usually measured as follows:
• In terms of response time for a given screen transaction per user.
• In terms of transaction throughput, which is the number of transactions in a given time period, usually one second.
• For example, a particular screen transaction should have a response time of 2-5 seconds or a throughput of 50 transactions per second.

Scalability:

• It is the ability of a system to support the required quality of service when load increases, without changing the system.
• To understand scalability, we should first know what system capacity is. Capacity is the maximum number of processes or users the system can handle while still maintaining the quality of service.
• For example, if under increased load the system still meets the required response time of five seconds, the system is scalable. If responses start taking more than five seconds, the system is not scaling.
• In other words, when the system is at capacity and can no longer respond within acceptable limits, it has reached its maximum scalability. To make the system scale further, additional hardware must be added, either horizontally or vertically.
• Horizontal scaling means adding more machines.
• Vertical scaling means adding processors, memory, disk space, etc. within the same machine. This type of scaling is much easier than horizontal scaling because adding processors, memory, and disk space does not impact the architecture of the system.


Reliability:

• It ensures the integrity and consistency of an application and all its transactions.
• When load increases, the system should still continue to process requests and handle requests as accurately as before the increase in load.
• If the system is not reliable as load increases, then it is also not scalable.

Availability:

• It ensures the system/services are always accessible.
• Even if the components fail (reliability fails), the system will still be available due to the failover capabilities by having the components redundant.


Extensibility:

• It is the ability of a system to add new functionalities or modify existing functionality without impacting the existing functionality.
• You cannot measure extensibility at the time the system is deployed; it shows up the first time you must extend the functionality of the system.
• You should consider the following when you create the architecture and design to help ensure extensibility: low coupling, interfaces, and encapsulation.

Maintainability:

• It is the ability to correct issues in the existing functionality without impacting other components of the system.
• Like extensibility, you cannot measure maintainability at the time the system is deployed.
• When creating an architecture and design, you should consider the following to enhance the maintainability of a system: low coupling, modularity, and documentation.

Manageability:

• It is the ability to manage the system to ensure the continued health of a system with respect to scalability, reliability, availability, performance, and security.
• Your architecture must have the ability to monitor the system and allow for dynamic system configuration.

Security:

• It is the ability to ensure that the system cannot be compromised.
• Security includes not only issues of confidentiality and integrity, but also relates to Denial-of-Service (DoS) attacks that impact availability.
• Creating an architecture that is separated into functional components makes it easier to secure the system because you can build security zones around the components. If a component is compromised, then it is easier to contain the security violation to that component.