Wednesday, August 02, 2006

How to archive data from my production database?

How to archive data from my production database?

Is there a tool or command built-into SQL Server for this?I often see this type of questions in the SQL Server newsgroups and mailing lists. So, decided to come up with a simple article, that's just good enough to get you started.Archiving is the process of getting rid of old data (or data that is not valid anymore) from the main production databases. There is no built-in command or tool for archiving databases. In this article I'm going to show you how to implement this process.Why archive in the first place? By archiving old data, you can keep your databases smaller and more manageable. With smaller tables, your index rebuild times and backup/restore times will be under control. You will also see performance gains on certain queries that either scan a larger portion of the table or clustered index.Depending on the business needs, one might choose to either delete the unwanted data, or copy the data to a history table and delete that data from the production database.Implementing an archival process is not that complicated. The three logical steps involved in this process are:
Identifying the data to be archived.
Optionally copying the identified data to a history table.
Deleting the identified data from the production database.
You may not always be archiving just a single table. You need to understand how different tables in the database are related to each other through foreign key relationships and then, sequence your delete/archive process, so that you are not violating any of the constraints.
Here's an example scenario of archiving all the orders taken more than 6 months ago, from the Orders and OrderDetails tables. The following stored procedure will copy orders older than 6 months into similar tables of a different database called Archive, and delete those rows from the main database: CREATE PROC dbo.ArchiveData( @CutOffDate datetime = NULL)ASBEGIN SET NOCOUNT ON IF @CutOffDate IS NULL BEGIN SET @CutOffDate = DATEADD(mm, -6, CURRENT_TIMESTAMP) END ELSE BEGIN IF @CutOffDate > DATEADD(mm, -3, CURRENT_TIMESTAMP) BEGIN RAISERROR ('Cannot delete orders from last three months', 16, 1) RETURN -1 END END BEGIN TRAN INSERT INTO Archive.dbo.Orders SELECT * FROM dbo.Orders WHERE OrderDate < @CutOffDate IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RAISERROR ('Error occured while copying data to Archive.dbo.Orders', 16, 1) RETURN -1 END INSERT INTO Archive.dbo.OrderDetails SELECT * FROM dbo.OrderDetails WHERE OrderID IN ( SELECT OrderID FROM dbo.Orders WHERE OrderDate < @CutOffDate ) IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RAISERROR ('Error occured while copying data to Archive.dbo.OrderDetails', 16, 1) RETURN -1 END DELETE dbo.OrderDetails WHERE OrderID IN ( SELECT OrderID FROM dbo.Orders WHERE OrderDate < @CutOffDate ) IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RAISERROR ('Error occured while deleting data from dbo.OrderDetails', 16, 1) RETURN -1 END DELETE dbo.Orders WHERE OrderDate < @CutOffDate IF @@ERROR <> 0 BEGIN ROLLBACK TRAN RAISERROR ('Error occured while deleting data from dbo.Orders', 16, 1) RETURN -1 END IF @@TRANCOUNT > 0 BEGIN COMMIT TRAN RETURN 0 END END
You just need to call this stored procedure without any parameters, to get rid of all orders older than 6 months. As you can see, you can pass an optional date parameter, to customize the archiving process. Care is taken not to delete orders placed in the last three months. Depending on your requirements, you can schedule this stored procedure as a job, using SQL Server Agent, so that it runs every week or fortnight.This type of jobs tend to block the table while they are running. So, it's better to schedule these jobs for off-peak hours. Also, keep an eye on the transaction log space utilization. In case of any issues with log space, consider archiving in smaller batches. Here's one thing you should consider: If you are using transactional replication to maintain a reporting database, data will be deleted from the reporting databses also, once you run the archival stored procedure. If that's not what you wanted (that is, maintain complete data in the reporting database), then you might want to consider publishing stored procedures, instead of tables, and customize the archiving stored procedure on the subscriber, so that it does nothing.

Best Practices - SQL Server 2000

Best Practices - SQL Server 2000
SQL Server is a very powerful tool when used properly. It can also come to a screeching halt if left to rot with no maintenance and poor planning. While the program itself is highly scalable it is still subject to performance bottlenecks and slow response times caused by inattentive administrators and developers. I have learned much about SQL Server 2000 in the past year many of those are the best practices used by other developers and senior administrators. I hope to share that information with you now so that those just starting out can learn from what I have learned.

Best Practice #1
Download and install and actually use the SQL Server Best Practices Analyzer tool provided by Microsoft.
The Reasoning:
This tool will scan your databases for any code or implementation issues that do not conform to Microsoft Best Practices standards. This should be the starting point on any existing or currently in production database you may have. Now you can take each recommendation with a grain of salt as the tool is probably not aware of every situation a developer may face. So therefore it is always up to the developer or administrator to decide which practices to put into place.

Best Practice #2
Never start the name of any stored procedure with the SP prefix.
The reasoning:
All system stored procedures start with the SP prefix. Naming your stored procedures in this manner will cause potential clashes as service packs are released potentially with the exact same naming as your previous stored procedure. This is highly unwise.

Best Practice #3
Apply the latest service packs and security packs
The reasoning:
With so many potential threats against a database keeping your system up to date will ensure data integrity. Keeping data integrity should be the duty of anyone either developing on the database or administrating the database.

Best Practice #4
Keep your result sets that you return from your database as small as possible.
The Reasoning:
Not only does this greatly improve performance but it makes the database much more scalable and better able to handle more concurrent users.
Best Practice #5
Avoid the Insert statement when performing bulk inserts into your database.
The reasoning:
The DTS or BCP utilities are far better for inserting information in bulk into SQL Server. These utilities are far more flexible then the SQL Bulk Insert statements you may want to use.

Best Practice #6
Keep your stored procedures as small as possible.
The reasoning:
If two users are accessing the same stored procedure at the same time then two query plans will be stored in the cache. It is far better to have smaller stored procedures call other stored procedures then one very large stored procedure. This practice makes maintaining the code a bit easier as well.

Best Practice #7
Analyze all your query plans using the SQL Query Analyzer to make sure they are performing at optimum speed.
The reasoning:
Getting to know how to use the SQL Query Analyzer is one of the best things any serious developer can do to improve performance in an application. Using this tool you can see where are the bottlenecks in your code and thereby increase performance by altering indexes or even re-writing stored procedures.

Best Practice #8
Always code for multiple user scenarios
The Reasoning:
If you plan ahead and design your database with the proper data concurrency issues already solved then the future up keep of your database will be minimized. While the upfront costs associated with this might be higher the potential payback can be great as a application changes in its lifecycle.

Best Practice #9
Use User Defined Functions wisely and sparsely.
The reasoning:
It is far better to use stored procedures in your code then User Defined Functions. While using these functions may at times increase the performance of your database it is more likely that they will be converted into stored procedures in the future as the database matures.

Best Practice #10
Do not use Select * in your query design. Instead make sure you use the proper column names in the query.
The reasoning:
Using the proper column names decreases network traffic, takes less load on the database and hence can greatly improve performance.
Best Practice #11
Avoid the use of nullable columns.
The reasoning:
The use of the nullable column consumes an extra byte on each column used. Furthermore when querying data there is much more overhead with nullable columns. Try to use alternative methods when designing a database to allow for a representation of zero data in the column.

Best Practice #12
Analyze and avoid deadlocks at all costs.
The reasoning:
It is far better to access your data the same way each time you query your database. Doing otherwise will create a deadlock situation when one process takes control while another process is fighting for the same control over the objects. This can greatly tie up resources and can cause your program to crash if not taken into account. Always try to avoid a deadlock before one occurs. Use hints on your queries to make sure they are performing the way you want them to.

Best Practice #13
Create indexes on highly selective columns.
The reasoning:
It is far better to create an index on a highly selective column as this will increase the performance of your database design.

Best Practice #14
Avoid using cursors or use cursors very wisely.
The reasoning:
Cursors consume far too much database resources to be considered a viable option in most cases. There are other options available and proper design of the program will account for this. However when you need to use cursors, when there is no other alternative, then use them very wisely and make sure you research any issues with cursors before implementing them in production. Test the database under a realistic load scenario.

Best Practice #15
Always make sure your database is as normalized as possible.
The Reasoning:
This is database design 101 here folks. If you have an un-normalized database design make sure to normalize the database design to the 3rd normal form as this is considered to be the standard. The only excuse for a non normalized design is for performance reasons. However my argument with this should be that denormalizing a database schema should be considered a last resort.
Best Practice #16
Remember to SET_NOCOUNT_ON at the beginning of your SQL bataches, stored procedures, triggers, etc.
The reasoning:
Doing this will increase performance by reducing network traffic. Setting SET_NOCOUNT_ON suppresses the messages regarding how many rows are affected after executing INSERT,UPDATE, SELECT and DELETE statements.

Best Practice #17
Avoid the use of the TEXT and NTEXT datatypes in your database design.
The reasoning:
There are far too many issues associated with TEXT and NTEXT datatypes for them to be of any great use to you. Instead it is far better to use the varchar and char datatypes instead.

Best Practice #18
Do not store BLOBS in your database.
The reasoning:
A database was not designed to stored datafiles or images. Instead it is far better to store the location of these files inside the database and let the operating system handle the file I/O for you. This is far better way to store large data files in your database.

Best Practice #19
Always perform referential integrity checks and data validations using constraints such as the foreign key and check constraints.
The reasoning:
It is far better to use constraints as opposed to triggers when performing referential integrity checks. The use of triggers should only be used to perform custom data validation that can not be performed using constraints.

Best Practice #20
Make sure all your stored procedures return a value indicating their status.
The reasoning:
Make sure you standardize on the return types your stored procedures should return indicating either success or failure. Doing this will increase the maintainability of the code and make programming against it much easier. This is especially ture if everyone understands and follows the standards.

Best Practice #21
Make sure you start each clause of your SQL statement on a new line.
The reasoning:
This makes the SQL code much more readable. You should consider this especially if you are in a team environment or a visiting consultant.
Best Practice #22
It is always best to avoid the use of column numbers in the ORDER BY clause.
The Reasoning:
Using some column numbers does not increase the performance of your query by any significant amount and also this makes your code harder to read especially in large queries.

Thursday, May 18, 2006

Another Certificate !!

Hi,
Guys another certificate from BrainBench.
I like this site very much as it tells you about the strengths and weakness of a particular topic.
Here is my transcript if u wanna check...

http://www.brainbench.com/transcript.jsp?pid=2420160

Wednesday, May 10, 2006

Troubleshooting Orphaned Users

Although the terms login and user are often used interchangeably, they are very different.

  • A login is used for user authentication
  • A database user account is used for database access and permissions validation.

Logins are associated to users by the security identifier (SID). A login is required for access to the SQL Server server. The process of verifying that a particular login is valid is called "authentication". This login must be associated to a SQL Server database user. You use the user account to control activities performed in the database. If no user account exists in a database for a specific login, the user that is using that login cannot access the database even though the user may be able to connect to SQL Server. The single exception to this situation is when the database contains the "guest" user account. A login that does not have an associated user account is mapped to the guest user. Conversely, if a database user exists but there is no login associated, the user is not able to log into SQL Server server.
When a database is restored to a different server it contains a set of users and permissions but there may not be any corresponding logins or the logins may not be associated with the same users. This condition is known as having "orphaned users."

When you restore a database backup to another server, you may experience a problem with orphaned users. The following scenario illustrates the problem and shows how to resolve it.

Use mastersp_addlogin 'test', 'password', 'Northwind'

SELECT sid FROM dbo.sysxlogins WHERE name = 'test'

Grant access to the user you just created

Use Northwind

sp_grantdbaccess 'test'

SELECT sid FROM dbo.sysusers WHERE name = 'test'0xE5EFF2DB1688C246855B013148882E75
As you can see, both SID's are identical.


Backup the database


Use masterBACKUP DATABASE NorthwindTO DISK = 'C:\Northwind.bak'
Copy the Backupfile to another Maschine and SQL Server and restore it as follows:
RESTORE FILELISTONLYFROM DISK = 'C:\Users\Zahn\Work\Northwind.bak'NorthwindNorthwind_log
RESTORE DATABASE TestDBFROM DISK = 'C:\Users\Zahn\Work\Northwind.bak'WITH MOVE 'Northwind' TO 'D:\DataMSSQL\Data\northwnd.mdf', MOVE 'Northwind_log' TO 'D:\DataMSSQL\Data\northwnd.ldf'

The restored database contains a user named "test" without a corresponding login, which results in "test" being orphaned.
Check the SID's
Use masterSELECT sid FROM dbo.sysxlogins WHERE name = 'test'0x39EE98D37EAC2243B7833705EC1C60E3
Use TestDB

SELECT sid FROM dbo.sysusers WHERE name = 'test'0xE5EFF2DB1688C246855B013148882E75

Now, to detect orphaned users, run this code
Use TestDB

sp_change_users_login 'report'

test 0xE5EFF2DB1688C246855B013148882E75


The output lists all the logins, which have a mismatch between the entries in the sysusers system table, of the TestDB database, and the sysxlogins system table in the master database.

Resolve Orphaned Users
Use TestDB

sp_change_users_login 'update_one', 'test', 'test'


SELECT sid FROM dbo.sysusers WHERE name = 'test'0x39EE98D37EAC2243B7833705EC1C60E3

use master

SELECT sid FROM dbo.sysxlogins WHERE name = 'test'0x39EE98D37EAC2243B7833705EC1C60E3

This relinks the server login "test" with the the TestDB database user "test". The sp_change_users_login stored procedure can also perform an update of all orphaned users with the "auto_fix" parameter but this is not recommended because SQL Server attempts to match logins and users by name. For most cases this works; however, if the wrong login is associated with a user, a user may have incorrect permissions.

More Details can be found here

SQL Server2005 Training Free by Microsoft!!

Hi All,

For all those who wants to start in SQL Server 2005 can view microsoft link for SQL Server 2005 Training.

U can find it here:
http://www.microsoft.com/events/series/msdnsqlserver2005.mspx

Lots of webcast and training material is provided.

So visit and have fun .........

Learn and learn u will earn more..

Monday, May 08, 2006

My Interview Experience!!

Hi All ,
On last Saturday I went to take interview for SQL Server Programmers.

I found lots of people were not aware of basic questions of SQL Server.

Some knew SQL as such but not much in SQL Server......Some were mixing there experience of oracle with SQL Server.
Some were answering the questions like an essay etc.
Most of them were not able to clear the basic SQL questions.

Solution:
Try to find a good book and follow it.
Remember all u'r basics clearly.
Try to give short answers unless specified.
Good communication skill is always desired.
Be cool and calm at the place of interview.
Dont take any question granted and if dont know the answer dont try to make on your own.
Improve the quality as such not the quantity.

Some of the links below have questions commonly asked in the interviews:
http://vyaskn.tripod.com/iq.htm
http://www.allapplabs.com/interview_questions/
db_interview_questions.htm
http://www.techinterviews.com/?p=161
http://www.geekinterview.com/Interview-Questions/Oracle/SQL

Hope U will find good questions on SQL Server......U can mail me at .....
shekhar.saxena@programmer.net

Friday, May 05, 2006

Brainbench Certification !!!

Hey Guys ....
I cleared BB Certification for RDBMS just like that....






View my transcript here:
http://www.brainbench.com/transcript.jsp?pid=2420160

Wednesday, May 03, 2006

ANSI And TSQL way of incrementing row no.!!

Hi someone today asked me to add column with autoincrementing values each time row is inserted. So i gave them this solution.
In TSQL we can achive it by :

Create customers_TSQL ( cust_id int IDENTITY (1,1) Primary Key, cust_name varchar(20))

Insert into customers_TSQL values('John')

This will insert a customer named John with cust_id set to 1.Each time a new customer is added cust_id is incremented by 1.
As we can see in create statement we have used IDENTITY (1,1) which says the IDENTITY property is what makes the column increment by 1 when a new row is added. The (1,1) after IDENTITY indicates the seed and increment value can be set when the table is created. If you wanted to have an initial value of 10 and increment it by 5 each time a record is added, you would use (10,5). You can also omit the (seed, increment value) part of the IDENTITY property, in which case the values default to one and one.

While in ANSI what we can do is:

Create customers_ANSI ( cust_id int Primary Key, cust_name varchar(20))

Insert into Customers_ANSI
Select COALESCE(MAX(cust_ID)+1,1),'John' FROM Customers_ANSI

Every time the above statement is executed it will insert value 1 greater than the max value of cust_id.

See COALESCE() in online help for further info.

Tuesday, May 02, 2006

UseFull Tips for Stored Procedure !!

  • SET NOCOUNT ON should be the first statement in every stored procedure you create because it minimizes network traffic between SQL Server and client applications.

    Setting NOCOUNT on disables DONE_IN_PROC messages—the messages SQL Server normally sends to the client indicating the number of rows affected by a T-SQL statement.
    Because these messages are very rarely used, eliminating them conserves network bandwidth without really giving up any functionality and can speed up applications considerably. Note that you can disable DONE_IN_PROC messages for the entire server via a trace flag (3640) and for a particular user session via the sp_configure 'user options' command. (In rare circumstances, disabling DONE_IN_PROC messages can cause problems with some applications—for example, some older versions of Microsoft Access and certain ill-behaved OLEDB providers).
  • A procedure can contain any valid Transact-SQL command except these: CREATE DEFAULT, CREATE FUNCTION, CREATE PROC, CREATE RULE, CREATE SCHEMA, CREATE TRIGGER, CREATE VIEW, SET SHOWPLAN_TEXT, and SET SHOWPLAN_ALL. These commands must reside in their own command batches, and, therefore, can't be part of a stored procedure. Procedures can create databases, tables, and indexes, but not other procedures, defaults, functions, rules, schemas, triggers, or views.You can work around this limitation—the inability to construct most other kinds of objects from within a stored procedure—by constructing a T-SQL string and executing it via sp_executesql or the EXEC() function like :

SET @sql=N'create proc dbo.test2 as select ''1'''EXEC dbo.sp_executesql @sql

  • Although executing a stored procedure can be as easy as listing it on a line by itself in a T-SQL command batch, you should make a habit of prefixing all stored procedure calls with the EXEC keyword, like this: EXEC dbo.sp_who
  • You should also be sure to owner-qualify procedure calls ("dbo" in the previous example). Omitting the owner from a procedure call causes SQL Server to momentarily place a compile lock on the procedure because it cannot locate it immediately in the procedure cache. This lock is released once the procedure-sans-owner is located in the cache, but can still cause problems in high-throughput environments. Owner-qualifying objects is simply a good habit to get into. It's one of those things you can do to save yourself problems down the road.

Some of the Stored Procedure Advantages !!

Stored Procedures have a number of advantages over ad hoc queries, including :-

  • Execution plan retention and reuse
  • Query autoparameterization
  • Encapsulation of business rules and policies
  • Application modularization
  • Sharing of application logic between applications
  • Access to database objects that is both secure and uniform
  • Consistent, safe data modification
  • Network bandwidth conservation
  • Support for automatic execution at system start-up

Thursday, April 27, 2006

Understanding SQL Server 2000 Locking

Introduction
In this article, I want to tell you about SQL Server 2000 lock modes. SQL Server 2000 supports the following lock modes:

Shared (S)
Update (U)
Exclusive (X)
Intent
intent shared (IS)
intent exclusive (IX)
shared with intent exclusive (SIX)
intent update (IU)
update intent exclusive (UIX)
shared intent update (SIU)
Schema
schema modification (Sch-M)
schema stability (Sch-S)
Bulk Update (BU)
Key-Range
Shared Key-Range and Shared Resource lock (RangeS_S)
Shared Key-Range and Update Resource lock (RangeS_U)
Insert Key-Range and Null Resource lock (RangeI_N)
Exclusive Key-Range and Exclusive Resource lock (RangeX_X)
Conversion Locks (RangeI_S, RangeI_U, RangeI_X, RangeX_S, RangeX_U)
Shared locks
Shared (S) locks are used for operations that read data, such as a SELECT statement. During Shared (S) locks use, concurrent transactions can read (SELECT) a resource, but cannot modify the data while Shared (S) locks exist on the resource. If you do not use the HOLDLOCK locking hint and your transaction isolation level is not set to REPEATABLE READ or SERIALIZABLE, the Shared (S) locks on a resource are released as soon as the data has been read. If you use the HOLDLOCK locking hint or your transaction isolation level is set to REPEATABLE READ or SERIALIZABLE, the Shared (S) locks on a resource will be held until the end of the transaction.

By the way, when you select a database in the Enterprise Manager and then click Tables, the Shared (S) lock will be placed on this database, but you can insert/delete/update rows in the tables in this database.

Update locks
Update (U) locks are used when SQL Server intends to modify a row or page, and later promotes the update page lock to an exclusive lock before actually making the changes. The Update (U) locks are used to prevent a deadlock. For example, if two transactions intend to update the same row, each of these transactions will set the shared lock on this resource and then try to set the exclusive lock. Without Update (U) locks, each transaction will wait for the other transaction to release its shared-mode lock, and a deadlock will occur.

To prevent a potential deadlock, the first transaction that tries to update the row will set the Update (U) lock on this row. Because only one transaction can obtain an Update (U) lock to a resource at a time, the second transaction will wait until the first transaction converts the update lock to an exclusive lock and releases the locked resource.

Exclusive locks
Exclusive (X) locks are used for data modification operations, such as UPDATE, INSERT, or DELETE.

Other transactions cannot read or modify data locked with an Exclusive (X) lock. If a Shared (S) exists, other transactions cannot acquire an Exclusive (X) lock.

Intent locks
Intent locks are used when SQL Server wants to acquire a shared lock or exclusive lock on some of the resources lower down in the hierarchy.

Intent locks include:
intent shared (IS)
intent exclusive (IX)
shared with intent exclusive (SIX)
intent update (IU)
update intent exclusive (UIX)
shared intent update (SIU)
Intent shared (IS) locks are used to indicate the intention of a transaction to read some resources lower in the hierarchy by placing Shared (S) locks on those individual resources.

Intent exclusive (IX) locks are used to indicate the intention of a transaction to modify some resources lower in the hierarchy by placing Exclusive (X) locks on those individual resources.

Shared with intent exclusive (SIX) locks are used to indicate the intention of the transaction to read all of the resources lower in the hierarchy and modify some resources lower in the hierarchy by placing Intent exclusive (IX) locks on those individual resources.

Intent update (IU) locks are used to indicate the intention to place Update (U) locks on some subordinate resource in the lock hierarchy.

Update intent exclusive (UIX) locks are used to indicate an Update (U) lock hold on a resource with the intent of acquiring Exclusive (X) locks on subordinate resources in the lock hierarchy.

Shared intent update (SIU) locks are used to indicate shared access to a resource with the intent of acquiring Update (U) locks on subordinate resources in the lock hierarchy.

Schema locks
Schema locks are used when an operation dependent on the schema of a table is executing.

Schema locks include:
schema modification (Sch-M)
schema stability (Sch-S)
Schema modification (Sch-M) locks are used when a table data definition language (DDL) operation is being performed.

Schema stability (Sch-S) locks are used when compiling queries. This lock does not block any transactional locks, but when the Schema stability (Sch-S) lock is used, the DDL operations cannot be performed on the table.

Bulk Update locks
Bulk Update (BU) locks are used during bulk copying of data into a table when one of the following conditions exist:

TABLOCK hint is specified
table lock on bulk load table option is set using sp_tableoption
The bulk update table-level lock allows processes to bulk copy data concurrently into the same table while preventing other processes that are not bulk copying data from accessing the table.

Key-Range locks
Key-Range locks are used by SQL Server to prevent phantom insertions or deletions into a set of records accessed by a transaction. Key-Range locks are used on behalf of transactions operating at the serializable isolation level.

Shared Key-Range and Shared Resource (RangeS_S) locks are used to indicate a serializable range scan.

Shared Key-Range and Update Resource (RangeS_U) locks are used to indicate a serializable update scan.

Insert Key-Range and Null Resource (RangeI_N) locks are used to test ranges before inserting a new key into an index.

Exclusive Key-Range and Exclusive Resource (RangeX_X) locks are used when updating a key in a range.

There are also Key-Range conversion locks. Key-Range conversion locks include:

RangeI_S
RangeI_U
RangeI_X
RangeX_S
RangeX_U
Key-Range conversion locks are created when a Key-Range lock overlaps another lock.

RangeI_S locks are used when RangeI_N lock overlap Shared (S) lock.
RangeI_U locks are used when RangeI_N lock overlap Update (U) lock.
RangeI_X locks are used when RangeI_N lock overlap Exclusive (X) lock.
RangeX_S locks are used when RangeI_N lock overlap RangeS_S lock.
RangeX_U locks are used when RangeI_N lock overlap RangeS_U lock.

Key-Range conversion locks are rarely used and can be observed for a short period of time under complex circumstances.

Lock Modes Compatibility
Because IU, UIX and SIU are undocumented Intent locks and Key-Range conversion locks are rarely used and can be observed for a short period of time under complex circumstances, the Lock Modes Compatibility table does not contain these lock modes.

Transaction Log Guidelines

Every SQL Server database has at least two files associated with it: one data file that houses the actual data and one transaction log file. The transaction log is a fundamental component of a database management system. All changes to application data in the database are recorded serially in the transaction log. Using this information, the DBMS can track which transaction made which changes to SQL Server data.
The Basics of LoggingThe CREATE DATABASE statement is used to create a Microsoft SQL Server database. The LOG ON clause is used to identify the transaction log file for the database to be created. Once created, data is stored in the database file, and a record of the modifications to that data is recorded in the transaction log file.
As database modifications are made, the transaction log for the database will grow. Since most database changes are logged, you will need to actively monitor the size of the transaction log files because if the data is constantly changing, the log will be continuously growing.
At each checkpoint, Microsoft SQL Server will guarantee that all log records and all modified database pages are written safely to disk. Transaction log files are used by Microsoft SQL Server during database recovery operations to commit completed transactions and rollback uncompleted transactions. Information recorded on the transaction log includes:
the beginning time of each transaction
the actual changes made to the data and enough information to undo the modifications made during each transaction (accomplished using before and after images of the data)
the allocation and deallocation of database pages
the actual commit or rollback of each transaction
Using this data, Microsoft SQL Server can accomplish data integrity operations to ensure consistent data is maintained in the database. The transaction log is used when SQL Server is restarted, when transactions are rolled back, and to restore the database to the state prior to the transaction.
When SQL Server is restarted, each database goes through a recovery process. SQL Server checks to determine which transactions must be rolled forward. This occurs for transactions where it is unknown if all the modifications were actually written from the cache to disk. Because a checkpoint forces all modified pages to disk, it represents the point at which the startup recovery must start rolling transactions forward. Because all pages modified before the checkpoint are guaranteed to be on disk, there is no need to roll forward anything done before the checkpoint.
When a transaction is rolled back SQL Server copies before images to the database for every modification made since the BEGIN TRANSACTION.
During a recovery scenario you can use the transaction log to restore a database. This causes a roll forward of the transaction log. During a roll forward SQL Server will copy after images of each modification to the database. Using the logged data SQL Server ensures that each modification is applied in the same order that it originally occurred.
You can see where the transaction log is a useful item to have around in case of database errors, transaction errors, and to ensure data integrity.
Some Operations Are Not Always LoggedMicrosoft SQL Server avoids logging in certain situations to avoid “out of space” conditions caused by rapid growth in transaction log files.
For some large operations, such as CREATE INDEX, Microsoft SQL Server will not log every new page. Instead, SQL Server records enough information to determine that a CREATE INDEX happened, so that it can either be recreated for a roll forward, or removed during a roll back.
Additionally, if the select into/bulkcopy database option is set to TRUE, Microsoft SQL Server will not record the following operations in the transaction log: bulk load operations, SELECT INTO statements, and WRITETEXT and UPDATETEXT statements. Because these operations are not recorded in the transaction log, SQL Server can not use the restore operation on the transaction log to recover these operations. Since SQL Server has no knowledge of the operations occurring, it can not recover the data.
If one of these operations occurs when the select into/bulkcopy option is TRUE, be sure to backup the database so that the changes made be these operations are preserved if a subsequent restore is required.
Transaction Log BackupsTo ensure an efficient and effective backup and recovery strategy for your Microsoft SQL Server databases, you will need to implement a periodic transaction log backups. A transaction log backup is created using the BACKUP LOG command. A database can be restored to any point in time contained within the sequence of transaction logs you have backed up, up to the point of failure. If you do not backup your transaction logs before truncating them, you will only be able to restore your database to the last database backup you have created.
When Microsoft SQL Server finishes backing up the transaction log, it truncates the inactive portion of the transaction log. This frees up space on the transaction log. SQL Server can reuse this truncated space instead of causing the transaction log to continuously grow and consume more space. The active portion of the transaction log contains transactions that are still running and have not completed yet.
Microsoft SQL Server will try to take a checkpoint whenever the transaction log becomes 70 percent full, if a log full error occurs, and when SQL Server is shut down (unless using the NOWAIT option) it will take a checkpoint for each database.
The transaction log should not be backed up if the truncate log on checkpoint database option is set to TRUE. If you specify truncate log on checkpoint to be true, Microsoft SQL Server will clear out inactive transaction log entries at every checkpoint. This option essentially tells SQL Server that you will not be using the transaction log for restore operations. The transaction log is still required to roll back transactions and for SQL Server to determine how to recover databases when it is restarted. Use this option only for systems where it is okay for you to lose transactions during the course of a day, because you will only be able to restore your database back to the last backup that was taken. Applications of this nature are very rare in most production environments.
If a transaction log has been truncated (except by a BACKUP LOG) you should not backup that log until you take a database backup or differential database backup. A differential database backup will backup only data that has changed since the last full database backup.
You should also avoid backing up transaction logs any nonlogged operations have occurred in since the last database backup was created. Create a database or differential database backup instead.
And finally, if any files are added or deleted from the database, a transaction log backup should not be taken. Instead, you should create a full database backup.
Changing Database OptionsThe truncate log on checkpoint option can be changed at the database level. Use the system procedure named sp_dboption to change the configuration settings for a database. For example:
exec sp_dboption 'pubs', 'trunc. log on chkpt.', 'false'This will change the truncate log on checkpoint option to false for the pubs database. To see a list of all current database options set for a database, simply issue the system procedure without additional parameters, for example:
exec sp_dboption pubsYou can also use Enterprise Manager to set database options. When a database is first created, most of its options will be set to false. In the desktop edition of Microsoft SQL Server, however, the truncate log on checkpoint database option is set to true. This may not be a problem depending on the recovery requirements of the data stored in your desktop SQL Server databases.
You might also consider setting truncate log on checkpoint to TRUE during an application development cycle. Usually, it is not important to save every test transaction that is attempted while an application is in development.
ConclusionThis has been a brief introduction to using transaction logs with Microsoft SQL Server. The topic of database backup and recovery is complex and we have only touched the surface of how SQL Server implements backup and recovery.
The main lesson to digest from this article is the importance of the transaction log. Too many times new SQL Server databases are implemented with very small transaction logs coupled with using the truncate log on checkpoint option. This is a dangerous combination for mission-critical production applications because it can cause transactions to be lost during a hardware, software, or application failure. And every transaction is precious in most production systems. Make sure your SQL Server databases are protected by planning for and implementing a transaction log backup and recovery plan.

Backing Up and Restoring SQL Server Databases and Transaction Logs

General Concepts
The backing up and restoring of databases and transaction logs is a way that SQL Server provides to protect from data loss. The backup operation creates a copy of a database. This copy can be used to restore the database if media failure occurs or if the database is somehow damaged (for example, from user errors).

SQL Server 2000 supports the following kinds of backup:

Full database backup (it is a full copy of the database).
Transaction log backup (it contains a copy of the transaction log only).
Differential backup (it contains a copy of only the database pages modified since the last full database backup).
SQL Server 2000 supports three recovery models which determine how your data is backed up and what your exposure to data loss is. There are:

Simple recovery model
Full recovery model
Bulk-Logged recovery model

Simple Recovery modelThe Simple Recovery model allows the database to be recovered to the point of the last backup (to the most recent backup). With this recovery model, you cannot restore the database to the point of failure or to a specific point in time. Therefore, changes since the last backup will be lost.

The backup strategy for this recovery model consists of the full database backups only or full database backups and some differential backups. To recover a database, you should restore the most recent full database backup and then restore the most recent differential backup if it exists. The Simple Recovery model is similar to setting the trunc. log on chkpt database option in SQL Server 7.0 or earlier. This recovery model takes less time to perform the backup and restore in comparison with the Full and Bulk-Logged models and requires less disk space, but does not provide the opportunity to restore the database to the point of failure or to a specific point in time.

For example, if you make a full database backup of the pubs database on a Sunday at 1 AM, and make a differential backup of the pubs database on Monday at 1 AM and again on Tuesday at 1 AM, then suppose that the pubs database is damaged on Tuesday at 3 AM, you should restore the full database backup from Sunday 1 AM, and then restore the differential backup from Tuesday 1 AM. All changes since Tuesday 1 AM will be lost.

-- Create a full database backup of the pubs database on Sunday at 1AM
BACKUP DATABASE pubs TO pubs_back WITH INIT
GO
-- Time elapses.
-- Create a differential backup of the pubs database on Monday at 1AM
BACKUP DATABASE pubs TO pubs_back WITH DIFFERENTIAL
GO
-- Time elapses.
-- Create a differential backup of the pubs database on Tuesday at 1AM
BACKUP DATABASE pubs TO pubs_back WITH DIFFERENTIAL
GO

-- Time elapses.
-- The pubs database was damaged on Tuesday at 3 AM
-- Restore the full database from Sunday 1 AM without recovering
RESTORE DATABASE pubs FROM pubs_back WITH NORECOVERY
GO
-- Restore the differential backup from Tuesday 1 AM with recovering
RESTORE DATABASE pubs FROM pubs_back WITH FILE = 3, RECOVERY
GO

Full Recovery modelThe Full Recovery model allows the database to be recovered to the point of failure or to a specific point in time. This recovery model provides the best protection of your data, but requires more time to make the backup and restore. With the Full Recovery model, all operations, including bulk operations such as SELECT INTO, CREATE INDEX, and bulk loading data, are fully logged. The backup strategy for this recovery model consists of the full database backups (maybe with some differential backups) and transaction log backups. If the current transaction log file for the database is available and undamaged, you should perform the following steps to restore the database to the point of failure:

Back up the active transaction log. Restore the most recent full database backup without recovering the database.
Restore the most recent differential backups, if it exists.
Restore each transaction log backup created since the most recent full database backup (or the most recent differential backups, if differential backups exist) in the same sequence in which they were created without recovering the database.
Apply the transaction log backup created in Step 1, and recover the database.
For example, to perform the Full Recovery model for the pubs database at the following backup intervals:

Make the full database backup of the pubs database on Sunday at 1 AM.
Make the differential backups of the pubs database on Monday at 1 AM and on Tuesday at 1 AM.
Make the transaction log backups of the pubs database on Monday at 2 AM and on Tuesday at 2 AM.
So, if the primary data file of the pubs database was damaged on Tuesday at 3 AM, you should recover the pubs database by using the following steps:

Back up the active transaction log.
Restore the full database backup from Sunday 1 AM.
Restore the differential backup from Tuesday 1 AM.
Restore the transaction log from Tuesday 2 AM.
Apply the transaction log backup created in Step 1, and recover the database.
-- Create a full database backup of the pubs database on Sunday at 1AM
BACKUP DATABASE pubs TO pubs_back_db WITH INIT
GO
-- Time elapses.
-- Create a differential backup of the pubs database on Monday at 1AM
BACKUP DATABASE pubs TO pubs_back_db WITH DIFFERENTIAL
GO
-- Time elapses.
-- Create a transaction log backup of the pubs database
-- on Monday at 2AM
BACKUP LOG pubs TO pubs_back_log WITH INIT
GO
-- Time elapses.
-- Create a differential backup of the pubs database on Tuesday at 1AM
BACKUP DATABASE pubs TO pubs_back_db WITH DIFFERENTIAL
GO
-- Time elapses.
-- Create a transaction log backup of the pubs database
-- on Tuesday at 2AM
BACKUP LOG pubs TO pubs_back_log
GO

-- Time elapses.
-- The primary data file of the pubs database was damaged
-- on Tuesday at 3 AM
-- Back up the active transaction log.
BACKUP LOG pubs TO pubs_back_log
GO
-- Restore the full database from Sunday 1 AM without recovering
RESTORE DATABASE pubs FROM pubs_back_db WITH NORECOVERY
GO
-- Restore the differential backup from Tuesday 1AM without recovering
RESTORE DATABASE pubs FROM pubs_back_db WITH FILE = 3, NORECOVERY
GO
-- Restore the transaction log backup from Tuesday 2AM with recovering
RESTORE LOG pubs FROM pubs_back_log WITH FILE = 2, NORECOVERY
GO
-- Apply the last created log backup and recover the database
RESTORE LOG pubs FROM pubs_back_log WITH FILE = 3, RECOVERY
GO

Bulk-Logged Recovery modelThe Bulk-Logged Recovery model provides better performance in comparison with the Full Recovery model because the SELECT INTO, Bulk load operations, CREATE INDEX and WRITETEXT, UPDATETEXT operations are minimally logged. Full and Bulk-Logged Recovery models are similar, and you can switch between them easily. The Bulk-Logged Recovery model does not allow the database to be recovered to a specific point in time. Under this recovery model the transaction log backups capture both the log and the results of any bulk operations performed since the last backup, so it is not necessary to perform a full database backup after bulk copy operations complete.

Taken From Database Journal

Wednesday, April 26, 2006

Stored Procedures Part 1

Using Stored Procedures in SQL
Most of us, the database programmers, have used Stored Procedures. May be not all of us knows about why we use them. This article is for those who have used/never used stored procedures, and are yet to understand why everyone suggests using them in your Database.

Stored Procedures – What are they?

Stored procedure is a set of pre-defined Transact-SQL statements, used to perform a specific task. There can be multiple statements in a stored procedure, and all the multiple statements are clubbed in to one database object.

How to create a stored procedure?

Creating a stored procedure is as easy as running the “Create Procedure” statement followed by the SQL script. You can run your Create Procedure statement from the SQL Query Analyzer, or can use the New Procedure menu item in the Enterprise Manager.

The simplest skeleton of a stored procedure.

CREATE PROC procedure_name
[ { @parameter data_type }
]
AS sql_statement

Check the basic building blocks of a stored procedure.

A stored procedure includes

1. A CREATE PROC (CREATE PROCEDURE) statement;

2. The procedure name;

3. The parameter list

4. And the SQL statements.
Even though there are numerous other options available while we define a stored procedure, I kept it simple, just to give you a basic idea about creating stored procedures.

Advantages!

Almost every database Guru that you will meet, will suggest using stored procedures. For you, it will seem as if most of them blindly believes in stored procedures. But there are reasons for this. This is what I am trying to explore in this article.

1. Performance

All the SQL statements, that you send to your database server passes through a series of actions, called execution. These are the steps that your SQL statement passes through before the data is returned to the client.

User sends request to execute the Stored Procedure. SQL Server checks for syntax errors. Identifies and checks the aliases in the FROM clause. Creates a query plan. Compiles the query and. Executes the query plan and return the requested data.
See, lots of things are happening inside that we didn’t knew about. Now, the crucial question. Does a stored procedure bypass all these?

In a way, yes. The previous versions of SQL Server stored the compiled execution plan in system tables, making them partially pre-compiled. This improved performance, because the Server did not have to compile the stored procedure each and every time it is called.

In later versions of SQL Server, there were a large number of changes in statement processing. Now, the stored procedure is stored in a procedure cache when it is called, making subsequent calls faster.

2. Security

Stored procedures provide significant benefits when it comes to security. By using a stored procedure, you can grant permissions to certain users to access data, reducing the immense coding that you need to do in your client applications. This is one of the best ways to control access to your data.

3. Modifications/Maintenance

If you use stored procedures for database access, any change in the database can be reflected on to the client application without much effort. This is because you know exactly where the data is accessed from, and you also know exactly where you need to alter. This means no scuba diving in to thousands of lines of source code to identify areas where you need to alter and no headache of re-deploying the client application.

4. Minimal processing at the client.

When creating a client/server application, normally it was the client who took care of the integrity of data that went in to the database. Managing Primary Keys, Foreign keys, cascaded deletion everything was done by the client, and the database server just had to store data given by the client.

Well friends, things have changed. Stored procedures help you write batch of SQL statements, which helps you manage the transactions, constraints etc. A little data aware code has to be written in to the client application, making it a thin-client application. These applications will be concerned more about displaying data in the way the user needs them and they know little about the database.

Take another scenario. You have a database with millions of rows and hundreds of tables. You need to do some calculations before updating each and every record. If you are fetching the complete data to the client, and is asking the client machine to process the data completely, then think about the overhead it creates. But when the client can execute a store procedure, where you have done the calculations prior to updating the records, you have a client, that doesn’t need to know about the calculations. This also reduces the amount of computing happening in the client, and the server takes care of tedious calculations.

5. Network traffic

Client applications always have to request/send data from the database server. These data are sent as packets, and travel through the network to the server.

To explain how stored procedures can help reduce network traffic, let us see another scenario, where a request for data is send from the client. The request is sent as an SQL statement, and here it is.

SELECT dbo.Tbl_Tablename.fieldID,
dbo.Tbl_Tablename.fieldName,
dbo.Tbl_Tablename.Title,
dbo.TBl_otherTableName.fieldID,
dbo.Tbl_Tablename.Published,
dbo.Tbl_Tablename.Updated,
dbo.Tbl_Tablename.SomeText,
dbo.Tbl_Tablename.TransactionDate,
dbo.Tbl_Tablename.Approved,
dbo.Tbl_Tablename.ApprovedBy,
dbo.Tbl_Tablename.ApprovalID
FROM
dbo.Tbl_Tablename
LEFT OUTER JOIN
dbo.TBl_otherTableName on dbo.Tbl_Tablename.fieldID=dbo.TBl_otherTableName.ID
Where
DateDiff ( wk, dbo.Tbl_Tablename.TransactionDate, getdate()) <= 1
and dbo.Tbl_Tablename.Approved = 0

518 Characters travel through the network, and when there are 20 client applications using this stored procedure 20 times a day, the number of characters passing through the network for just this request will be 2,07,200!

You see the difference now. If it was a stored procedure, lets call it SP_fetchSomething, there are only 6800 characters in the network for the request. A saving of 2,004,00!

As you have seen the five major points that I use to explain why I used a stored procedure, I hope you will also elect to intelligently use this awesome technology in your next database design.

Tuesday, April 25, 2006

Intro to User Defined Functions(Nice One)

Introduction
The ability to create a user-defined function (UDF) is a new feature added to SQL Server 2000. Developers have been asking Microsoft to add this feature for several versions of the product, so let's take a quick look at how to create a few simple UDFs to see if they can help you in your programming endeavors.

Creating a Simple UDF
A user-defined function is a database object that encapsulates one or more Transact-SQL statements for reuse. This definition is similar to the one for stored procedures, but there are many important differences between user-defined functions and stored procedures?the most pronounced being what types of data they can return. Let?s create one so you can see how easy they are to create and reference.

The following statement shows how to create a function that accepts two input parameters, sums them together and then returns the sum to the calling statement.

CREATE FUNCTION fx_SumTwoValues
( @Val1 int, @Val2 int )
RETURNS int
AS
BEGIN
RETURN (@Val1+@Val2)
END
The structure of the CREATE FUNCTION statement is fairly straightforward. You provide an object name (fx_SumTwoValues), input parameters (@Val1 and @Val2), the type of data the function will return () and the statement(s) the function executes are located between the BEGIN?END block. The following SELECT statement calls the function. Note that the two-part name (owner.object_name) is required when calling this function.

SELECT dbo.fx_SumTwoValues(1,2) AS SumOfTwoValues

SumOfTwoValues
--------------
3
When the SELECT is executed, the input parameters 1 and 2 are added together and the sum 3 is returned. You can use any values that either are, or can be, implicitly converted to an int data type for the input parameters. Keep in mind, though, that only an int can be returned, so the following statement will not produce the desired results.

SELECT dbo.fx_SumTwoValues(1.98,2.78) AS SumOfTwoValues

SumOfTwoValues
--------------
3
The function returns a 3, which indicates the decimal portion of the parameters are truncated before the calculation occurs.

SQL Server?s ability to implicitly convert data allows the following to execute successfully.

SELECT dbo.fx_SumTwoValues('7','7') AS SumOfTwoValues

SumOfTwoValues
--------------
14
When values that cannot be converted to an int are passed to the function, the following error message is generated.

SELECT dbo.fx_SumTwoValues('Y','7') AS SumOfTwoValues

Server:Msg 245,Level 16,State 1,Line 1
Syntax error converting the varchar value 'Y'to a column of data type int.
Three Types of User-Defined Functions
Now that you have seen how easy it is to create and implement a simple function, let?s cover the three different types of user-defined functions and some of the nuances of how they are implemented.

Scalar Functions
A scalar function returns a single value of the data type referenced in the RETURNS clause of the CREATE FUNCTION statement. The returned data can be of any type except text, ntext, image, cursor, or timestamp.

The example we covered in the previous section is a scalar function. Although the previous example only contained one statement in the BEGIN?END block, a scalar function can contain an unlimited number of statements as long as only one value is returned. The following example uses a WHILE construct to demonstrate this.

CREATE FUNCTION fx_SumTwoValues2
( @Val1 int, @Val2 int )
RETURNS int
AS
BEGIN
WHILE @Val1 <100
BEGIN
SET @Val1 =@Val1 +1
END
RETURN (@Val1+@Val2)
END
go

SELECT dbo.fx_SumTwoValues2(1,7) AS SumOfTwoValues

SumOfTwoValues
--------------
107
The @Val1 input parameter is set to 1 when the function is called, but the WHILE increments the parameter to 100 before the RETURN statement is executed. Note that the two-part name (owner.object_name) is used to call the function. Scalar functions require that their two-part names be used when they are called. As you will see in the next two sections, this is not the case with the other two types of functions.

Inline Table-Valued Functions
An inline table-valued function returns a variable of data type table whose value is derived from a single SELECT statement. Since the return value is derived from the SELECT statement, there is no BEGIN/END block needed in the CREATE FUNCTION statement. There is also no need to specify the table variable name (or column definitions for the table variable) because the structure of the returned value is generated from the columns that compose the SELECT statement. Because the results are a function of the columns referenced in the SELECT, no duplicate column names are allowed and all derived columns must have an associated alias.

The following uses the Customer table in the Northwind database to show how an inline table-valued function is implemented.

USE Northwind
go
CREATE FUNCTION fx_Customers_ByCity
( @City nvarchar(15) )
RETURNS table
AS
RETURN (
SELECT CompanyName
FROM Customers
WHERE City =@City
)
go
SELECT * FROM fx_Customers_ByCity('London')

CompanyName
----------------------------------------
Around the Horn
. . .
Seven Seas Imports
Multi-Statement Table-Valued Functions
The multi-statement table-valued function is slightly more complicated than the other two types of functions because it uses multiple statements to build the table that is returned to the calling statement. Unlike the inline table-valued function, a table variable must be explicitly declared and defined. The following example shows how to implement a multi-statement table-valued function that populates and returns a table variable.

USE Northwind
go
CREATE FUNCTION fx_OrdersByDateRangeAndCount
( @OrderDateStart smalldatetime,
@OrderDateEnd smalldatetime,
@OrderCount smallint )
RETURNS @OrdersByDateRange TABLE
( CustomerID nchar(5),
CompanyName nvarchar(40),
OrderCount smallint,
Ranking char(1) )
AS
BEGIN
--Statement 1
INSERT @OrdersByDateRange
SELECT a.CustomerID,
a.CompanyName,
COUNT(a.CustomerID)AS OrderCount,
'B'
FROM Customers a
JOIN Orders b ON a.CustomerID =b.CustomerID
WHERE OrderDate BETWEEN @OrderDateStart AND @OrderDateEnd
GROUP BY a.CustomerID,a.CompanyName
HAVING COUNT(a.CustomerID)>@OrderCount

--Statement 2
UPDATE @OrdersByDateRange
SET Ranking ='A'
WHERE CustomerID IN (SELECT TOP 5 WITH TIES CustomerID
FROM (SELECT a.CustomerID,
COUNT(a.CustomerID)AS OrderTotal
FROM Customers a
JOIN Orders b ON a.CustomerID =b.CustomerID
GROUP BY a.CustomerID) AS DerivedTable
ORDER BY OrderTotal DESC)
RETURN
END
The main difference between this example and the one in the previous section is that we were required to specify the structure of the @OrdersByDateRange table variable used to hold the resultset and list @OrdersByDateRange in the RETURNS clause. As you can see from the input parameter list, the function accepts a start date, an end date and an order count value to filter the resultset.

The first statement (--Statement 1) uses the input parameters to populate the table variable with customers who meet the specified criteria. The second statement (-Statement 2) updates the rows in table variable to identify the top five overall order placers. The IN portion of the UPDATE may seem a little confusing at first glance, but all its doing is using a derived table to select the CustomerID values of the top five order producers. Derived tables are discussed in Chapter 4. You can use the following to find the companies who have submitted more than two orders between 1/1/96 and 1/1/97.

SELECT *
FROM fx_OrdersByDateRangeAndCount ('1/1/96','1/1/97',2)
ORDER By Ranking

CustomerID CompanyName OrderCount Ranking
---------- ------------------------------ ---------- -------
ERNSH Ernst Handel 6 A
FOLKO Folk och fä ˆB 3 A
HUNGO Hungry Owl All-Night Grocers 5 A
QUICK QUICK-Stop 6 A
SAVEA Save-a-lot Markets 3 A
SEVES Seven Seas Imports 3 B
SPLIR Split Rail Beer &Ale 5 B
...
The rows ranking values of ?A? indicate the top five order placers of all companies. The function allows you to perform two operations with one object. Retrieve the companies who have placed more than two orders between 1/1/96 and 1/1/97 and let me know if any of these companies are my top five order producers.

One of the advantages of using this type of function over a view is that the body of the function can contain multiple SQL statements to populate the table variable, whereas a view is composed of only one statement. The advantage of using multi-statement table-valued function versus a stored procedure is that the function can be referenced in the FROM clause of a SELECT statement while a stored procedure cannot. Had a stored procedure been used to return the same data, the resultset could only be accessed with the EXECUTE command.

A Real-World Example
Now that you have an idea of the different types of functions available in SQL Server 2000, let's wrap up this article with an example you might be able to use on one of your projects. The following statements create a function that determines the last day of the month (LDOM) for a given date parameter.

CREATE FUNCTION fx_LDOM
( @Date varchar(20) )
RETURNS datetime
AS
BEGIN
--ensure valid date
IF ISDATE(@Date) = 1
BEGIN
--determine first day of month
SET @Date = DATEADD(day,-DAY(@Date)+1,@Date)
--determine last day of month
SET @Date = DATEADD(day,-1,DATEADD(month,1,@Date))
END
ELSE
SET @Date = '1/1/80'

RETURN @Date
END
The function's parameter (@Date) is defined as varchar(20), so error-checking code can be implemented. The ISDATE function makes sure the supplied value is a valid date. When an invalid date value is supplied, the function returns '1/1/80' to the calling statement. If you do not use this type of error-checking, the call to the function will fail when an invalid date is supplied.

When a valid date value is supplied, the DATEADD function is used to:

Determine the first day of the month, and
Determine the last day of the month.
If you have never used DATEADD before this may seem a little confusing, but a quick explanation should eliminate any that might exist. You use DATEADD to add or substract a date/time unit from a given date. The first parameter (in this case 'day') indicates the portion of the date that should be incremented. You can also specify year, quarter, month...millisecond. The second parameter is the number of units to add or substract. When subtracting, you simply make the value negative as shown in the example. The third parameter is the date value on which the calculation is performed.

The first day of the month is determined by calculating the number of elapsed days in the supplied parameter with the DAY function, adding 1 and then substracting it from the parameter. For an @Date value of 1/15/01, it simply subtracts 14 (15-1) days to get 1/1/01.

The last day of the month is determined by adding 1 month to the current month value and subtracting one day. So, 1/1/01 plus 1 month is equal to 2/1/01 and when you substract one day you get: 1/31/01.

The following shows how fx_LDOM is used in a SELECT statement to calculate the number of days remaining in a month.

CREATE TABLE fx_Testing (DateValue datetime)
go
INSERT fx_Testing VALUES ('1/1/01')
INSERT fx_Testing VALUES ('2/15/01')
INSERT fx_Testing VALUES ('2/15/02')
INSERT fx_Testing VALUES ('2/15/03')
INSERT fx_Testing VALUES ('2/15/04')

SELECT DateValue,
dbo.fx_LDOM(DateValue) AS LDOM,
DATEDIFF(day,DateValue,dbo.fx_LDOM(DateValue)) AS DaysLeftInMonth
FROM fx_Testing

DateValue LDOM DaysLeftInMonth
------------------------ ----------------------- ---------------
2001-01-01 00:00:00.000 2001-01-31 00:00:00.000 30
2001-02-15 00:00:00.000 2001-02-28 00:00:00.000 13
2002-02-15 00:00:00.000 2002-02-28 00:00:00.000 13
2003-02-15 00:00:00.000 2003-02-28 00:00:00.000 13
2004-02-15 00:00:00.000 2004-02-29 00:00:00.000 14
The DATEDIFF function is used to determine the difference between two dates. In this case, the number of days between the value in the DateValue column and the last day of the month calculated by fx_LDOM.


Taken from SQLTeam

UDF OR SPROC ?

Wednesday, April 19, 2006

How to replace substring in given string!!

Do it like this:-
THIS is to show how a string can be replaced by new text in the column.

UPDATE TABLE_NAME
SET COLUMN_NAME = REPLACE(COLUMN_NAME,'OLDTEXT','NEWTEXT')

Find Nth highest Salary

Its done like:-

SELECT MIN(SALARY) FROM EMPLOYEE
WHERE SALARY IN (SELECT DISTINCT TOP N SALARY FROM EMPLOYEE ORDER BY SALARY DESC)
---------------------------

Using SQL Server's CHARINDEX and PATINDEX

If you have written many applications then you probably have run across situations where you need to identify if a specific character or set of characters appears in a string. In this article I will discuss using the CHARINDEX and PATINDEX functions to search text columns and character strings. I will show you how each of these functions operate, and explain the differences between them. Also provided are some examples on how you might consider using these functions to solve a number of different character search situations.

The CHARINDEX and PATINDEX functions are used to search a character string for a character or set of characters. If the character string being searched contains the characters being searched for, then these functions return a non-zero integer value. This integer value is the starting location of where the character string being searched for is located within the string being searched. The PATINDEX function allows for using wildcard syntax in the pattern being searched for, where as the CHARINDEX function does not support wildcard character searches. Let's look at each one of these functions in a little more detail.

How to use the CHARINDEX Function

The CHARINDEX function returns the starting position of a character, or a string of characters within another character string. The CHARINDEX function is called using the following format:

CHARINDEX ( expression1 , expression2 [ , start_location ] )

Where expression1 is the string of characters to be found in expression2, and start_location is the position where the CHARINDEX function will start looking for expression1 in expression2.

The CHARINDEX function returns an integer. The integer value returned is the position where the characters being search for are located within the string being searched. If the CHARINDEX does not find the characters you are searching for then the function returns a zero integer value. Let say we execute the following CHARINDEX function call:

CHARINDEX('SQL', 'Microsoft SQL Server')
This function call will return the starting location of the character string "SQL", in the string "Microsoft SQL Server". In this case the CHARINDEX function will return the number 11, which as you can see is the starting position of "S" in string "Microsoft SQL Server".

Now say we have the following CHARINDEX Command:

CHARINDEX('7.0', 'Microsoft SQL Server 2000')
In this example the CHARINDEX function will return zero, since the character string "7.0" cannot be found in the string "Microsoft SQL Server 2000". Let go through a couple of examples of how you might be able to use the CHARINDEX function to solve some actual T-SQL problems.

For the first example say you would like to display only the last name of the ContactName column, for the first 5 records in the Northwind database Customer table. Here are the first 5 records.

ContactName
------------------------------
Maria Anders
Ana Trujillo
Antonio Moreno
Thomas Hardy
Christina Berglund

As you can see, the CustomerName contains both the first and last name of the customer, where first and last name are separated by a single space. I will use the CHARINDEX function to identify the position of the space between the two names. This way we can used the position of the space to parse the ContactName so we can display only the last name portion of the column. Here is some T-SQL code to display only the last name for the first 5 records in the Northwind Customer table.

select top 5 substring(ContactName,
charindex(' ',ContactName)+1 ,
len(ContactName)) as [Last Name]
from Northwind.dbo.customers

Here is the output from this command:

Last Name
------------------------------
Anders
Trujillo
Moreno
Hardy
Berglund

The CHARINDEX function found the space between the First and Last Name, so that the substring function could split the ContactName, thus only the Last Name was displayed. I added 1 to the integer value that CHARINDEX returned, so the Last Name displayed did not start with a space.

For the second example, say you want to count all the records from a table where a given column contains a particular character string. The CHARINDEX function could be used to satisfy your request. To count all of the Addresses in the Northwind.dbo.Customer table where the Address column contains either the word 'Road' or an abbreviation for road ('Rd'), your SELECT statement would look like this:

select count(*) from Northwind.dbo.Customers
where CHARINDEX('Rd',Address) > 0 or CHARINDEX('Road',Address)
> 1
How Does the PATINDEX Function Work?

The PATINDEX function returns the starting position of a character or string of characters within another string, or expression. As stated earlier the PATINDEX has additional functionality over CHARINDEX. PATINDEX supports wildcard characters in the search pattern string. This makes PATINDEX valuable for searching for varying string patterns. The PATINDEX command takes the following form:

PATINDEX ( '%pattern%' , expression )

Where "pattern" is the character string you are searching for and expression is the string in which you are searching. Commonly the expression is a column in a table. The "%" sign is needed on the front and back of the pattern, unless you are searching for the pattern at the beginning and/or ending of the expression.

Like the CHARINDEX function, the PATINDEX function returns the starting position of the pattern within the string being searched. If you have a PATINDEX function call like so:

PATINDEX('%BC%','ABCD')

Then the result of the PATINDEX function call is 2, which is the same as the CHARINDEX function. The %'s in the above command tells the PATINDEX command to find the position of the "BC" string where the string might have zero or more characters in front of, or after "BC". The % sign is a wildcard character.

If you want to determine if a string starts with a specific set of characters you would leave off the first % sign, and your PATINDEX call would look like this:

PATINDEX('AB%','ABCD')

In this case the PATINDEX function returns a 1, indicating that the pattern 'AB' was found in the expression 'ABCD'.

Now with wildcard characters you can create a much more complicated pattern matching situation then the simple ones I have shown you so far. Say you want to determine if a character string contains the letters A and Z, as well as any numeric number. Then your PATINDEX function call might look like this.

PATINDEX('%[A,Z,0-9]%[A,Z,0-9]%[A,Z,0-9]%','XYZABC123')

Note that the pattern I am looking for in the above example uses a number of wildcard references. Review SQL Server Books Online for other wildcard syntax information. Let's go through a couple of examples in how we might use the PATINDEX command in conjunction with a SELECT statement.

Say you want to find all of the records that contain the words "Bread", or "bread" in the Description text column in the Northwind.dbo.Categories table, then your select statement would look like this:

select Description from Northwind.dbo.Categories
where patindex('%[b,B]read%',description) > 0
Here I used wildcarding to look for either a lower or upper case "b". When I run this SELECT statement against my Northwind database I get the following Description columns displayed:

Description
--------------------------------------------------------
Desserts, candies, and sweet breads
Breads, crackers, pasta, and cereal

Here is another example where I used some additional wildcard references to find some records. This example excludes the Description that has an 'e' as the second letter from the result set in the above example.

select Description from Northwind.dbo.Categories
where patindex('%[b,B]read%',description) > 0
and patindex('_[^e]%',description) = 1

By adding an additional PATINDEX function call, to the where statement, that used the ^ wildcard symbol, I was able to exclude the "Dessert, candies, and sweet breads" description. The above example returned the following single description.

Description
--------------------------------------------------------
Breads, crackers, pasta, and cereal

Conclusion
As you can see the CHARINDEX and the PATINDEX perform similar kinds of pattern searches within a character string. The PATINDEX function provides wildcard specifications, allowing it to be used for much different kinds of pattern matches, while the CHARINDEX function does not. Depending on what you need to do, these two functions are great in helping you search, manipulate and parse character strings in SQL Server.

Practical Uses of PatIndex() ... Or Why CharIndex() is not enough

PatIndex vs. CharIndex
One question that I am frequently asked by developers new to T-SQL is whether they should use CharIndex() or PatIndex() and what the difference between the two is. SQL Server Books Online explains that the difference between the two is that PatIndex can use wildcard characters. This is often unclear to many developers because they associate the term wildcard with the percent sign (%) only. What Books Online does not make clear is that PatIndex() can make use of the full spectrum of wildcard characters. This gives it power well beyond that of CharIndex().

PatIndex is Like Like
To really see the full capability of PatIndex(), you must take a look at the Like command in SQL Server Books Online. Books Online describes the available wildcards characters as follows:

Wildcard character Description Example
% Any string of zero or more characters. WHERE title LIKE '%computer%' finds all book titles with the word 'computer' anywhere in the book title.
_ (underscore) Any single character. WHERE au_fname LIKE '_ean' finds all four-letter first names that end with ean (Dean, Sean, and so on).
[ ] Any single character within the specified range ([a-f]) or set ([abcdef]). WHERE au_lname LIKE '[C-P]arsen' finds author last names ending with arsen and beginning with any single character between C and P, for example Carsen, Larsen, Karsen, and so on.
[^] Any single character not within the specified range ([^a-f]) or set ([^abcdef]). WHERE au_lname LIKE 'de[^l]%' all author last names beginning with de and where the following letter is not l.


If we adapt this table for PatIndex(), it would look something like this:

Wildcard character Description Example
% Any string of zero or more characters. WHERE PatIndex('%computer%', title) > 0 finds all book titles with the word 'computer' anywhere in the book title.
_ (underscore) Any single character. WHERE PatIndex('_ean', au_fname) > 0 finds all four-letter first names that end with ean (Dean, Sean, and so on).
[ ] Any single character within the specified range ([a-f]) or set ([abcdef]). WHERE PatIndex('[C-P]arsen', au_lname) > 0 finds author last names ending with arsen and beginning with any single character between C and P (Carsen, Larsen, Karsen, and so on).
[^] Any single character not within the specified range ([^a-f]) or set ([^abcdef]). WHERE PatIndex('de[^l]%', au_lname) > 0 finds all author last names beginning with de and where the following letter is not l (finds Derry but not Delton).


Let's Get Practical
Let's take a real world example. If we have a varchar field that may contain non-numeric data and we want to Select the records and include the data in this field if it contains numbers only and nothing else, we can use PatIndex().

Background:
As the DBA for my company, my team supports all database needs for our company's applications including a web application. A key data field within this application is the customer's internal company ID for their employees. Since this ID is defined by the client, we must allow the use of alpha-numeric codes. A separate customer driven project calls for us to export their roster data to send to a third party. One of the requirements of the project is that if an employee's company ID contains data other than a number, we should assume that the value is incorrect and not include it.

Solution:
I decided to use PatIndex() in a Case statement with the not within a range wildcards. The range in this case would be 0 to 9 ( [0-9] ), and I would express not within this range by using the expression [^0-9]. My resulting Case statement looks like:

CompanyID = Case When PatIndex('%[^0-9]%', IsNull(CompanyID, '*')) > 0 Then Null Else CompanyID End

Going a little further:
If I want to go a little further with this concept, I could write a User-Defined Function ( named fnIsInt()) that accepts a varchar value and returns a 0 or a 1 indicating if the value could be converted to an integer data type as is without any other string manipulation. This function would work much like the built-in IsNumeric() function. The difference between the two functions is that IsNumeric() allows additional characters such as the decimal point, currency symbols, commas.

Since this article is focusing on practical uses of PatIndex, I will only be checking to see if the string value contains digits only with the exception of allowing it to begin with a negative sign. I will not include checking to see if the value is within the allowable range of the Int data type.



Create Function dbo.fnIsInt(
@value varchar(11))
Returns int
As
Begin
Declare @IsInt int
Set @IsInt = 0

If PatIndex('%[^0-9]%', @value) > 0 Or PatIndex('-%[^0-9]%', @value) > 0
Begin
Set @IsInt = 0
End
Else
Begin
Set @IsInt = 1
End

Return @IsInt
End


Testing the function:
Select IsNumeric('4'), PatIndex('%[^0-9]%', '4'), dbo.fnIsInt('4')



----------- ----------- -----------
1 0 1

Select IsNumeric('4.2'), PatIndex('%[^0-9]%', '4.2'), dbo.fnIsInt('4.2')


----------- ----------- -----------
1 2 0

Select IsNumeric('4-2'), PatIndex('%[^0-9]%', '4-2'), dbo.fnIsInt('4-2')


----------- ----------- -----------
0 2 0

Select IsNumeric('-4'), PatIndex('%[^0-9]%', '-4'), dbo.fnIsInt('-4')


----------- ----------- -----------
1 1 0

Select IsNumeric('4,2'), PatIndex('%[^0-9]%', '4,2'), dbo.fnIsInt('4,2')

----------- ----------- -----------
1 2 0

Select IsNumeric('$42'), PatIndex('%[^0-9]%', '$42'), dbo.fnIsInt('$42')

----------- ----------- -----------
1 1 0

The Big Deal?
Okay, I admit that I could have done the above Case statement just as easily using Like commands. Since I did not need to know the position of the offending characters, PatIndex() was more than was needed for that situation. If the project had required that I remove the non-digit characters rather than simply returning Null, PatIndex() would have been the perfect solution.

I can easily leverage the full power of PatIndex() by creating a User-Defined Function that accepts a varchar value and returns a varchar value with all non-digit characters removed.


Create Function dbo.fnDigitsOnly(
@value varchar(50))
Returns varchar(50)
As
Begin
If PatIndex('%[^0-9]%', @value) > 0
Begin
While PatIndex('%[^0-9]%', @value) > 0
Begin
Set @value = Stuff(@value, PatIndex('%[^0-9]%', @value), 1, '')
End
End

Return @value
End


The Big Finish
So, what's the difference between CharIndex() and PatIndex()? Well, like Books Online says, PatIndex() can use wildcards. To put it as simple as I can, PatIndex() combines the capabilities of the CharIndex() function and the Like command. 99.99% of the time, CharIndex() and PatIndex() are used interchangeably. That 0.01% of the time that you need something more, you'll be glad to have PatIndex() in your T-SQL toolbox.

Attach Database or Detach Database with SP

Introduction
If you need to move a database, or database file to another server or disk, and you do not want to backup the database, copy the backup on another server, and then re-create database from the backup, you can use the sp_detach_db and sp_attach_db system stored procedures to detach database and then attach it again. Detaching and attaching databases can be used when you want to move the database files to a different physical disk from the disk that run out of disk space and you want to expand the existing file rather than add a new file to the database on another disk.

To move a database using detach and attach, you should make the following steps:
• Detach the database.
• Move the database file(s) to the desired location on another server or disk.
• Attach the database specifying the new location of the moved file(s).

After detaching, the database will be removed from SQL Server but will be intact within the data and transaction log files that compose the database. You can use these data and transaction log files to attach the database to any instance of SQL Server, including the server from which the database was detached. After attaching, the database will be available in exactly the same state it was in when it was detached.

Detaching a DatabaseTo detach an SQL Server 2000 database, you can use the sp_detach_db system stored procedure. This stored procedure can also run UPDATE STATISTICS on all tables before detaching. The syntax for sp_detach_db system stored procedure is as follows:

sp_detach_db [ @dbname = ] 'dbname'
[ , [ @skipchecks = ] 'skipchecks' ]

In the above, the parameters to the stored procedure are as follows:
• [@dbname =] 'dbname' is the database name. 'dbname'is nvarchar(128), a default value is NULL.
• [@skipchecks =] 'skipchecks' The 'skipchecks' parameter indicates will be can UPDATE STATISTICS run or skipped. The 'skipchecks' is nvarchar(10), a default value is NULL. If 'skipchecks' is true, UPDATE STATISTICS is skipped. If 'skipchecks' is false, UPDATE STATISTICS is run.

The following example detaches the pubs database and run UPDATE STATISTICS on all tables before detaching:

EXEC sp_detach_db 'pubs', 'false'

Attaching a DatabaseWhen you attach a database, you must specify at least the name and physical location of the primary data file. If one or more of database files have changed location since the database was detached, you must specify the name and physical location of these files in addition to the primary file.

To attach SQL Server 2000 database, you can use the sp_attach_db system stored procedure. The syntax for sp_attach_db system stored procedure is as follows:

sp_attach_db [ @dbname = ] 'dbname',
[ @filename1 = ] 'filename_n' [ ,...16 ]

In the above command:
• [@dbname =] 'dbname'is the database name. dbname is nvarchar(128), a default value is NULL.
• [@filename1 =] 'filename_n'Is the database file name. filename_n is nvarchar(260), a default value is NULL. There can be up to 16 file names specified.

This is the example to attach the pubs database which contain two files pubs.mdf and pubs_log.ldf from the C:\MSSQL\Data directory:
EXEC sp_attach_db @dbname = 'pubs',
@filename1 = 'C:\MSSQL\Data\pubs.mdf',
@filename2 = 'C:\MSSQL\Data\pubs_log.ldf'

Attaching a Single-File DatabaseA single-file database is a database that have only one data file. When a database comprises only one data file, the database can be attached to an instance of SQL Server 2000 without using the transaction log file. When the data file will be attached, SQL Server will create a new transaction log file automatically.

To attach a single-file database, you can use the sp_attach_single_file_db system stored procedure. The syntax for sp_attach_single_file_db system stored procedure is as follows:

sp_attach_single_file_db [ @dbname = ] 'dbname'
, [ @physname = ] 'physical_name'

• [@dbname =] 'dbname' is the database name. 'dbname' is nvarchar(128), a default value is NULL.
• [@physname =] 'phsyical_name' is the database file name. 'phsyical_name' is nvarchar(260), a default value is NULL.

This is the example to attach only one data file of the pubs database from the C:\MSSQL\Data directory:
EXEC sp_attach_single_file_db @dbname = 'pubs',
@physname = 'C:\MSSQL\Data\pubs.mdf'

Troubleshooting Lost UsersWhen you move database to another server and there are existing users in the detached database, you can lose users after attaching database on the new server. For example, if you move the Sales database from the Product server to the Test server (for the test purposes) and the user Alex exists in the Sales database, you should manually link the relationship between the Alex user and the appropriate login on the Test server.

You can use the sp_change_users_login system stored procedure to link the specified user in the current database to the appropriate login. The following example links the user Alex in the current database to t he Alex login:

EXEC sp_change_users_login 'Update_One', 'Alex', 'Alex'

See SQL Server Books Online to get more information about the sp_change_users_login stored procedure.

Monday, April 17, 2006

SQL Server Replication Part 1

SQL Server replication
SQL Server replication allows database administrators to distribute data to various servers throughout an organization. You may wish to implement replication in your organization for a number of reasons, such as:

Load balancing. Replication allows you to disseminate your data to a number of servers and then distribute the query load among those servers.

Offline processing. You may wish to manipulate data from your database on a machine that is not always connected to the network.

Redundancy. Replication allows you to build a fail-over database server that’s ready to pick up the processing load at a moment’s notice.

In any replication scenario, there are 3 main components:
Publishers have data to offer to other servers. Any given replication scheme may have one or more publishers.

Distributor maintains the Distribution Database. The role of the distributor varies depending on the type of replication. Two types of Distributors are identified, remote distributor and Local distributor. Remote distributor is separate from publisher and is configured as distributor for replication. Local distributor is a server that is configured as publisher and distributor.

Subscribers are database servers that wish to receive updates from the Publisher when data is modified.

There’s nothing preventing a single system from acting in both of these capacities. In fact, this is often done in large-scale distributed database systems. Microsoft SQL Server supports three types of database replication. This article provides a brief introduction to each of these models, while future articles will explore them in further detail. They are:

Snapshot replication acts in the manner its name implies. The publisher simply takes a snapshot of the entire replicated database and shares it with the subscribers. Of course, this is a very time and resource-intensive process. For this reason, most administrators don’t use snapshot replication on a recurring basis for databases that change frequently. There are two scenarios where snapshot replication is commonly used. First, it is used for databases that rarely change. Second, it is used to set a baseline to establish replication between systems while future updates are propagated using transactional or merge replication.

Transactional replication offers a more flexible solution for databases that change on a regular basis. With transactional replication, the replication agent monitors the publisher for changes to the database and transmits those changes to the subscribers. This transmission can take place immediately or on a periodic basis.

Merge replication allows the publisher and subscriber to independently make changes to the database. Both entities can work without an active network connection. When they are reconnected, the merge replication agent checks for changes on both sets of data and modifies each database accordingly. If changes conflict with each other, it uses a predefined conflict resolution algorithm to determine the appropriate data. Merge replication is commonly used by laptop users and others who can not be constantly connected to the publisher.

What is a Correlated Subquery?

What is a Correlated Subquery?
A correlated subquery is a SELECT statement nested inside another T-SQL statement, which contains a reference to one or more columns in the outer query. Therefore, the correlated subquery can be said to be dependent on the outer query. This is the main difference between a correlated subquery and just a plain subquery. A plain subquery is not dependent on the outer query, can be run independently of the outer query, and will return a result set. A correlated subquery, since it is dependent on the outer query will return a syntax errors if it is run by itself.

A correlated subquery will be executed many times while processing the T-SQL statement that contains the correlated subquery. The correlated subquery will be run once for each candidate row selected by the outer query. The outer query columns, referenced in the correlated subquery, are replaced with values from the candidate row prior to each execution. Depending on the results of the execution of the correlated subquery, it will determine if the row of the outer query is returned in the final result set.

Using a Correlated Subquery in a WHERE Clause
Suppose you want a report of all "OrderID's" where the customer did not purchase more than 10% of the average quantity sold for a given product. This way you could review these orders, and possibly contact the customers, to help determine if there was a reason for the low quantity order. A correlated subquery in a WHERE clause can help you produce this report. Here is a SELECT statement that produces the desired list of "OrderID's":

select distinct OrderId from Northwind.dbo.[Order Details] OD
where Quantity
from Northwind.dbo.[Order Details]
where OD.ProductID = ProductID)

The correlated subquery in the above command is contained within the parenthesis following the greater than sign in the WHERE clause above. Here you can see this correlated subquery contains a reference to "OD.ProductID". This reference compares the outer query's "ProductID" with the inner query's "ProductID". When this query is executed, the SQL engine will execute the inner query, the correlated subquery, for each "[Order Details]" record. This inner query will calculate the average "Quantity" for the particular "ProductID" for the candidate row being processed in the outer query. This correlated subquery determines if the inner query returns a value that meets the condition of the WHERE clause. If it does, the row identified by the outer query is placed in the record set that will be returned from the complete T-SQL SELECT statement.

The code below is another example that uses a correlated subquery in the WHERE clause to display the top two customers, based on the dollar amount associated with their orders, per region. You might want to perform a query like this so you can reward these customers, since they buy the most per region.

select CompanyName, ContactName, Address, City, Country, PostalCode
from Northwind.dbo.Customers OuterCwhere CustomerID in
( select top 2 InnerC.CustomerId from Northwind.dbo.[Order Details] OD join Northwind.dbo.Orders O on OD.OrderId = O.OrderID join Northwind.dbo.Customers InnerC on O.CustomerID = InnerC.CustomerId Where Region = OuterC.Region
group by Region, InnerC.CustomerId
order by sum(UnitPrice * Quantity * (1-Discount)) desc ) order by Region

Here you can see the inner query is a correlated subquery because it references "OuterC", which is the table alias for the "Northwind.DBO.Customer" table in the outer query. This inner query uses the "Region" value to calculate the top two customers for the region associated with the row being processed from the outer query. If the "CustomerID" of the outer query is one of the top two customers, then the record is placed in the record set to be returned.

Correlated Subquery in the HAVING Clause
Say your organizations wants to run a yearlong incentive program to increase revenue. Therefore, they advertise to your customers that if each order they place, during the year, is over $750 you will provide them a rebate at the end of the year at the rate of $75 per order they place. Below is an example of how to calculate the rebate amount. This example uses a correlated subquery in the HAVING clause to identify the customers that qualify to receive the rebate. Here is my code for this query:

select C.CustomerID, Count(*)*75 Rebate from Northwind.DBO.Customers C join Northwind.DBO.Orders O on c.CustomerID = O.CustomerID where Datepart(yy,OrderDate) = '1998' group by C.CustomerId having 750 < orderid =" OD.OrderID" customerid =" C.CustomerId">

By reviewing this query, you can see I am using a correlated query in the HAVING clause to calculate the total order amount for each customer order. I use the "CustomerID" from the outer query and the year of the order "Datepart(yy,OrderDate)", to help identify the Order records associated with each customer, that were placed the year '1998'. For these associated records I am calculating the total order amount, for each order, by summing up all the "[Order Details]" records, using the following formula:
sum(UnitPrice * Quantity * (1-Discount)). If each and every order for a customer, for year 1998 has a total dollar amount greater than 750, I then calculate the Rebate amount in the outer query using this formula "Count(*)*75 ".
SQL Server's query engine will only execute the inner correlated subquery in the HAVING clause for those customer records identified in the outer query, or basically only those customer that placed orders in "1998".

Performing an Update Statement Using a Correlated Subquery
A correlated subquery can even be used in an update statement. Here is an example:

create table A(A int, S int)
create table B(A int, B int)
-----------
set nocount on
insert into A(A) values(1)
insert into A(A) values(2)
insert into A(A) values(3)
insert into B values(1,1)
insert into B values(2,1)
insert into B values(2,1)
insert into B values(3,1)
insert into B values(3,1)
insert into B values(3,1)
update A set S = (select sum(B) from B where A.A = A group by A)
-----------
select * from A
-----------
drop table A,B

Here is the result set I get when I run this query on my machine:
A S
----------- -----------
1 1
2 2
3 3

In my query above, I used the correlated subquery to update column A in table A with the sum of column B in table B for rows that have the same value in column A as the row being updated.

Correlated in Condition:

AND EXISTS (SELECT TOP 1 1 FROM Table1 where {expression})


(Taken From Databasejournal article by Gregory A. Larsen)