Friday, October 17, 2008

SSIS Multithreading (Parallel execution in SSIS)

I have always been curious how I could control the parallelism of different flows in Integration Services Packages. But I was not able to find about it earlier. Microsoft gives us enough ability to execute our packages concurrently.
There are a few properties of package (Just right click control flow designer and select Properties).

MaxConcurrentExecutables:
This specifies the maximum number of execution flows allowed in parallel at any point in the package. It specifies the maximum number of threads that a package can create for concurrent execution. Remember that we can not specify 0 (Zero) for this property because this would mean zero thread meaning no execution flow.

The default value is -1. This means 2 more than number of processors.

You might be wondering about scenarios where the number of parallel execution flows might be greater than the number of threads specified. In all such cases, the remaining threads would have to wait until the time any thread completes execution.

EngineThreads:
This property is used to incorporate parallelism into data flow tasks. It has a default value 10. Be careful in specify the value. This is because this property, along with MaxConcurrentExecutables, decides the number of parallel execution flows in a package simultaneously.

Additionally two properties of Dataflow tasks can be used to specify Temporary storage to any faster disks. These properties are BLOBTempStoragePath and BufferTemppStoragePath.

Tuesday, October 14, 2008

Online indexes in SQL Server

Do you know that normally when you create an index on an existing table, you can not perform any query or updating while creating index on a table. This has always been like that before SQL Server 2005. With SQL Server 2005, Microsoft introduced ONLINE indexes in SQL Server. It must be known that online does not mean faster index creation but quite the contrary it is slower than its other counterpart.

In order to allow modification and querying data during the index creation, SQL Server maintains versions of rows in a Version Store. You can check size of version store by using VERSION STORE SIZE counter in performance monitor.

Now you have two options for creating indexes. They are as follows:
1. Online
2. Offline

It must be remembered that online indexes are only available in Enterprise edition of SQL Server.

With offline indexes any DML operations is not allowed. This is done by acquiring EXCLUSIVE lock on the table or indexed view. So for the whole duration of creation of index, no DML operation (including select) is allowed on the object.

There are three phases of creation of an online index. They are as follows:

1. Preparation: In preparation phase, following tasks are carried out:
a. Acquiring of lock mainly S and IS lock on table.
b. Creation of metadata for index.
c. Recompiling DML plans by incrementing version number.

2. Build

3. Final

For creating an index there must be sufficient temporary space available in disk. This space is required for sorting and other intermediate tasks. Extra space is used from disk when SQL Server does not find necessary space in primary memory to hold the temporary data. When SQL Server is using the disk space, it may acquire the space from user database if SORT_IN_TEMPDB option is not set for index creation or modification. There are certain advantages for using tempdb for this purpose. With this option most contiguous space is allocated for index which makes queries faster on the index. Additionally it uses tempdb transaction log.

When creating an index it is also to be made sure that transaction log has sufficient disk space as indexing creating large data loads which might fill the transaction log. It is best if you change the recovery model to SIMPLE or BULK_LOGGED before starting the index operation.

If an index is fragmented then it is either reorganized or rebuilt with ALTER INDEX REORGANIZE or ALTER INDEX REBUILD command. It must be remembered that reorganizing an index is always an online operation but rebuilding an index can be online or offline. For rebuild an index is dropped and a new index is created.

Index creation can also be further controlled with providing hints for maximum degree of parallelism in the CREATE or ALTER index statement. But keep in mind that as the degree of parallelism increases the requirement of memory also increases. So make sure that you have enough memory available.

CREATE INDEX IX_MyIndex
ON MyDB.MyTable (MyID)
WITH (MAXDOP=8);


Note: MAXDOP option can not be specified in ALTER INDEX REORGANIZE statement. Currently, it is always a single threaded operation. You can also specify MAXDOP in sp_configure command. This would apply the parallelism setting on all operations of the server instance.

Saturday, October 4, 2008

SQL Server Browser Service to add connection in Visual Studio

When you install SQL Server then adding a connection in Visual Studio .net does not make the server available for selection. This is because SQL Server Browser Server is not running. Just follow the following steps to understand the problem.

Add a connection in Server Explorer:




Add connection in Server Explorer:



Select SQL Server from the selection list. When you try to select the servers then no server is available. Now you may think that you have SQL Server installed then why not those servers are available for selection.


Just start the SQL Server Browser Service. This should solve the problem.


Now you may see that the servers become available in the selection list in Visual Studio.

Friday, October 3, 2008

Windows Power Shell Basics

You must be using command shell but are not satisfied with many limitations with this shell. One of the limitations is not able to input or output objects in or out of the applications. Now you are gifted with one more shell by Microsoft which is Power Shell. It also has a new scripting language, which is called Power Shell Scripting.

It must be remembered that Windows Power Shell has been built on .net. So you can deal with objects i.e. it can process and return objects as well compared to earlier tools which could only return text.

Windows Power Shell introduced the concept of cmdlet. These are small program designed to perform simple task. They can be combined to perform complex tasks. The tasks may be accessing directories and files, registries, digital certificates etc.

The good thing is that you can use the existing tools like Net and Reg.exe being in the environment of Power Shell. Various Cmd.exe commands are supported inside Windows Power Shell Environment.





The names of cmdlet(s) are very easy to remember. It is always Verb followed by hyphen and then a noun afterwards. The examples are ‘Get-Help’.

We can get help about any cmdlet by using Get-Help. The format is as follows:
Get-Help get-help -detailed

One thing to remember is that Windows Power Shell is not case-sensitive.