Tuesday, 19 June 2012
Tuesday, 20 March 2012
SSIS various tasks description
A task
can best be described as an individual unit of work. We can also develop our own tasks, but here are the current ETL Tasks available
us to out-of-the-box:
ActiveX Script Task: Executes an ActiveX script in SSIS package. This task is
only to facilitate conversion of legacy DTS packages that use this deprecated
scripting method.
Analysis Services Execute DDL Task: Executes a DDL Task in Analysis Services. For example, this
can create, drop, or alter a cube (Enterprise and Developer Editions only).
Analysis Services Processing Task: This task processes a SQL Server Analysis Services cube, dimension,
or mining model.
Bulk Insert Task: Loads data into a table by using the BULK INSERT SQL command.
Data Flow Task: This very specialized task loads and transforms data into an
OLE DB, and now, optionally, an ADO.NET Destination.
Data Mining Query Task: Allows us to run predictive queries against Analysis
Services data - mining models.
Data Profiling Task: This exciting new task allows for the examination of data to
replace our ad - hoc data profiling techniques.
Execute DTS 2000 Package Task: Exposes legacy SQL Server 2000 DTS packages to our SSIS package.
Execute Package Task: Allows us to execute a package from within a package, making
our SSIS packages modular.
Execute Process Task: Executes a program external to our package, such as one to
split our extract file into many files before processing the individual files.
Execute SQL Task: Executes a SQL statement or stored procedure.
File System Task: This task can handle directory operations such as creating,
renaming, or deleting a directory. It can also manage file operations such as
moving, copying, or deleting files.
FTP Task: Sends
or receives files from an FTP site.
Message Queue Task: Sends or receives messages from a Microsoft Message Queue
(MSMQ).
Script Task: This task allows us to perform more .NET - based scripting
in the Visual Studio Tools for Applications programming environment.
Send Mail Task: Sends a mail message through SMTP.
Web Service Task: Executes a method on a Web service.
WMI Data Reader Task: This task can run WQL queries against the Windows Management
Instrumentation. This allows us to read the event log, get a list of
applications that are installed, or determine hardware that is installed, to
name a few examples.
WMI Event Watcher Task: This task empowers SSIS to wait for and respond to certain
WMI events that occur in the operating system.
XML Task: Parses or processes an XML file. It can merge, split, or
reformat an XML file.
If your database user login is assigned to the db_ssisoperator role, which has only read access and does not have write access to a package stored in msdb, what are you able to do with the package?
Ans. Because you are assigned
the db_ssisoperator role, you can execute the package inside msdb, but you
cannot delete the package from the msdb store or import packages into the store
What information stored in the package definition would cause a security risk if it were found?
Ans. Although a package fi le
does not contain data, it does contain the schema details about input sources
and destinations. Even if these sources and destinations cannot be accessed,
this information can be a security risk because it exposes table and column names.
Can you configure the security of a package so that you can store a password in clear text in the underlying .dtsx fi le in the file system?
Ans. no, sensitive information such as connection passwords cannot
be stored as clear text in the package fi le in the fi le system. The only way
to store the connection password in the fi le is to encrypt the sensitive
information by setting the ProtectionLevel to EncryptSensitiveWithPassword or
EncryptSensitiveWithuser- Key. A better choice is to set the ProtectionLevel to
DontSaveSensitive and use an SSIS confi guration to store the connection
password.
Friday, 16 March 2012
To run a stored procedure within a SQL Server database, what task would you choose?
The Execute SQL Task can run a stored procedure within SQL
Server or any relational database for which you have an installed data
provider. The syntax of the statement entered in the Execute SQL ask
will be in the native language of the underlying database.
Thursday, 15 March 2012
Microsoft SQL-Server 2012 New Features
Microsoft SQL-Server 2012 New Features
Sequences:
Sequences have been
requested by the SQL Server community for years, and it's included in this
release. Sequence is a user defined object that generates a sequence of a
number. Here is an example using Sequence.
/****** Create Sequence Object ******/
CREATE SEQUENCE Seqexample
START WITH 1
INCREMENT BY 1;
CREATE SEQUENCE Seqexample
START WITH 1
INCREMENT BY 1;
/****** Create Temp Table ******/
DECLARE @Emp TABLE
(
ID int NOT NULL PRIMARY KEY,
Fname varchar(100) NOT NULL
);
DECLARE @Emp TABLE
(
ID int NOT NULL PRIMARY KEY,
Fname varchar(100) NOT NULL
);
/****** Insert Some Data ******/
INSERT @ Emp (ID, Fname)
VALUES (NEXT VALUE FOR Seqexample, ‘Satya Abc’),
(NEXT VALUE FOR Seqexample, ‘Satya Mno’),
(NEXT VALUE FOR Seqexample, ‘Satya Xyz’);
INSERT @ Emp (ID, Fname)
VALUES (NEXT VALUE FOR Seqexample, ‘Satya Abc’),
(NEXT VALUE FOR Seqexample, ‘Satya Mno’),
(NEXT VALUE FOR Seqexample, ‘Satya Xyz’);
/****** Show the Data ******/
SELECT * FROM @ Emp;
SELECT * FROM @ Emp;
The results would look like this:
ID Fname
1 Satya Abc
2 Satya Mno
3 Satya Xyz
1 Satya Abc
2 Satya Mno
3 Satya Xyz
Ad-Hoc Query Paging:
Paging results in SQL Server has been
discussed for years. The Order By option in the SQL SELECT statement has been
enhanced in SQL Server 2012. Using a combination of OFFSET and FETCH along with
ORDER BY gives you control of paging through a result set. Using this technique
can really help performance by bring back only the results you want to show to
your users when they are needed. The following TSQL code runs against the
Person table in the AdventureWorks sample database (available from Microsoft).
In the sample query below, SQL Server would return 10 records beginning with
record 11. The OFFSET command provides a starting point for the SELECT
statement in terms of paging, and the FETCH command provides how many records
to return at a time.
SELECT BusinessEntityID, FirstName, LastName
FROM Person.Person
ORDER BY BusinessEntityID
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
FROM Person.Person
ORDER BY BusinessEntityID
OFFSET 10 ROWS
FETCH NEXT 10 ROWS ONLY;
Availability Enhancements including AlwaysOn SQL Server Failover
Cluster Instances, AlwaysOn Availability Groups and enhancements to Online
Operations for online index builds (where index columns contain LOB data);
Scalability and Performance Enhancements including a new data warehouse query
acceleration feature based on a new type of index called the column store;
Full Text Search: The Full Text Search in SQL Server 2012 has
been enhanced by allowing you to search and index data stored in extended
properties or metadata. Consider a PDF document that has "properties"
filled in like Name, Type, Folder path, Size, Date Created, etc. In the newest
release of SQL Server, this data could be indexes and searched along with the
data in the document itself. The data does have to be exposed to work, but it's
possible now.
SQL Server Express LocalDB — a new lightweight edition of Express
that has all its programmability features
FileTables — extends FILESTREAM technology and
allows you to now store files and documents in special tables in SQL Server,
but access them from Windows applications as if they were stored in the file
system, without making any changes to the Windows applications. File and
directory data is exposed through a Windows share.
Statistical
Semantic Search — provides deep
insight into unstructured documents stored in SQL Server databases by
extracting statistically relevant key phrases (allowing you to query the
meaning of a document);
New built-in T-SQL
functions: Conversion functions
now include PARSE, TRY_PARSE, TRY_CONVERT; Date and time functions now include
DATEFROMPARTS, DATETIME2FROMPARTS, DATETIMEFROMPARTS, DATETIMEOFFSETFROMPARTS,
EOMONTH, SMALLDATETIMEFROMPARTS, TIMEFROMPARTS; Logical functions now include
CHOOSE (gets item at the specified index from a list of values), IIF (a
shorthand if-then construct already common in other languages); and String
functions now include CONCAT and FORMAT;
ORDER BY clause improvements using OFFSET and
FETCH (limit result set rows to a range) — this feature should come in very handy for all
sorts of ad-hoc queries and/or web-applications that implement
"paging" through data! Finally, less custom-code for this!
THROW statement can be used to raise an exception and
transfer execution to a CATCH block of a TRY…CATCH construct;
OVER clause has been extended to support
window functions which will
reduce the need for custom code to compute things like moving averages or cumulative
totals;
Additional analytic functions like: FIRST_VALUE, LAST_VALUE, , LAG (previous rows
access without self-JOIN), LEAD (subsequent rows access without self-JOIN),
CUME_DIST, PERCENTILE_CONT, PERCENTILE_DISC, PERCENT_RANK
EXECUTE statement can now specify the metadata returned
from the statement by using the WITH RESULT SETS argument;
New and Enhanced Query Optimizer Hints: FORCESEEK includes further options and a new
FORCESCAN hint;
Native XML Web Services (SOAP/HTTP endpoints)
is Removed;
Finally! ANSI JOIN syntax is required! No more "*= and =*" join
syntax, which hopefully you migrated away from years ago;
FASTFIRSTROW (query hint) is replaced with OPTION (FAST n);
COMPUTE / COMPUTE BY (T-SQL) — use ROLLUP instead;
SET ROWCOUNT for INSERT, UPDATE, and DELETE
statements are deprecated and
will not be supported in the next version of SQL-Server. Use the TOP keyword
instead;
Wednesday, 14 March 2012
Hardware and Software Requirements to Install SQL Server 2012 (Denali)
·
Microsoft recommends using NTFS file format
instead of FAT32. FAT32 will work but you should probably not use it.
·
You can't install SQL Server 2012 (code-named
Denali) on mapped drives or compressed drives.
·
You have to have the "no-reboot"
package installed prior to installing SQL Server 2012 (code-named Denali). This
is included in Windows 7 SP1 and Windows Server 2008 R2. Otherwise, you can
download the no-reboot package from Microsoft.
·
SQL Server 2012 (code-named Denali) requires the
.NET Framework 4.0.
·
Virtualization is supported using Microsoft's
Hyper-V technology.
·
You will need at least 3.6 GB of free disk
space.
·
Microsoft recommends that you do not install SQL
Server 2012 (code-named Denali) on a domain controller.
Recommended Processors & RAM
·
64-bit version: AMD Opteron, AMD Athlin 64,
Intel Xeon with Intel EM64T Support or Intel Pentium IV with EM64T support
running 2.0 GHz or faster. Recommended RAM is maximum that operating system
supports or at least 2 GB.
·
32-bit version: Pentium III or compatible
running at 2.0 GHz of faster. Recommended RAM is maximum that operating system
supports or at least 2 GB.
·
Windows PowerShell 2.0 is a pre-requisite for
installing SQL Server 2012 (code-named Denali). You can get this software from
the Windows Management Framework page.
New features are available in SSIS Denali(2012) over 2008 R2
GUI
IMPROVEMENTS
•
Getting started window
•
Package visualization
•
Zoom
•
Undo
•
SSIS toolbox
•
Data flow source/destination wizard
•
Sort packages by name
•
Grouping in data flow
CHANGE DATA
CAPTURE
•
Incremental load loads all rows that have
changed since the last load
•
How do we know what has changed?
o Compare
every source row with every destination row
o Last
modified date and a trigger to maintain this
o Change
tracking
o Change
data capture!
•
SQL Server Enterprise edition, 2008 or higher
•
Asynchronous process
•
Captures all changes
•
Maintains time window
•
CDC data access via table valued functions Books
online, change data capture
CDC TASK AND
COMPONENTS
•
CDC needs to keep track of which changes have
already been processed
•
CDC task does this by storing LSNs in a tracking
table
•
CDC Source component reads from the CDC table
function, based on the LSN it got from the CDC task
•
CDC transformation splits records into new rows,
updated rows and deleted rows
•
No documentation yet in RC0, check Matt Masson’s
blog
•
Based on Attunity CDC components
MAPPING DATA
FLOW COLUMNS
•
When modifying a data flow, column remapping is
sometimes needed
•
SSIS 2012 maps columns on name instead of id
•
It also has an improved remapping dialog
ODBC SOURCE AND
DESTINATION
•
ODBC was not natively supported in 2008
•
SSIS 2012 has ODBC Source & Destination
o Handy
for connecting to SQL Azure
o Essential
if SQL Server stops supporting OleDb
•
SSIS 2008 could access ODBC via ADO.Net:
o Has
create table option, which ODBC lacks
o No
control on batch inserts nr of rows ODBC ADO.Net % Diff
o Low
performance
SCRIPTING
•
Script task and script component now support
.Net 4.0
•
Breakpoints are supported in script component
•
When developing custom components, there is
better backpressure support:
o SupportsBackPressure
property, IsInputReady and GetDependantInputs method
EXPRESSION TASK
•
The script task can be used to modify variable
values… but it’s overkill• Expression task provides a simple task to change
variable values
DATA QUALITY
SERVICES (DQS)
•
DQS is a new service to clean domain data
•
Domain knowledge base needs to be build
o Based
on rules, positive and negative examples
o Potentially
using external data from Azure Marketplace or other providers
DQS CLEANSING
TASK
•
Cleaning and standardizing data before it is
loaded in the data warehouse is essential
•
DQS Cleansing task labels data in 4 categories:
o Correct:
a value accepted by the knowledge base
o Corrected:
a value on which DQS is confident it can correct to a valid domain value
o Suggested:
a value on which DQS is less confident, but can still suggest a domain value
o New:
DQS has no suggestions for this
PACKAGE CATALOG
•
SSIS 2012 can work in the new project mode
(default) or in old package mode (backwards compatibility)
•
In project mode, many things change:
o Project
becomes the level of deployment
o Deployment
to SQL Server becomes obligatory
o Packages
not stored in msdb, but in dedicated user database:
·
The package catalog, named SSISDB
o Logging
happens automatically and is done in the package catalog
·
Custom logging still supported
•
Projects can be converted from one deployment
type to another
•
Manage via SSMS: Relational engine• Fixed
database name: SSISDB
•
Stores projects, versions, logs, 5 reports, 25
views, 42 stored procedures, …
•
This makes it possible to run, monitor and
manage SSIS projects and packages via T-SQL!
PARAMETERS
•
Just two scopes:
o Package
o Project!
•
Read-only
o Value
is set when scope starts and cannot be changed
o Can
be set from SQL Server Data Tools configurations
•
Often used together with environments
•
Does not replace variables
o a package configuration replacement
•
Using the visual studio (SSDT) configurations we
can configure default values for testing
SHARED
CONNECTION MANAGERS
•
Shared connection manager is defined at project
level and is automatically available in every package
o Not
copied as in SSIS 2008
•
Shared connection managers can be parameterized
as well
•
When converting shared connection managers back
to regular (package) connection managers, they disappear in all other packages
•
Shared cache connection managers are supported
as well
o This
allows to cache data in memory in one package and reuse it in multiple other
packages
ENVIRONMENTS
•
Environments replace package configurations
•
They can control parameter values and connection
strings
•
Environments are created in the package catalog
o They
are not deployed to the server, but created on the server
o Don’t
forget to reference the environment at the project level
o Script
them while creating, this eases creating multiple environments
•
A server might have multiple environments
o When
we execute a package, we can choose which environment we’ll use
DATA TAPS
•
Imagine a data viewer
o Which
can be added on the runtime server
o Without
modifying the package, but using T-SQL
o Which
writes the data to disk instead on visualizing it…
•
Voila, you are now thinking about the data tap
More
Enhancements
•
.Net API and Powershell
•
Pivot and row count transformation get a user
interface
•
Flat file supports
o Embedded
qualifiers
o Variable
number of columns (but still fixed meta-data)
•
Raw file improvements
o Generate
empty raw file
o Stores
sort info
•
DTSX files are becoming more readable and
‘mergeable’
•
Merge and merge join improve backpressure handling
•
4000 char expression length lifted
•
New expression language keywords
o LEFT
as syntactic sugar for SUBSTRING(,1,)
o TOKEN
and TOKENCOUNT for shredding strings
New features are available in SSAS Denali(2012) over 2008 R2
·
The new release is based on the vision to reach
a broader user base
·
SSAS brings together the existing multi-dimensional model with the new tabular
model pioneered by PowerPivot under a single unified platform – the BI Semantic
Model
·
Data Mining Component remains the same
·
The BI Semantic Model promises one model for all
end user experiences
o End
Users/Personal BI – PowerPivot for Excel
o Team
BI – PowerPivot for SharePoint
o Coporate
BI – Analysis Services
·
What is The BI Semantic Model? It is just
another term for a Data Model hosted within Analysis Services
·
There are three approaches in developing a BI
Semantic Model
o PowerPivot
Workbook
§ Design
Type: Tabular
§ Tool:
Excel 2010
§ Biz
Logic: DAX
§ Data
Access-Cache Passthrough: In Memory
§ Deployment:
SharePoint Library/Analysis Services/PowerPivot
o Tabular
Project
§ Think
of this as PowerPivot for the IT Pro. End user hands off the PowerPivot
workbook to IT department.
§ Design
Type: Tabular
§ Tool:
SQL Server Data Tool
§ Biz
Logic: DAX
§ Data
Access-Cache Passthrough: In Memory Direct Query
§ Deployment:
Analysis Services Tabular Mode
o Multidimensional
Project
§ Design
Type: Multidimensional
§ Deployment:
SQL Server Data Tools
§ Biz
Logic: MDX
§ Data
Access-Cache Passthrough: MOLAP/ROLAP
§ Deployment:
Analysis Services Multidimenional
o Front
End Clients
§ PowerView
(uses DAX to query the BI Semantic Model)
§ 3rd
party apps
§ Reporting
Services
§ Excel
§ PowerPivot
§ SharePoint
Insights (PerformancePoint Services)
·
Note that Tabular Models understand both DAX and
MDX
·
DAX is new SQL Server 2012 and only works
against tabular models
·
The advantage of a tabular model is that we can
now work with:
o Files
o OData
Feeds
o Cloud
Services
o Relational
Databases etc
·
BI Sematic Model supports Fine-grained security
row and cell level
·
The following templates are available through
SQL Server Data tools:
o Analysis
Services Multidimensional and Data Mining Project
o Import
from Server (Multidimensional and Data Mining)
o Analysis
Services Tabular Project
o Import
from PowerPivot
o Import
from Server (Tabular)
·
Multidimensional Model New Features
o 300+
improvements
o New
Developer tool: SQL Server Data tools (Visual Studio 2010)
o Removal
of 4G string limit for attributes
o New
events for monitoring locks and attributes
o Support
for XEvents for low overhead eventing support
o PowerShell
support
·
Installation and Configuration
o Services instances are installed with SQL
Server Installer
§ Server
modes: Multidimensional and Tabular – both can be installed on the same server
o PowerPivot
Configuration tool is used to configure/repair a PowerPivot instance
·
New PowerShell cmdlets for AMO for command-line
connectivity, navigation and discovery of SSAS databases
·
Existing apps – every UDM
becomes (can be upgraded to) a BI Semantic Model
Tuesday, 13 March 2012
New features are available in SSRS Denali(2012) over 2008 R2
There's not much new in Denali for SSRS. I vaguely recall reading that the SSRS team got most of their big enhancements into 2008 R2 instead of needing to delay to Denali.
1. That said, BIDS will be Visual Studio 2010. That's worth a lot on its own.
2. Word and Excel export will finally support Office OpenXML file formats (.xlsx and .docx) which will remove some limitations from the previous formats. The biggest of these are the row and column counts for Excel exports.
3. Then Project Crescent that you mentioned: self service Silverlight "Report" story authoring. I'm not sure this will really be part of SSRS proper; I'm not sure where it will fit.
4. I expect minor enhancements of the existing report objects, but I don't think there will be much new for SSRS in Denali.
What is the difference between a control flow task and a control flow container?
Control flow tasks perform operations, whereas containers coordinate and group tasks.
Example: a foreach Loop Container can look through the fi les in a system folder, and a file System Task embedded within the container can then move the files to a new folder location.
SSIS New Features in SQL Server 2008
Scripting Task
SQL Server 2008 scripting is built on the improved VSTA (Visual Studio Tools for Applications) rather than the VSA (Visual Studio for Applications) scripting engine. VSTA is allows us to build scripts in both C# and VB.NET.
Data Profiling Task
The Data Profiling Task is new in 2008 and provides a way to analyze data for discovering data issues, constraints, patterns, distributions, and other data profile characteristics
ADO.NET Support
Integration Services 2008 introduces an ADO.NET source and destination adapters. ADO.NET adapters are full-featured adapters and provide the full custom UI experience. Because there are ADO.NET ODBC adapters, by nature of supporting ADO.NET, Integration Services now also supports connectivity through ODBC.
Wizards
Integration Services 2008 introduces a new wizard called the Integration Services Connections Project Wizard that allows us to create connections, data flow, and a package as we create the project. This is a faster way to create a package
Improved Data Flow Lookup Component
The new Lookup Component is perhaps the most improved of all features in 2008. It now allows us to source the reference data from just about anywhere including another data flow task. The Lookup Component also allows us to store the reference data in a cache thus drastically reducing the load times.
Improved Data Flow Task
The Data Flow Task core processing has been improved to better utilize threads for more efficient and scalable execution trees. In certain data flow packages, the 2005 Data Flow Task could get into blocking situations or simply not utilize multiple processors as efficiently as expected.
Subscribe to:
Posts (Atom)