GUI
IMPROVEMENTS
•
Getting started window
•
Package visualization
•
Zoom
•
Undo
•
SSIS toolbox
•
Data flow source/destination wizard
•
Sort packages by name
•
Grouping in data flow
CHANGE DATA
CAPTURE
•
Incremental load loads all rows that have
changed since the last load
•
How do we know what has changed?
o Compare
every source row with every destination row
o Last
modified date and a trigger to maintain this
o Change
tracking
o Change
data capture!
•
SQL Server Enterprise edition, 2008 or higher
•
Asynchronous process
•
Captures all changes
•
Maintains time window
•
CDC data access via table valued functions Books
online, change data capture
CDC TASK AND
COMPONENTS
•
CDC needs to keep track of which changes have
already been processed
•
CDC task does this by storing LSNs in a tracking
table
•
CDC Source component reads from the CDC table
function, based on the LSN it got from the CDC task
•
CDC transformation splits records into new rows,
updated rows and deleted rows
•
No documentation yet in RC0, check Matt Masson’s
blog
•
Based on Attunity CDC components
MAPPING DATA
FLOW COLUMNS
•
When modifying a data flow, column remapping is
sometimes needed
•
SSIS 2012 maps columns on name instead of id
•
It also has an improved remapping dialog
ODBC SOURCE AND
DESTINATION
•
ODBC was not natively supported in 2008
•
SSIS 2012 has ODBC Source & Destination
o Handy
for connecting to SQL Azure
o Essential
if SQL Server stops supporting OleDb
•
SSIS 2008 could access ODBC via ADO.Net:
o Has
create table option, which ODBC lacks
o No
control on batch inserts nr of rows ODBC ADO.Net % Diff
o Low
performance
SCRIPTING
•
Script task and script component now support
.Net 4.0
•
Breakpoints are supported in script component
•
When developing custom components, there is
better backpressure support:
o SupportsBackPressure
property, IsInputReady and GetDependantInputs method
EXPRESSION TASK
•
The script task can be used to modify variable
values… but it’s overkill• Expression task provides a simple task to change
variable values
DATA QUALITY
SERVICES (DQS)
•
DQS is a new service to clean domain data
•
Domain knowledge base needs to be build
o Based
on rules, positive and negative examples
o Potentially
using external data from Azure Marketplace or other providers
DQS CLEANSING
TASK
•
Cleaning and standardizing data before it is
loaded in the data warehouse is essential
•
DQS Cleansing task labels data in 4 categories:
o Correct:
a value accepted by the knowledge base
o Corrected:
a value on which DQS is confident it can correct to a valid domain value
o Suggested:
a value on which DQS is less confident, but can still suggest a domain value
o New:
DQS has no suggestions for this
PACKAGE CATALOG
•
SSIS 2012 can work in the new project mode
(default) or in old package mode (backwards compatibility)
•
In project mode, many things change:
o Project
becomes the level of deployment
o Deployment
to SQL Server becomes obligatory
o Packages
not stored in msdb, but in dedicated user database:
·
The package catalog, named SSISDB
o Logging
happens automatically and is done in the package catalog
·
Custom logging still supported
•
Projects can be converted from one deployment
type to another
•
Manage via SSMS: Relational engine• Fixed
database name: SSISDB
•
Stores projects, versions, logs, 5 reports, 25
views, 42 stored procedures, …
•
This makes it possible to run, monitor and
manage SSIS projects and packages via T-SQL!
PARAMETERS
•
Just two scopes:
o Package
o Project!
•
Read-only
o Value
is set when scope starts and cannot be changed
o Can
be set from SQL Server Data Tools configurations
•
Often used together with environments
•
Does not replace variables
o a package configuration replacement
•
Using the visual studio (SSDT) configurations we
can configure default values for testing
SHARED
CONNECTION MANAGERS
•
Shared connection manager is defined at project
level and is automatically available in every package
o Not
copied as in SSIS 2008
•
Shared connection managers can be parameterized
as well
•
When converting shared connection managers back
to regular (package) connection managers, they disappear in all other packages
•
Shared cache connection managers are supported
as well
o This
allows to cache data in memory in one package and reuse it in multiple other
packages
ENVIRONMENTS
•
Environments replace package configurations
•
They can control parameter values and connection
strings
•
Environments are created in the package catalog
o They
are not deployed to the server, but created on the server
o Don’t
forget to reference the environment at the project level
o Script
them while creating, this eases creating multiple environments
•
A server might have multiple environments
o When
we execute a package, we can choose which environment we’ll use
DATA TAPS
•
Imagine a data viewer
o Which
can be added on the runtime server
o Without
modifying the package, but using T-SQL
o Which
writes the data to disk instead on visualizing it…
•
Voila, you are now thinking about the data tap
More
Enhancements
•
.Net API and Powershell
•
Pivot and row count transformation get a user
interface
•
Flat file supports
o Embedded
qualifiers
o Variable
number of columns (but still fixed meta-data)
•
Raw file improvements
o Generate
empty raw file
o Stores
sort info
•
DTSX files are becoming more readable and
‘mergeable’
•
Merge and merge join improve backpressure handling
•
4000 char expression length lifted
•
New expression language keywords
o LEFT
as syntactic sugar for SUBSTRING(,1,)
o TOKEN
and TOKENCOUNT for shredding strings