Fastest Way to retrieve Data from Data Warehouse in Excel via VBA Macro

Use code similar to the following in VBA:

Add the following reference (Tools > References): Microsoft ActiveX Data Objects 6.x Library.

Be sure to return values as “datetime” instead of “date” or “datetime2” so that they are recognized as date values in Excel.
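If you cannot change the table itself, casting on the SQL side is the simplest fix. A minimal sketch, assuming dbo.Datasource has a date/datetime2 column called OrderDate (the column names are illustrative):

SELECT TOP 1000
    CAST(OrderDate AS datetime) AS OrderDate,  -- converted so Excel recognizes it as a date
    Amount                                     -- further columns as needed
FROM dbo.Datasource;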

Public Sub GetDwhData()
    Dim wksTarget As Worksheet: Set wksTarget = Worksheets(1)
    Dim cellAddressStart As String: cellAddressStart = "B2"
    
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    
    On Error GoTo ErrHandling
    
    Dim objMyConn As ADODB.Connection: Set objMyConn = New ADODB.Connection
    Dim objMyRecordset As ADODB.Recordset: Set objMyRecordset = New ADODB.Recordset
    Dim strSQL As String

    objMyConn.ConnectionString = "Provider=SQLOLEDB;Data Source=tcp:localhost,1344;Initial Catalog=DWH;Integrated Security=SSPI"
    objMyConn.Open

    strSQL = "select TOP 1000 * from dbo.Datasource"

    Set objMyRecordset.ActiveConnection = objMyConn
    objMyRecordset.Open strSQL

    'Copy Data to Excel...
    wksTarget.Range(cellAddressStart).CopyFromRecordset objMyRecordset
        
        
    objMyRecordset.Close
    objMyConn.Close
    
    wksTarget.Calculate 'Recalculate formulas
        
ErrHandling:
    Application.Calculation = xlCalculationSemiautomatic
    Application.ScreenUpdating = True
    
End Sub

.NET 4.7.2 Forms application referencing a .NET Standard 2.0 library causes a System.Data.SqlClient TypeInitializationException

Exception

System.TypeInitializationException
The type initializer for ‘{AStandardNetLibInYourProject}’ threw an exception.

For example, you have an explicit reference to System.Data.SqlClient > 4.5.0.1, either through a NuGet package reference or an assembly reference.

The consumed .NET Standard 2.0 library already contains (internally/transitively) a reference to System.Data.SqlClient 4.5.0.1.

This transitive binding conflicts with your explicit reference.

Solution

  1. Remove all explicit references to System.Data.SqlClient 4.6.0 and 4.6.1 in any consuming project as well as in the .NET Standard library itself.
  2. Add the following <RestoreProjectStyle> property to the top of the .csproj file of your consuming/main Forms application project, using a text editor:
<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="15.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Import Project="$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props" Condition="Exists('$(MSBuildExtensionsPath)\$(MSBuildToolsVersion)\Microsoft.Common.props')" />
  <PropertyGroup>
    <RestoreProjectStyle>PackageReference</RestoreProjectStyle>
  </PropertyGroup>
  <PropertyGroup>
    <Configuration Condition=" '$(Configuration)' == '' ">Debug</Configuration>
    <!-- ... remainder of the existing .csproj stays unchanged ... -->

See also: https://stackoverflow.com/questions/49266384/there-is-possible-to-use-packagereference-in-xamarin-android-or-xamarin-ios-pr/49266856
and: https://github.com/xamarin/xamarin-macios/issues/3949

Possible error details (in German)

=== Zustandsinformationen vor Bindung ===
LOG: DisplayName = System.Data.SqlClient, Version=4.5.0.1, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
 (Fully-specified)
LOG: Appbase = file:///C:/Workspaces/DWH/DwhApps/Dwh.Cmd.Etl.ImportFile/bin/Debug/
LOG: Ursprünglicher PrivatePath = NULL
Aufruf von Assembly : Dwh.Model, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null.
===
LOG: Diese Bindung startet im default-Load-Kontext.
LOG: Die Anwendungskonfigurationsdatei wird verwendet: C:\Workspaces\DWH\DwhApps\Dwh.Cmd.Etl.ImportFile\bin\Debug\Dwh.Cmd.Etl.ImportFile.exe.Config
LOG: Die Hostkonfigurationsdatei wird verwendet: 
LOG: Die Computerkonfigurationsdatei von C:\Windows\Microsoft.NET\Framework64\v4.0.30319\config\machine.config wird verwendet.
LOG: In der Anwendungskonfigurationsdatei wurde eine Umleitung gefunden. 4.5.0.1 wird nach 4.6.1.0 umgeleitet.
LOG: Verweis nach der Richtlinie: System.Data.SqlClient, Version=4.6.1.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a
LOG: Download von neuem URL file:///C:/Workspaces/DWH/DwhApps/Dwh.Cmd.Etl.ImportFile/bin/Debug/System.Data.SqlClient.DLL.
WRN: Der Vergleich des Assemblynamens führte zum Konflikt: Nebenversion.
ERR: Das Setup der Assembly konnte nicht abgeschlossen werden (hr = 0x80131040). Die Suche wurde beendet.

SSAS: How to Sort parent Hierarchy Members of a Dimension by another Attribute

Scenario: You have a small dimension containing a hierarchy, and both the child members and the parent members need to be sorted.

Solution excerpt: Add the sort attribute by which the parent attribute should be ordered as a new attribute relationship to the dimension, then set the parent attribute's OrderBy property to AttributeKey and its OrderByAttribute to the sort attribute.

Example

  • Parent Attribute: “PuL”
  • Child Attribute: “BwaZeileKey”
  • Child Sort Attribute: “Sortierung”
  • Parent Sort Attribute: “PulSortierung”
Extract of BwaZeileDim, which is joined to PulDim containing the PulSortierung:

Result

Note how the parent members are sorted by a user-defined key attribute rather than by name.

SSIS Error: 0xc001000e The connection manager “” is not found.

Solution:

If your .dtsx XML file does not contain the attribute “DTS:ConfigString=” (search for LogProv), you forgot to select a connection in the Connection Configuration drop-down of the logging wizard dialog:

<DTS:LogProviders>
    <DTS:LogProvider
      DTS:ConfigString="CON_DWH"
      DTS:CreationName="Microsoft.LogProviderSQLServer"
      DTS:Description="Schreibt Protokolleinträge für Ereignisse in eine SQL Server-Datenbank."
      DTS:DTSID="{9FE46402-8C22-49AB-887C-056C89DD00CE}"
      DTS:ObjectName="SSIS-Protokollanbieter für SQL Server">
      <DTS:ObjectData>
        <InnerObject />
      </DTS:ObjectData>
    </DTS:LogProvider>
  </DTS:LogProviders>

OLAP storage engine error: A duplicate attribute key has been found when processing

Errors in the OLAP storage engine: A duplicate attribute key has been found when processing:
Table: TABLENAME,
Column: COLUMNNAMEINTABLE,
Value: 0.
The attribute is ‘ATTRIBUTENAMEINDIMENSION’.

Possible solution:

Go to ATTRIBUTENAMEINDIMENSION and remove the NameColumn/ValueColumn entries.

Cause, in my opinion: a “bug/misbehavior” in SSAS.

Further help on warnings/errors in hierarchies (German forum thread):

https://social.msdn.microsoft.com/Forums/sqlserver/de-DE/91e6b21c-e9ad-43ea-8eb8-d6b5f543d27d/ssas-defaulteigenschaften-von-dimensionen-fhren-zu-warnungen

SQL Server on Azure – Performance Guidelines

https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sql/virtual-machines-windows-sql-performance

Info: Using Standard SSDs for log files is not recommended!

In my tests, a Standard SSD averaged only 1.1 MB/s when writing a large amount of data to a fixed-size log file. This seems to be caused by IO throttling, as the write speed occasionally jumps up to 60 MB/s.

In my scenario I was updating ~1 million rows in a 4.8-million-row clustered columnstore fact table, and it took 3.5 hours!

Below is a speed test copying a 14 GB file from a Standard SSD to a Premium SSD (L: to M:). You may notice the throttling…:


SQL Server Error Message 15562: The module being executed is not trusted. Either the owner of the database of the module needs to be granted authenticate permission, or the module needs to be digitally signed.

The easiest way to get rid of this error is:

ALTER DATABASE [DBNAME] SET TRUSTWORTHY ON
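To check which databases already have the flag set, a small helper query against sys.databases:

SELECT name, is_trustworthy_on
FROM sys.databases
ORDER BY name;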

German error message:

Das ausgeführte Modul ist nicht vertrauenswürdig. Entweder dem Besitzer der Datenbank des Moduls muss die Authentifizierungsberechtigung erteilt werden oder das Modul muss digital signiert werden.

Who changed a LOGIN (or its Password) and when

USE [master]
GO

/* Test Script:
 
CREATE LOGIN [SQLLogin1] WITH PASSWORD=N'@Very$trongP@ssw0rd123', DEFAULT_DATABASE=[master], 
DEFAULT_LANGUAGE=[us_english], CHECK_EXPIRATION=OFF, CHECK_POLICY=OFF
GO

USE [master]
GO
ALTER LOGIN [SQLLogin1] WITH PASSWORD=N'VerystrongP@ssword123'
GO

*/


WITH alteredLogins
AS
(
	SELECT  l.[Transaction SID]
		,suser_sname(l.[Transaction SID]) AS ChangedByUserName, l.[Begin Time],
		l2.[Lock Information], 
		SUBSTRING(l2.[Lock Information], CHARINDEX('SERVER_PRINCIPAL(principal_id = ', l2.[Lock Information]) + 32, 15) as PrincipalIdString
	FROM ::fn_dblog(DEFAULT, DEFAULT) l			
	INNER JOIN ::fn_dblog(DEFAULT, DEFAULT) l2	ON l.[Transaction ID] = l2.[Transaction ID] 
												AND l2.[Lock Information] LIKE '%SERVER_PRINCIPAL%'
	WHERE l.[Transaction Name] = 'ALTER LOGIN'
),
preResult 
AS
(
	SELECT 	(SELECT name FROM sys.server_principals p		WITH (READUNCOMMITTED)
			 WHERE p.principal_id = CAST(SUBSTRING(PrincipalIdString, 1, CHARINDEX(')', PrincipalIdString) - 1) as int)
			) as Changed_Login_Name,
			CAST(SUBSTRING(PrincipalIdString, 1, CHARINDEX(')', PrincipalIdString) - 1) as int)
			as Changed_Login_Principal_id,
			ChangedByUserName,
			alteredLogins.[Begin Time],
			alteredLogins.[Transaction SID],
			alteredLogins.[Lock Information]
	FROM alteredLogins WITH (READUNCOMMITTED)
)
SELECT * 
FROM preResult	WITH (READUNCOMMITTED)
WHERE Changed_Login_Name = 'SQLLogin1'	/* Search for a specific login name for which you want to know who changed it (or its password) */
OPTION (RECOMPILE)

This approach only works if the transaction log still contains the records of the last change. The script above returns the following result rows:

WhoChangedLogin

Inspired by:

SQL SERVER – Who Changed the Password of SQL Login? – Interview Question of the Week #065

New T-SQL Features in SQL Server 2016

You can now set the MAXDOP option on each database differently:

ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 1;
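To verify the setting afterwards, you can query the new catalog view in the database you changed (a minimal check):

SELECT configuration_id, name, value, value_for_secondary
FROM sys.database_scoped_configurations;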

Replacement for DBCC INPUTBUFFER:

sys.dm_exec_input_buffer()
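A minimal usage sketch, joining the input buffer to all user sessions (requires the VIEW SERVER STATE permission):

SELECT s.session_id,
       ib.event_info AS last_input_buffer
FROM sys.dm_exec_sessions AS s
CROSS APPLY sys.dm_exec_input_buffer(s.session_id, NULL) AS ib
WHERE s.is_user_process = 1;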

Replacement for Trace Flag 1117:

the AUTOGROW_SINGLE_FILE and AUTOGROW_ALL_FILES options on filegroup level

Replacement for Trace Flag 1118:

the MIXED_PAGE_ALLOCATION option on database level (see the sketch below for both replacements).
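Hedged T-SQL sketches for both replacements (the database name MyDb is a placeholder; tempdb already behaves like this by default in SQL Server 2016):

-- replaces trace flag 1117: grow all files of a filegroup together
ALTER DATABASE [MyDb] MODIFY FILEGROUP [PRIMARY] AUTOGROW_ALL_FILES;

-- replaces trace flag 1118: no more mixed extent allocations in this database
ALTER DATABASE [MyDb] SET MIXED_PAGE_ALLOCATION OFF;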

 

 

See also: http://blogs.sqlsentry.com/team-posts/latest-builds-sql-server-2016/

Combine PerfMon and SQL Server Profiler Data

Found this nice in-depth blog post: https://www.simple-talk.com/sql/database-administration/correlating-sql-server-profiler-with-performance-monitor/

You should use the following counters for perfmon.exe:

<Counter>\Memory\Pages/sec</Counter>
<Counter>\PhysicalDisk(0 C:)\Disk Transfers/sec</Counter>
<Counter>\PhysicalDisk(3 D:)\Disk Transfers/sec</Counter>
<Counter>\PhysicalDisk(5 E:)\Disk Transfers/sec</Counter>
<Counter>\PhysicalDisk(0 C:)\Split IO/Sec</Counter>
<Counter>\PhysicalDisk(3 D:)\Split IO/Sec</Counter>
<Counter>\PhysicalDisk(5 E:)\Split IO/Sec</Counter>
<Counter>\PhysicalDisk(0 C:)\Avg. Disk Queue Length</Counter>
<Counter>\PhysicalDisk(3 D:)\Avg. Disk Queue Length</Counter>
<Counter>\PhysicalDisk(5 E:)\Avg. Disk Queue Length</Counter>
<Counter>\Processor(*)\% Processor Time</Counter>
<Counter>\SQLServer:Buffer Manager\Buffer cache hit ratio</Counter>
<Counter>\SQLServer:Buffer Manager\Page life expectancy</Counter>
<Counter>\SQLServer:Memory Manager\Free Memory (KB)</Counter>
<Counter>\SQLServer:Memory Manager\Total Server Memory (KB)</Counter>

If you have the problem with PerfMon that your collector is stuck in a “compiling” status and you want to terminate that Data Collector Set:

  1. Write down the name of the troublesome set
  2. Close all of your PerfMon windows
  3. Open RegEdit and go to “HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Schedule\TaskCache\Tree\Microsoft\Windows\PLA”
  4. Delete the SubFolder containing your Data Collector set name
  5. Re-Open perfmon.exe – the Data Collector Set is gone

Tested on a Windows Server 2012 R2.

Extracted from http://kb.itimpulse.in/?p=147

SQL Server Trace Flags

Trace Flag 1117

Should always be enabled if you’re using more than one tempdb data file (which you usually should :-)).

Trace Flag 1118

Should be enabled in OLTP environments to avoid SGAM contention by disabling mixed extent allocations in ALL databases. If enabled, the RDBMS will always allocate a new 64 KB uniform extent instead of trying to find a mixed one into which smaller data might be stuffed. See also the book SQL Server 2012 Internals and Troubleshooting, p. 228.

Should also be enabled when using snapshot isolation, because snapshots are stored in tempdb. See also p. 332 and http://support.microsoft.com/kb/2154845

Downside: You may use more space (ok, not really, as long as you don’t have millions of users creating 10KB chunks…).
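A minimal sketch of enabling both flags globally on a running instance (for a permanent setting, add -T1117 and -T1118 as startup parameters via SQL Server Configuration Manager; on SQL Server 2016 the database/filegroup options mentioned in the SQL Server 2016 section above replace these flags):

DBCC TRACEON (1117, -1);        -- grow all files of a filegroup together
DBCC TRACEON (1118, -1);        -- allocate uniform extents only (no mixed extents)
DBCC TRACESTATUS (1117, 1118);  -- verify that both flags are active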

 

 

Supported Data Sources of SSAS Multidimensional vs. Tabular vs. Excel Power Pivot

Supported Data Sources

Source and Version | Multidimensional 2008 R2 | Multidimensional 2012 | Multidimensional 2014 | Multidimensional 2016 | Tabular 2012 | Tabular 2014 | Tabular 2016 | Tabular 2016 DirectQuery | Excel Power Pivot
SQL Server 7.0 / 2000 | yes | no | no | no | no | no | no | no | no
SQL Server 2005 | yes | yes | yes | (no)? | yes | yes | (no)? | (no)? | yes
SQL Server 2008 – 2012 | yes | yes | yes | yes | yes | yes | yes | yes | yes
SQL Server 2014 | ? | (yes)* | yes | yes | (yes)* | (yes)* | yes | yes | (yes)*
SQL Server 2016 | ? | (yes)* | (yes)* | yes | (yes)* | (yes)* | yes | yes | (yes)*
SQL Azure Database | yes | yes | yes | yes | yes | yes | yes | yes | yes
SQL Server Parallel Data Warehouse (PDW) / Analytics Platform System (APS) | yes | yes | yes | yes | yes | yes | yes | yes | yes
Oracle relational 9i, 10g, 11g | yes (9.0) | yes | yes | yes | yes | yes | yes | yes | yes
Oracle relational 12g | ? | ? | ? | yes | ? | ? | yes | yes | ?
Teradata relational V2R6, V12 | yes (v2R6) | yes | yes | yes | yes | yes | yes | yes | yes
IBM DB2 relational V8.1 | yes (with EE only) | yes | yes | yes | yes | yes | yes | no | yes
Informix relational V11.10 | no | yes | yes | yes | yes | yes | yes | no | yes
Sybase relational | no | yes | yes | yes | yes | yes | yes | no | yes
Text Files (.CSV, .TAB, .TXT) | no | no | no | no | yes | yes | yes | no | yes
Microsoft Excel 97-2007 (.xlsx, .xlsm, .xlsb, .xltx, .xltm) | no | no | no | no | yes | yes | (no)? | no | yes
Microsoft Excel 2010 (.xlsx, .xlsm, .xlsb, .xltx, .xltm) | no | no | no | no | yes | yes | yes | no | yes
Microsoft Excel 2013-2016 (.xlsx, .xlsm, .xlsb, .xltx, .xltm) | no | no | no | no | (no)? | (no)? | yes | no | ?
Excel documents in SharePoint | no | no | no | no | no | no | no | no | no
Microsoft Access 2003 | yes (x86 only) | yes | ? | ? | yes | yes | no | no | yes
Microsoft Access 2007 | yes (x86 only) | yes | yes | ? | yes | yes | no | no | yes
Microsoft Access 2010 | ? | yes | yes | yes | yes | yes | yes | no | yes
Microsoft Access 2013 | ? | ? | yes | yes | (no)? | ? | yes | no | ?
Microsoft Access 2016 | ? | ? | ? | yes | (no)? | ? | yes | no | ?
Access documents in SharePoint | no | no | no | no | no | no | no | no | no
Power Pivot Workbook published to SharePoint | no | no | no | no | yes | yes | yes | no | yes
SSAS Analysis Services Cube (2005) | no | no | no | no | yes | yes | (no)? | no | yes
SSAS Analysis Services Cube (2008-2008 R2) | no | no | no | no | yes | yes | yes | no | yes
SSAS Analysis Services Cube (2012-2014) | no | no | no | no | (yes)* | (yes)* | yes | no | (yes)*
SSAS Analysis Services Cube (2016) | no | no | no | no | ? | ? | yes | no | (yes)*
Data Feeds Atom 1.0 format (exposed via WCF Data Service) (.atomsvc or .atom) | no | no | no | no | yes | yes | yes | no | yes
Office Database Connection files | no | no | no | no | yes | yes | yes | no | yes


Add KPI (Key Performance Indicator) based on Delivery Days between two Dates to Tabular Project

You have two date columns in your Tabular project:

KPI your two date columns OrderDate and ShipDate

  1. Create a new calculated column:
    KPI Add/Insert a new column

Give it the name “DeliveryDays”.

2. Enter the following formula:

=IF(ISBLANK([OrderDate]);0;IF([ShipDate]<[OrderDate];0;[ShipDate]-[OrderDate]))*1

(The semicolons are the DAX argument separator under a German/European locale; replace them with commas if your regional settings use the comma as list separator.)

KPI Enter Formula
3. Go to the measure grid (bottom) and add a new measure based on the previously created column:

Average of DeliveryDays:=AVERAGE([DeliveryDays])

KPI Create Average Measure

KPI Created Average Measure in Measure Grid
4. Create the KPI by right-clicking the newly created Average measure:

KPI Create KPI Right Mouse Click on created measure
5. A new dialog pops up in which you define the KPI details:

Create KPI (Key Performance Indicator) Dialog
After hitting OK you will notice a KPI indicator symbol in your measure grid:

KPI Indicator in measure grid after closing Dialog
6. Analyze the results in Excel:

KPIs in Excel Pivot Table
This was a quick tutorial on how to create a KPI based on a new calculated column in a Tabular project, using SSDT in Visual Studio 2013.

BUG: SSDT-BI Visual Studio “The operation could not be completed. No such interface supported.” (when saving a project)

Problem

You get the following exception:

Microsoft Visual Studio

The operation could not be completed. No such interface supported.

Exception

Solution

Go to Visual Studio Menu TOOLS | Options…

Under Projects and Solutions | General, check the option to save new projects when created, which is unchecked by default:

Solution

This workaround only helps for new projects. For your current project, go to the temporary folder (right-click your project and open the path in Explorer) and copy the contents of your project to another place.

Data Warehouse Semantics Basics

Surrogate Keys

Surrogate Keys or IDs uniquely identify a record in a data set (table). Typically they (should) consist of only one column. Best example: an ID column that is automatically incremented.

A Surrogate Key is also known as a Meaningless Key.

Surrogate Keys should be used for internal purposes inside a relational database and should be hidden from end users. To display unique identifiers to end users you may want to introduce an Alternate Key.

Surrogate Keys are “artificial” or a “substitute”. Some folks say a Surrogate Key is always a substitute for something, and if there is nothing to substitute they cannot use the term Surrogate Key. I encourage you to be as precise as necessary.

Alternate Keys

Alternate Keys are usually a human-readable identifier. You may want to include Alternate Keys if you need to display readable identifiers for a record to end users. An Alternate Key may also carry a certain “meaning” in its value, for example when it is derived from multiple columns of a record.

Alternate Keys may be artificial, but are not limited to this. For example: a book record may contain an internal ID, but it may also carry an ISBN that uniquely identifies the book. Another example could be a product that has a product number used in publications and order systems; this column may also be used as an Alternate Key column.

Natural Keys

Natural Keys are columns that by their nature uniquely identify a record. They contain readable text and may span multiple columns. Examples of a natural key are an ISBN, a social insurance number or the tuple {FirstName, LastName, Address, BirthDate}.

Natural Keys are fine and you may want to use them as a basis for your Alternate Keys for display purposes.

But !NEVER! use Natural Keys as the Primary Key column to create relations between tables!! Many years ago I worked at a fashion company that used readable product numbers (consisting of a category and other parts) as Primary Keys, and after 25 years of business the numbers ran out. Nice cleanup work followed :-).

ID Column

An ID or Identifier column is usually a non-readable internal column used to uniquely identify a record and to create relations between tables.

I recommend naming an identifier column with only the two letters ID. Following the DRY principle, you should not repeat the table name in all columns; sticking to this principle will also make your queries easier to read. The ID column is always addressed with a table prefix, like myTableName.ID.

A Foreign Key column is always named following the schema: table name + ID.

If your Product table has its own ID column and a ProductCategoryID column, it is easy to understand what might be the PK (*) and what is a FK to another table.

Imagine you have a large table with 20 Foreign Keys to other tables and its own ID column. Using IntelliSense it is easy to distinguish the ID column from all the other Foreign Keys without having to look up which column is that table's identifier (as it does not have to be the Primary Key and is therefore not necessarily marked with a yellow key symbol).
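A minimal DDL sketch of this naming convention (the tables and columns are illustrative, not from a real model):

CREATE TABLE dbo.ProductCategory
(
    ID   int IDENTITY(1, 1) PRIMARY KEY,  -- surrogate key, addressed as ProductCategory.ID
    Name nvarchar(100) NOT NULL
);

CREATE TABLE dbo.Product
(
    ID                int IDENTITY(1, 1) PRIMARY KEY,  -- own surrogate key
    ProductCategoryID int NOT NULL
        REFERENCES dbo.ProductCategory (ID),           -- foreign key: table name + ID
    ProductNo         nvarchar(20) NOT NULL UNIQUE,    -- alternate key shown to end users
    Name              nvarchar(200) NOT NULL
);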

I encourage you to use the term ID in relational databases and to avoid the term Key in an OLTP system.

You may use the term Key as a substitute for the term ID in Data Warehouse scenarios. But please decide which term you want to use exclusively.

Don’t start mixing Key with ID columns in the same database!

IDs, Surrogate Key, Alternate Keys

Conformed Dimension

A dimension that is shared across several fact tables or data marts with identical keys, attribute names and contents, so that measures from different facts can be combined and compared (the classic example is one common Date dimension).

Non-Conformed Dimension

A dimension that applies to only one fact table or differs in structure/content between data marts; it cannot be used to combine or compare facts.

Shared Dimension

A dimension that is reused by several fact tables or cubes (in SSAS: a database dimension added to multiple cubes) instead of being duplicated.

Normalized Dimension

A dimension whose attributes are split over several related tables (snowflake schema), e.g. Product, ProductSubcategory and ProductCategory tables.

Denormalized Dimension

A dimension whose attributes are flattened into one wide table (star schema); the usual choice in a data warehouse.

Degenerated Dimension

A dimension attribute that is stored directly in the fact table and has no dimension table of its own, e.g. an order or invoice number.

Additive Measures

Measures that can be summed over all dimensions, e.g. sales amount or quantity.

Semi-additive Measures

Measures that can be summed over some dimensions but not over others (typically not over time), e.g. account balances or inventory levels.

Non-additive Measures

Measures that cannot be summed over any dimension, e.g. ratios, percentages or unit prices; they must be recalculated at every aggregation level.


Analysis Services Tabular Project Dialogs in SSDT-BI

About

This tutorial gives you an overview of common dialogs and windows used to configure Analysis Services Tabular Projects.

Initial Dialogs

Create a New Project

New Analysis Services Tabular Project Dialog

The created file will be named TabularProject1.smproj.

Tabular Model Designer (Choose a Connection)

Tabular Model Designer

Further information on Compatibility Levels MSDN here.

Model.bim Properties window

Model.bim Properties

  • Build Action: Compile | None
  • Copy To Output Directory: Do not copy | Copy always | Copy if newer
  • Data Backup: Back up to disk | Do not back up to disk

If enabled: an ABF file is automatically created when the BIM file is saved. For further information on how to restore an .abf file see mssqltips here. For the contents of the backed-up .abf file see MSDN here.

  • Direct Query Mode: On | Off

Benefits if enabled: Data will always be up to date. Data sets can be huge (larger than the RAM of the SSAS instance). Queries may be faster due to query acceleration in the data source (xVelocity columnstore indexes, for example). Security can be enforced (for example row-level security in the RDBMS). Complex formulas might be faster if the back-end can optimize the query plan.

Restrictions: All data in the Tabular model must come from a single relational data source (either SQL Server, Oracle or Teradata); you must not use data from other sources. If you need data from elsewhere, you first have to build a DWH that contains all the data and then use this DWH as the only source for your Tabular model.

Excel in DirectQuery mode: MDX queries from Excel against a Tabular model are only supported with SQL Server 2016 (or above) and compatibility level 1200. For further information see MSDN here.

  • Workspace Retention: Keep in memory | Unload from memory | Delete workspace

Keep in memory: The workspace database (which includes the model metadata, imported data and credentials) is kept in memory after closing a model. Your model will load faster, but it keeps consuming memory.

Unload from memory: The workspace database is written to disk on closing and reloaded into memory when the model is reopened. Use this with a remote workspace database or if you have a limited amount of RAM.

Delete workspace: The workspace database is deleted when you close your model. Your model will always need more time to load, as it has to be restored first. Use this option when you rarely work with your model.

  • Workspace Server: The connection to an SSAS Tabular mode instance used for temporarily storing and editing the current in-memory model. See the Workspace Database property to find the name of the temporary database created on this instance.

Import a new Data Model from Data Source

This section describes the steps to import related fact and dimension tables from AdventureWorks DW 2014. Go to the menu bar in SSDT, click on MODEL and then Import From Data Source…

Import from Data Source Menu Item

The Table Import Wizard opens.

Table Import Wizard Step 1

Select the relational data source instance and Database to connect to.

Table Import Wizard Step 1

Table Import Wizard Step 2

Enter the credentials of a user specific to this project to connect to the data source. You should try to create an Active Directory user for each project you build, so that you can grant project-specific permissions to that user in the data source (a T-SQL sketch follows the screenshot below). Don’t use an “sa” account.
Table Import Wizard Step 2
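As a hedged T-SQL sketch of the “project-specific user” idea on the data source side (domain, login and database names are invented):

-- Windows login dedicated to this Tabular project
CREATE LOGIN [MYDOMAIN\svc_TabularAdventureWorks] FROM WINDOWS;
GO
USE [AdventureWorksDW2014];
GO
CREATE USER [MYDOMAIN\svc_TabularAdventureWorks]
    FOR LOGIN [MYDOMAIN\svc_TabularAdventureWorks];
-- grant only what the import needs: read access
ALTER ROLE db_datareader ADD MEMBER [MYDOMAIN\svc_TabularAdventureWorks];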

Table Import Wizard Step 3

Decide whether to select data from given tables or views or by specifying your own SQL Query.

Table Import Wizard Step 3

Table Import Wizard Step 4 (Select Tables and Views)

Select the Tables/Views. In this example I checked the FactInternetSales table and then used the handy Select Related Tables button.

Table Import Wizard Step 4

Preview & Filter dialog:

Table Import Wizard Data Preview

Table Import Wizard Step 5 (Finishing -> Importing)

The Data is transferred and the selected tables are imported into your model. Hit Close.

Table Import Wizard Final Step

Configure your Tabular Model

When you switch to Diagram View, you will see the following result:

Diagram View of FactInternetSales Tabular Model

Table Properties

Table Properties

  • Default Field Set: Determines which attributes are being displayed by default when a client connects to this table. See also MSDN here.
  • Hidden: True | False. Specifies if the table is hidden in client tools.
  • Partitions: Lets you define logical partitions that can be processed individually. This can be very helpful on huge tables and allows parallel processing (a query sketch follows the Partition Manager screenshot below). See also MSDN here.

Partition Manager
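A sketch of what two partition source queries could look like for FactInternetSales (the year boundaries are made up for illustration):

-- partition "FactInternetSales 2013"
SELECT * FROM dbo.FactInternetSales
WHERE OrderDateKey BETWEEN 20130101 AND 20131231;

-- partition "FactInternetSales 2014"
SELECT * FROM dbo.FactInternetSales
WHERE OrderDateKey BETWEEN 20140101 AND 20141231;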

  • Source Data: View or edit the source data.

Source Data

  • Table Behavior: Only affects Power View; it does not apply to Excel pivot tables! You can define table behavior properties for tables of primary interest (for example Customer, Employee, …). In most cases you do not need to set these properties for Date, Category, Group, Department, Lookup and Summary tables. For details and an example of table behavior see MSDN here.

Table Behavior of DimCustomer

Column Properties

For a (currently incomplete) property overview see MSDN here.

Column Properties

  • Column Name: Name of the column stored in the model. Also referred to as Friendly Name. This property value is the name of a column displayed in client reporting tools.
  • Data Category: One of the following values:

Data Category Column Properties

  • Data Format: Lets you define the displayed format of this column in client tools. Options follow (Numeric and Date based columns). See also MSDN here.

Data Format Numeric / Data Format Date

  • Data Type: Underlying Data Type used on Import. Do not change.
  • Default Label: True | False. Set to true to set a specific column as the Label column displayed in client tools for this table.
  • Hidden: True | False: Set to true to hide this column in client reporting tools.
  • Row Identifier: True | False: Set to true to mark a specific column as the unique ID column. You will usually use an ID or surrogate key column here.
  • Sort By Column: Specify another column by which to sort the values of this column. You cannot specify a formula or measure.
  • Summarize By: By default a client reporting tool uses the SUM aggregation on numeric values. If you need a specific calculation method, use one of the following:

Summarize By Column Properties

  • Table Detail Position: If you have a Default Field Set defined, you may change the order of a column by setting this value.

Creating a Hierarchy

We will create a Hierarchy on the Date Dimension as follows. First you will have to click the “Create Hierarchy” button on the upper right of a table:

Create Hierarchy

Then you will give it a name: “Years and Parts”. After that you drag & drop the following three columns onto the Years and Parts hierarchy column:

Drag & Drop Columns on Hierarchy column

Analyze Results with Excel Pivot

We will now analyze the results and the effects of the created hierarchy in Excel. Go to the menu bar in VS2013, open the MODEL menu and click Analyze in Excel.

Analyze in Excel Menu Item

Leave the “Current Windows User” option set in the following dialog and hit OK.

Analyze in Excel Start Screen

Excel opens with the following result:

Excel starts with empty pivot table

Picking the right SQL Server Data Tools (SSDT) Version for SSAS

You have SQL Server 2014 or SQL Server 2012 (SSAS/SSRS/SSIS)

You have SQL Server 2016 CTP (SSAS/SSRS/SSIS)

All of these SSDTs also support SQL Server 2008 and 2008 R2.

If you only have SQL Server 2008 or SQL Server 2008 R2

  • install Business Intelligence Development Studio (BIDS) via SQL Server 2008/R2 Setup

If you picked the SSDT-BI for VS2013 version, the installation will look like this:

SSDT-BI for VS2013 on SQL2014 Setup

Off Topic: All about good Password Hashing

From: http://security.stackexchange.com/a/31846

Thank you for this post, Thomas Pornin.

The Theory

We need to hash passwords as a second line of defence. A server which can authenticate users necessarily contains, somewhere in its entrails, some data which can be used to validate a password. A very simple system would just store the passwords themselves, and validation would be a simple comparison. But if a hostile outsider were to gain a simple glimpse at the contents of the file or database table which contains the passwords, then that attacker would learn a lot. Unfortunately, such partial, read-only breaches do occur in practice (a mislaid backup tape, a decommissioned but not wiped-out hard disk, an aftermath of a SQL injection attack — the possibilities are numerous). See this blog post for a detailed discussion.

Since the overall contents of a server that can validate passwords are necessarily sufficient to indeed validate passwords, an attacker who obtained a read-only snapshot of the server is in position to make an offline dictionary attack: he tries potential passwords until a match is found. This is unavoidable. So we want to make that kind of attack as hard as possible. Our tools are the following:

  • Cryptographic hash functions: these are fascinating mathematical objects which everybody can compute efficiently, and yet nobody knows how to invert them. This looks good for our problem – the server could store a hash of a password; when presented with a putative password, the server just has to hash it to see if it gets the same value; and yet, knowing the hash does not reveal the password itself.
  • Salts: among the advantages of the attacker over the defender is parallelism. The attacker usually grabs a whole list of hashed passwords, and is interested in breaking as many of them as possible. He may try to attack several in parallels. For instance, the attacker may consider one potential password, hash it, and then compare the value with 100 hashed passwords; this means that the attacker shares the cost of hashing over several attacked passwords. A similar optimisation is precomputed tables, including rainbow tables; this is still parallelism, with a space-time change of coordinates.The common characteristic of all attacks which use parallelism is that they work over several passwords which were processed with the exact same hash function. Salting is about using not one hash function, but a lot of distinct hash functions; ideally, each instance of password hashing should use its own hash function. A salt is a way to select a specific hash function among a big family of hash functions. Properly applied salts will completely thwart parallel attacks (including rainbow tables).
  • Slowness: computers become faster over time (Gordon Moore, co-founder of Intel, theorized it in his famous law). Human brains do not. This means that attackers can “try” more and more potential passwords as years pass, while users cannot remember more and more complex passwords (or flatly refuse to). To counter that trend, we can make hashing inherently slow by defining the hash function to use a lot of internal iterations (thousands, possibly millions).

We have a few standard cryptographic hash functions; the most famous are MD5 and the SHA family. Building a secure hash function out of elementary operations is far from easy. When cryptographers want to do that, they think hard, then harder, and organize a tournament where the functions fight each other fiercely. When hundreds of cryptographers gnawed and scraped and punched at a function for several years and found nothing bad to say about it, then they begin to admit that maybe that specific function could be considered as more or less secure. This is just what happened in the SHA-3 competition. We have to use this way of designing hash function because we know no better way. Mathematically, we do not know if secure hash functions actually exist; we just have “candidates” (that’s the difference between “it cannot be broken” and “nobody in the world knows how to break it”).

A basic hash function, even if secure as a hash function, is not appropriate for password hashing, because:

  • it is unsalted, allowing for parallel attacks (rainbow tables for MD5 or SHA-1 can be obtained for free, you do not even need to recompute them yourself);
  • it is way too fast, and gets faster with technological advances. With a recent GPU (i.e. off-the-shelf consumer product which everybody can buy), hashing rate is counted in billions of passwords per second.

So we need something better. It so happens that slapping together a hash function and a salt, and iterating it, is not easier to do than designing a hash function — at least, if you want the result to be secure. There again, you have to rely on standard constructions which have survived the continuous onslaught of vindicative cryptographers.

Good Password Hashing Functions

PBKDF2

PBKDF2 comes from PKCS#5. It is parameterized with an iteration count (an integer, at least 1, no upper limit), a salt (an arbitrary sequence of bytes, no constraint on length), a required output length (PBKDF2 can generate an output of configurable length), and an “underlying PRF”. In practice, PBKDF2 is always used with HMAC, which is itself a construction built over an underlying hash function. So when we say “PBKDF2 with SHA-1”, we actually mean “PBKDF2 with HMAC with SHA-1”.

Advantages of PBKDF2:

  • Has been specified for a long time, seems unscathed for now.
  • Is already implemented in various framework (e.g. it is provided with .NET).
  • Highly configurable (although some implementations do not let you choose the hash function, e.g. the one in .NET is for SHA-1 only).
  • Received NIST blessings (modulo the difference between hashing and key derivation; see later on).
  • Configurable output length (again, see later on).

Drawbacks of PBKDF2:

  • CPU-intensive only, thus amenable to high optimization with GPU (the defender is a basic server which does generic things, i.e. a PC, but the attacker can spend his budget on more specialized hardware, which will give him an edge).
  • You still have to manage the parameters yourself (salt generation and storage, iteration count encoding…). There is a standard encoding for PBKDF2 parameters but it uses ASN.1 so most people will avoid it if they can (ASN.1 can be tricky to handle for the non-expert).

bcrypt

bcrypt was designed by reusing and expanding elements of a block cipher called Blowfish. The iteration count is a power of two, which is a tad less configurable than PBKDF2, but sufficiently so nevertheless. This is the core password hashing mechanism in the OpenBSD operating system.

Advantages of bcrypt:

  • Many available implementations in various languages (see the links at the end of the Wikipedia page).
  • More resilient to GPU; this is due to details of its internal design. The bcrypt authors made it so voluntarily: they reused Blowfish because Blowfish was based on an internal RAM table which is constantly accessed and modified throughout the processing. This makes life much harder for whoever wants to speed up bcrypt with a GPU (GPU are not good at making a lot of memory accesses in parallel). See here for some discussion.
  • Standard output encoding which includes the salt, the iteration count and the output as one simple to store character string of printable characters.

Drawbacks of bcrypt:

  • Output size is fixed: 192 bits.
  • While bcrypt is good at thwarting GPU, it can still be thoroughly optimized with FPGA: modern FPGA chips have a lot of small embedded RAM blocks which are very convenient for running many bcrypt implementations in parallel within one chip. It has been done.
  • Input password size is limited to 51 characters. In order to handle longer passwords, one has to combine bcrypt with a hash function (you hash the password and then use the hash value as the “password” for bcrypt). Combining cryptographic primitives is known to be dangerous (see above) so such games cannot be recommended on a general basis.

scrypt

scrypt is a much newer construction (designed in 2009) which builds over PBKDF2 and a stream cipher called Salsa20/8, but these are just tools around the core strength of scrypt, which is RAM. scrypt has been designed to inherently use a lot of RAM (it generates some pseudo-random bytes, then repeatedly read them in a pseudo-random sequence). “Lots of RAM” is something which is hard to make parallel. A basic PC is good at RAM access, and will not try to read dozens of unrelated RAM bytes simultaneously. An attacker with a GPU or a FPGA will want to do that, and will find it difficult.

Advantages of scrypt:

  • A PC, i.e. exactly what the defender will use when hashing passwords, is the most efficient platform (or close enough) for computing scrypt. The attacker no longer gets a boost by spending his dollars on GPU or FPGA.
  • One more way to tune the function: memory size.

Drawbacks of scrypt:

  • Still new (my own rule of thumb is to wait at least 5 years of general exposure, so no scrypt for production until 2014 – but, of course, it is best if other people try scrypt in production, because this gives extra exposure).
  • Not as many available, ready-to-use implementations for various languages.
  • Unclear whether the CPU / RAM mix is optimal. For each of the pseudo-random RAM accesses, scrypt still computes a hash function. A cache miss will be about 200 clock cycles, one SHA-256 invocation is close to 1000. There may be room for improvement here.
  • Yet another parameter to configure: memory size.

OpenPGP Iterated And Salted S2K

I cite this one because you will use it if you do password-based file encryption with GnuPG. That tool follows the OpenPGP format which defines its own password hashing functions, called “Simple S2K”, “Salted S2K” and “Iterated and Salted S2K“. Only the third one can be deemed “good” in the context of this answer. It is defined as the hash of a very long string (configurable, up to about 65 megabytes) consisting of the repetition of an 8-byte salt and the password.

As far as these things go, OpenPGP’s Iterated And Salted S2K is decent; it can be considered as similar to PBKDF2, with less configurability. You will very rarely encounter it outside of OpenPGP, as a stand-alone function.

Unix “crypt”

Recent Unix-like systems (e.g. Linux), for validating user passwords, use iterated and salted variants of the crypt() function based on good hash functions, with thousands of iterations. This is reasonably good. Some systems can also use bcrypt, which is better.

The old crypt() function, based on the DES block cipher, is not good enough:

  • It is slow in software but fast in hardware, and can be made fast in software too but only when computing several instances in parallel (technique known as SWAR or “bitslicing”). Thus, the attacker is at an advantage.
  • It is still quite fast, with only 25 iterations.
  • It has a 12-bit salt, which means that salt reuse will occur quite often.
  • It truncates passwords to 8 characters (characters beyond the eighth are ignored) and it also drops the upper bit of each character (so you are more or less stuck with ASCII).

But the more recent variants, which are active by default, will be fine.

Bad Password Hashing Functions

About everything else, in particular virtually every homemade method that people relentlessly invent.

For some reason, many developers insist on designing functions themselves, and seem to assume that “secure cryptographic design” means “throw together every kind of cryptographic or non-cryptographic operation that can be thought of”. See this question for an example. The underlying principle seems to be that the sheer complexity of the resulting utterly tangled mess of instructions will befuddle attackers. In practice, though, the developer himself will be more confused by his own creation than the attacker.

Complexity is bad. Homemade is bad. New is bad. If you remember that, you’ll avoid 99% of problems related to password hashing, or cryptography, or even security in general.

Password hashing in Windows operating systems used to be mindbogglingly awful and now is just terrible (unsalted, non-iterated MD4).

Key Derivation

Up to now, we considered the question of hashing passwords. A close problem is about transforming a password into a symmetric key which can be used for encryption; this is called key derivation and is the first thing you do when you “encrypt a file with a password”.

It is possible to make contrived examples of password hashing functions which are secure for the purpose of storing a password validation token, but terrible when it comes to generating symmetric keys; and the converse is equally possible. But these examples are very “artificial”. For practical functions like the one described above:

  • The output of a password hashing function is acceptable as a symmetric key, after possible truncation to the required size.
  • A Key Derivation Function can serve as a password hashing function as long as the “derived key” is long enough to avoid “generic preimages” (the attacker is just lucky and finds a password which yields the same output). An output of more than 100 bits or so will be enough.

Indeed, PBKDF2 and scrypt are KDF, not password hashing function — and NIST “approves” of PBKDF2 as a KDF, not explicitly as a password hasher (but it is possible, with only a very minute amount of hypocrisy, to read NIST’s prose in such a way that it seems to say that PBKDF2 is good for hashing passwords).

Conversely, bcrypt is really a block cipher (the bulk of the password processing is the “key schedule”) which is then used in CTR mode to produce three blocks (i.e. 192 bits) of pseudo-random output, making it a kind of hash function. bcrypt can be turned into a KDF with a little surgery, by using the block cipher in CTR mode for more blocks. But, as usual, we cannot recommend such homemade transforms. Fortunately, 192 bits are already more than enough for most purposes (e.g. symmetric encryption with GCM or EAX only needs a 128-bit key).

Miscellaneous Topics

How many iterations ?

As much as possible ! This salted-and-slow hashing is an arms race between the attacker and the defender. You use many iterations to make the hashing of a password harder for everybody. To improve security, you should set that number as high as you can tolerate on your server, given the tasks that your server must otherwise fulfill. Higher is better.

Collisions and MD5

MD5 is broken: it is computationally easy to find a lot of pairs of distinct inputs which hash to the same value. These are called collisions.

However, collisions are not an issue for password hashing. Password hashing requires the hash function to be resistant to preimages, not to collisions. Collisions are about finding pairs of messages which give the same output without restriction, whereas in password hashing the attacker must find a message which yields a given output that the attacker does not get to choose. This is quite different. As far as we known, MD5 is still (almost) as strong as it has ever been with regards to preimages (there is a theoretical attack which is still very far in the ludicrously impossible to run in practice).

The real problem with MD5 as it is commonly used in password hashing is that it is very fast, and unsalted. However, PBKDF2 used with MD5 would be robust. You should still use SHA-1 or SHA-256 with PBKDF2, but for Public Relations. People get nervous when they hear “MD5”.

Salt Generation

The main and only point of the salt is to be as unique as possible. Whenever a salt value is reused anywhere, this has the potential to help the attacker.

For instance, if you use the user name as salt, then an attacker (or several colluding attackers) could find it worthwhile to build rainbow tables which attack the password hashing function when the salt is “admin” (or “root” or “joe”) because there will be several, possibly many sites around the world which will have a user named “admin”. Similarly, when a user changes his password, he usually keeps his name, leading to salt reuse. Old passwords are valuable targets, because users have the habit of reusing passwords in several places (that’s known to be a bad idea, and advertised as such, but they will do it nonetheless because it makes their life easier), and also because people tend to generate their passwords “in sequence”: if you learn that Bob’s old password is “SuperSecretPassword37”, then Bob’s current password is probably “SuperSecretPassword38” or “SuperSecretPassword39”.

The cheap way to obtain uniqueness is to use randomness. If you generate your salt as a sequence of random bytes from the cryptographically secure PRNG that your operating system offers (/dev/urandom, CryptGenRandom()…) then you will get salt values which will be “unique with a sufficiently high probability”. 16 bytes are enough so that you will never see a salt collision in your life, which is overkill but simple enough.

UUID are a standard way of generating “unique” values. Note that “version 4” UUID just use randomness (122 random bits), like explained above. A lot of programming frameworks offer simple to use functions to generate UUID on demand, and they can be used as salts.

Salt Secrecy

Salts are not meant to be secret; otherwise we would call them keys. You do not need to make salts public, but if you have to make them public (e.g. to support client-side hashing), then don’t worry too much about it. Salts are there for uniqueness. Strictly speaking, the salt is nothing more than the selection of a specific hash function within a big family of functions.

“Pepper”

Cryptographers can never let a metaphor alone; they must extend it with further analogies and bad puns. “Peppering” is about using a secret salt, i.e. a key. If you use a “pepper” in your password hashing function, then you are switching to a quite different kind of cryptographic algorithm; namely, you are computing a Message Authentication Code over the password. The MAC key is your “pepper”.

Peppering makes sense if you can have a secret key which the attacker will not be able to read. Remember that we use password hashing because we consider that an attacker could grab a copy of the server database, or possibly of the whole disk of the server. A typical scenario would be a server with two disks in RAID 1. One disk fails (electronic board fries – this happens a lot). The sysadmin replaces the disk, the mirror is rebuilt, no data is lost due to the magic of RAID 1. Since the old disk is dysfunctional, the sysadmin cannot easily wipe its contents. He just discards the disk. The attacker searches through the garbage bags, retrieves the disk, replaces the board, and lo! He has a complete image of the whole server system, including database, configuration files, binaries, operating system… the full monty, as the British say. For peppering to be really applicable, you need to be in a special setup where there is something more than a PC with disks; you need a HSM. HSM are very expensive, both in hardware and in operational procedure. But with a HSM, you can just use a secret “pepper” and process passwords with a simple HMAC (e.g. with SHA-1 or SHA-256). This will be vastly more efficient than bcrypt/PBKDF2/scrypt and their cumbersome iterations. Also, usage of a HSM will look extremely professional when doing a WebTrust audit.

Client-side hashing

Since hashing is (deliberately) expensive, it could make sense, in a client-server situation, to harness the CPU of the connecting clients. After all, when 100 clients connect to a single server, the clients collectively have a lot more muscle than the server.

To perform client-side hashing, the communication protocol must be enhanced to support sending the salt back to the client. This implies an extra round-trip, when compared to the simple client-sends-password-to-server protocol. This may or may not be easy to add to your specific case.

Client-side hashing is difficult in a Web context because the client uses Javascript, which is quite anemic for CPU-intensive tasks.

In the context of SRP, password hashing necessarily occurs on the client side.

Conclusion

Use bcrypt. PBKDF2 is not bad either. If you use scrypt you will be a “slightly early adopter” with the risks that are implied by this expression; but it would be a good move for scientific progress (“crash dummy” is a very honourable profession).

Embracing your Transact-SQL Code and Coding Tips.