On-Premise Python Code to Load Data into Microsoft Fabric Warehouse Using Token-Based Authentication

Amit Chandak
5 min read · Jul 13, 2023


This blog post includes a Python script that loads data from an on-premises SQL Server into a Microsoft Fabric Warehouse. The data is pushed by an on-premises Python script that authenticates with an OAuth access token instead of an interactive (MFA) login. The token is generated through an Azure app registered at portal.azure.com. An incremental (append/delta) load method pushes only the new data.

You can download the CSV files below for testing:

https://github.com/amitchandakpbi/powerbi/raw/main/csv/
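
If you want to test without staging the data in a local SQL Server first, pandas can read a CSV straight from the URL. This is a minimal sketch; sales_fact.csv is a hypothetical file name, so check the GitHub folder above for the actual files:

import pandas as pd

# Hypothetical file name; verify it against the GitHub folder above
csv_url = "https://github.com/amitchandakpbi/powerbi/raw/main/csv/sales_fact.csv"

# parse_dates keeps Sales_Date as a datetime, matching the watermark logic used later
df = pd.read_csv(csv_url, parse_dates=["Sales_Date"])
print(df.head())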

The video:

The SQL code to create the target table in the Fabric Warehouse:

CREATE TABLE [dbo].[sales_py2](
    [Order_No] [int],
    [Item_ID] [int],
    [Sales_Date] [datetime2](6),
    [Customer_Id] [int],
    [City_Id] [int],
    [Qty] [decimal](18, 10),
    [Price] [decimal](18, 10),
    [COGS] [decimal](18, 10),
    [Discount_Percent] [decimal](18, 10)
)

The Python code:

import pandas as pd
import sqlalchemy as sa
import requests
import json

def get_access_token(app_id, client_secret, directory_id, user_name, password):
    """Acquire an OAuth access token using the resource-owner password grant."""
    token_url = "https://login.microsoftonline.com/" + directory_id + "/oauth2/token"
    token_data = {
        "grant_type": "password",
        "client_id": app_id,  # Application Id - on the Azure app overview page
        "client_secret": client_secret,
        # Note: https://storage.azure.com/ is the Storage/OneLake resource; for the
        # Warehouse SQL endpoint the usual v1 resource is https://database.windows.net/
        "resource": "https://storage.azure.com/",
        "username": user_name,
        "password": password,
    }
    token_headers = {
        "Content-Type": "application/x-www-form-urlencoded"
    }
    print(token_url)
    token_response = requests.post(token_url, data=token_data, headers=token_headers)
    token_response_dict = json.loads(token_response.text)
    # print(token_response.text)
    if checkKey(token_response_dict, "error"):
        raise Exception("Error in getting the access token")
    token = token_response_dict.get("access_token")
    refresh_token = token_response_dict.get("refresh_token")
    if token is None:
        print("Unable to get access token")
        print(str(token_response_dict))
        raise Exception("Error in getting the access token")
    elif refresh_token is None:
        print("Unable to get refresh token")
        print(str(token_response_dict))
        raise Exception("Error in getting the refresh token")
    return token

def checkKey(d, key):
    """Return True if the given key exists in the response dictionary."""
    return key in d



print("SQL Code is called")
# get access token
access_token = get_access_token("<Application id>","<client_secret>","<directory_id>","<user_name>","<password>")

print(access_token)

# Connect to the Fabric Warehouse to read the current watermark
server = '<URL>'  # Warehouse SQL connection string from the Fabric portal
database = 'Warehouse2'
driver = 'ODBC Driver 17 for SQL Server'
user = '<User>'
pass1 = '<pass>'

# Connection string using Azure AD (ActiveDirectoryPassword) authentication; the
# Token parameter is not a standard ODBC keyword (see the token-only variant at the end)
connection_string = f'mssql+pyodbc://{user}:{pass1}@{server}/{database}?driver={driver}&Trusted_Connection=no&Authentication=ActiveDirectoryPassword&Token={access_token}'
# print(connection_string)  # contains credentials; print only while debugging
engine = sa.create_engine(connection_string, echo=False)

# Latest loaded Sales_Date, defaulting to 2018-01-01 when the table is empty
sql_query = "SELECT coalesce(max(Sales_Date), '2018-01-01') _date FROM dbo.sales_py2"
df = pd.read_sql(sql_query, engine)

max_date = df['_date'].iloc[0]
print(max_date)


# Create the SQL query with the max_date value
query1 = f"SELECT * FROM dbo.sales_fact WHERE sales_date > '{max_date}' "

server1 = 'localhost\\SQLEXPRESS'
database1 = 'sales'

# Create a connection string for SQL Server using Windows authentication
connection_string = f'mssql+pyodbc://{server1}/{database1}?driver=SQL+Server&Trusted_Connection=yes'

# Create an engine using the connection string
engine1 = sa.create_engine(connection_string)

# Read only the new rows into a Pandas DataFrame
df = pd.read_sql(query1, engine1)
# print(df)


# Rebuild the Warehouse connection string (the variable was reused for the local server above;
# server, database, driver, user, and pass1 are unchanged from earlier)
connection_string = f'mssql+pyodbc://{user}:{pass1}@{server}/{database}?driver={driver}&Trusted_Connection=no&Authentication=ActiveDirectoryPassword&Token={access_token}'

# Engine with autocommit and fast_executemany to speed up the bulk insert
engine = sa.create_engine(connection_string, echo=False, connect_args={'autocommit': True}, fast_executemany=True)

# Append the new rows to dbo.sales_py2 (dtype is only applied if to_sql has to create the table;
# the types below match the CREATE TABLE statement above)
df.to_sql(name='sales_py2', con=engine, schema='dbo', if_exists='append', dtype={
    'Order_No': sa.Integer(),
    'Item_ID': sa.Integer(),
    'Sales_Date': sa.DateTime(),
    'Customer_Id': sa.Integer(),
    'City_Id': sa.Integer(),
    'Qty': sa.Numeric(18, 10),
    'Price': sa.Numeric(18, 10),
    'COGS': sa.Numeric(18, 10),
    'Discount_Percent': sa.Numeric(18, 10)}, index=False, chunksize=100, method='multi')
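
A note on authentication: the Token parameter in the connection strings above is not a recognized ODBC keyword, so these connections actually authenticate through ActiveDirectoryPassword. If you want the access token itself to authenticate the connection (no username or password in the connection string), pyodbc accepts it through the SQL_COPT_SS_ACCESS_TOKEN pre-connect attribute. A minimal sketch, assuming ODBC Driver 17 and a token issued for the SQL resource (https://database.windows.net/ rather than https://storage.azure.com/):

import struct
import urllib.parse
import sqlalchemy as sa

SQL_COPT_SS_ACCESS_TOKEN = 1256  # pyodbc pre-connect attribute for Azure AD access tokens

# The driver expects the token as a length-prefixed UTF-16-LE byte string
token_bytes = access_token.encode("utf-16-le")
token_struct = struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)

# No UID/PWD/Authentication/Trusted_Connection keywords are allowed with a token
odbc_str = f"DRIVER={{ODBC Driver 17 for SQL Server}};SERVER={server};DATABASE={database};"
engine = sa.create_engine(
    "mssql+pyodbc:///?odbc_connect=" + urllib.parse.quote_plus(odbc_str),
    connect_args={"attrs_before": {SQL_COPT_SS_ACCESS_TOKEN: token_struct}},
)

With this engine, the pd.read_sql and df.to_sql calls above work unchanged.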

I hope you are able to use this code with the video.

More on Microsoft Fabric:

Comprehensive Free Course for Your Team to Learn Microsoft Power BI, SQL, and Data Factory

Microsoft Fabric: A Brief Introduction

How to enable Microsoft Fabric; on Power BI Service| Start the Fabric (Preview) trial

Microsoft Fabric-What it is, Why Use, Who Should, When and How to use

Microsoft Fabric: Various Components| Microsoft Fabric Create Workspace, Lakehouse, Warehouse

Microsoft Fabric: How to load data in Lakehouse using Dataflow Gen2

Microsoft Fabric: How to load data in Warehouse using Dataflow Gen2

Microsoft Fabric: How to load data in Warehouse using Data Pipeline | End-to-End Scenario

Microsoft Fabric: How to load data in Lakehouse using Spark; Python/Notebook

Microsoft Fabric: Integrated Machine Learning- k-Means clustering

Microsoft Fabric: Create a Date table in Dataflow Gen2, and use in Lakehouse and Warehouse| Time Intelligence with Power BI

Microsoft Fabric: How to load data in Lakehouse using Data Pipeline | End-to-End Scenario

Microsoft Fabric: What are the options to load local files in Lakehouse

Microsoft Fabric: Shortcut with Fabric and for Azure Data Lake | File vs Table Shortcut

Microsoft Fabric: Import Azure SQL Data to Warehouse | Multiple tables using Pipeline

Microsoft Fabric: Incremental Data Load(ETL) for Warehouse using Dataflow Gen 2, SQL Procedure, Data Pipeline

Microsoft Fabric: How about incremental without Dataflow, Only using pipeline and SQL procedure for Microsoft Fabric Warehouse

Microsoft Fabric: Load local SQL server data in Lakehouse using on-premise Gateway and Dataflow Gen2

Microsoft Fabric: Dataflow Gen2: Native warehouse Destination- Power Query Transformation — Pivot, Unpivot, Transpose Data

Microsoft Fabric: Load Azure SQL Data to Lakehouse & Warehouse simultaneously with Dataflow Gen2

Microsoft Fabric: Load Snowflake DB Data to Lakehouse & Warehouse simultaneously with Dataflow Gen2

Microsoft Fabric: How to load local SQL server data in Warehouse using Python, Pandas, and sqlalchemy

Microsoft Fabric: How to append only Incremental data using Data Pipeline in Lakehouse

Microsoft Fabric: Load incremental data from the local SQL server to Warehouse using on-premise Python

Microsoft Fabric: Load on-premise data file to Lakehouse using on-premise Python using OAuth Token

Introduction to Microsoft Fabric in Hindi | What it is, Why to Use, Who Should, When and How to use

Microsoft Fabric in Hindi Part 2: Create Workspace, Lakehouse, Warehouse

Microsoft Fabric in Hindi Part 3: How to load data in Lakehouse using Data Pipeline | End to End Scenario

Microsoft Fabric in Hindi Part 4: How to load data in Lakehouse using Dataflow Gen 2 | End to End Scenario

Microsoft Fabric in Hindi Part 5: How to load data in Warehouse using Dataflow Gen 2 | End to End Scenario

Microsoft Fabric in Hindi Part 6: How to load data in Warehouse using Data Pipeline | End to End Scenario

Microsoft Fabric Hindi part 7: How to load data in Lakehouse using Spark; Python using the notebook

Microsoft Fabric Hindi part 8: Do k-Means clustering in spark; analyze in Power BI | Machine Learning

To take your data journey to the next level, I’ll be providing more examples in my upcoming blogs and videos.

Complete Power BI in one Video — 11 Hours

Click here to access all my blogs and videos in a jiffy via an exclusive glossary built with Power BI. Please like, share, and comment on these blogs. I would appreciate your suggestions for improvements, challenges, and topics so that I can explore them in more depth.

In addition, I have over 500 videos on my YouTube channel related to Power BI, Tableau, and Incorta. With their help, you will learn hundreds of DAX and Power Query functions, along with hundreds of Power BI use cases that will help you become an expert in Power BI. Make sure you subscribe, like, and share them with your friends.

Master Power BI
Expertise in Power BI
Power BI For Tableau User
DAX for SQL Users
Learn SQL


Amit Chandak

Amit has 20+ years of experience in Business Intelligence, Analytics, and Data Science. He is Chief Analytics Officer at Kanerika and a Power BI community Super User.