DP-900-2

DP - 900
Module 2 - Structured Data and Relational Database Solutions

* When you move your databases to Azure, you have different solutions based on the level of control or management you want.

1. Running Databases on Azure Virtual Machines (IaaS)

You can transfer your existing Database Management System (DBMS) to Azure by installing it on an Azure Virtual Machine.

Examples: Microsoft SQL Server, MySQL, etc.

You manage:

- OS upgrades and patches

- DBMS upgrades and patches

- Data access and data import

This option gives you full control but requires high maintenance.

2. Serverless Database Solutions (PaaS)

If you want to reduce administrative tasks, consider a serverless approach.

Azure automatically handles servers, upgrades, and patches.

You will still manage data access and data import.

There’s no need to manage VMs or DBMS manually.

Azure SQL Database Family provides several options:

1). Azure SQL Database

This is a fully managed database service.

It works with most features of on-prem SQL Server, but not all.

Missing features include:

- Service Broker (for scheduling tasks)

- Database Mail (for sending emails from DBMS)

This option is ideal for single databases, but it has limited feature compatibility.

2). Azure SQL Managed Instance

This offers nearly 100% compatibility with on-prem SQL Server.

It provides more control over update and patch timings.

This is great for migration with minimal changes.

3). Elastic Pools

This allows multiple databases to share resources like CPU and memory.

Example:

OLTP (sales) uses resources during the day.

Inventory (batch jobs) uses them at night.

This saves costs by optimizing resource use.

4). Azure SQL Edge

This is designed for IoT environments.

It processes continuous data streams in real time.

It uses rolling time windows (e.g., 5-minute sliding window) instead of separate transactions.

5). Open Source Database Options

Azure also supports popular open-source databases:

- MySQL / MariaDB

- PostgreSQL

Use Azure Database for MySQL/PostgreSQL/MariaDB for managed open-source solutions. You can find third-party database engines on Azure Marketplace or install your own on a VM.

* Summary

Option	Type	Management	Best For
Azure VM + SQL Server	IaaS	You manage all	Full control
Azure SQL Database	PaaS	Azure manages	Single DBs
Managed Instance	PaaS	Shared control	Migration needs
Elastic Pools	PaaS	Azure manages	Cost-saving multi-DB setup
SQL Edge	PaaS	Azure manages	IoT & real-time data
Open-source DBs	PaaS/IaaS	Azure or You	Flexible tech stack

Creating Azure SQL

https://diigitaldrafts.blogspot.com/2025/11/dp-900-creating-azure-sql.html

Normalizing Data

* Structured data and relational databases are closely related; you can’t understand one without the other.

* Structured Data

- Structured Data is organized into tables, each with rows and columns.

- Each row represents a unique entity or record, such as one customer.

- Each column represents an attribute, like Customer ID, First Name, or Last Name.

- Each column has a specific data type, including text, number, or date.

Example:

Customer ID First Name Last Name

1 Peter Vogel

2 Jan Vogel

- It is easy to query, organize, and maintain. However, it can lead to duplication if not structured properly.

* Normalization

- Normalization is the process of organizing structured data into tables to reduce redundancy and improve consistency.

- Goal: Each table contains only relevant data for a single entity.

Example: The Customer table should have only customer-related information.

- Problem with Denormalized Tables

- A denormalized table mixes different types of data, such as customer and sales information. This can lead to issues like:

- Duplicate data where the same customer information repeats in multiple rows.

- Inconsistency where an address changes in one row but not in others.

Wasted storage.

Example of Denormalized Data:

Sales Order ID Customer Name Address Product Quantity

001 Peter Vogel Toronto Laptop 2

002 Peter Vogel Toronto Mouse 1

- If Peter changes his address, it must be updated in multiple rows, which likely causes errors.

- Normalized Solution

- Split the data into separate tables:

- Customer Table stores customer info.

- Sales Order Table stores order info.

- Product Table stores product info.

- Each record links through unique identifiers (keys), not by repeating data.

** Primary Key

- A Primary Key (PK) uniquely identifies each row in a table.

- Example:

CustomerID in the Customer Table.

SalesOrderID in the Sales Order Table.

- No two rows can share the same primary key value.

* Database Schema

- After normalization, we define a database schema, which is the structure of the database:

- It lists all tables in the database.

- It defines each table’s schema, including:

- Columns

- Data types

- Primary key

- Relationships with other tables

- Example:

Customer Table → CustomerID (PK), FirstName, LastName

Product Table → ProductID (PK), ProductName, Price

SalesOrder Table → SalesOrderID (PK), CustomerID (FK), ProductID (FK), Quantity

* Relationships Between Tables

- To combine data from multiple tables, use relationships (foreign keys) to link related rows.

Example:

- CustomerID in the SalesOrder Table links to CustomerID in the Customer Table.

- This makes it easy to find all orders for a specific customer.

* Key Takeaway:

- Structured data in relational databases must be normalized to avoid duplication and ensure data integrity. Through schemas and relationships, we create organized, efficient, and easily searchable databases.

Relationships in Structured Data

* After normalizing data, we have multiple related tables, each representing a single business entity. Most real-world transactions involve several entities. To connect these tables, we use relationships.

* Why Relationships Are Needed

- Example: A Sales Order involves:

- The Customer who made the purchase

- The Product that was sold

To provide complete information about one sales order, we must link the:

- Sales Order Table

- Customer Table

- Product Table

* Primary Key and Foreign Key

- Each table has a Primary Key (PK), which is a unique identifier for each record.

- To create a relationship, we copy the primary key from one table and place it as a new column in another table. That new column becomes the Foreign Key (FK).

- Example: CustomerID is the Primary Key in the Customer Table. The same CustomerID is added as a Foreign Key in the SalesOrder Table to link each order to its customer.

- So, if a user asks, “Who placed Sales Order A123?” Look up the order in the SalesOrder Table, then find CustomerID = 0002. Go to the Customer Table and find customer 0002 = Peter Vogel.

- That’s how a relationship connects data across tables.

Table	Key Type	Description
Customer Table	Primary Key → `CustomerID`	Identifies each customer
Product Table	Primary Key → `ProductID`	Identifies each product
SalesOrder Table	Foreign Keys → `CustomerID`, `ProductID`	Connects customer and product for each order

Thus, the SalesOrder Table depends on both the Customer and Product tables.

* Table Dependencies

Relationships create dependencies between tables.

Example:

- SalesOrder depends on Customer (who made the purchase)

- SalesOrder depends on Product (what was purchased)

- This structure ensures data integrity. Every order must link to a valid customer and product.

*Performance Impact

- While relationships improve data structure and accuracy, they can slow down performance. Each lookup requires checking related rows in multiple tables. With many customers and products, these lookups can become time-consuming.

- Tip: Proper indexing and database optimization can reduce lookup delays.

Concept	Description
Relationship	Link between tables using keys
Primary Key (PK)	Unique identifier for each record
Foreign Key (FK)	Copy of another table’s PK to create link
Dependency	One table relies on another (e.g., SalesOrder → Customer)
Challenge	Multiple lookups can affect performance

Key Takeaway: Relationships in structured data link multiple tables together using Primary and Foreign Keys. They ensure data consistency and logical connections, forming the backbone of relational databases.

Other DBMS Features

* When we normalize data, it gets stored in multiple related tables. For example, a sales order might pull information from the Customer, SalesOrder, and Product tables. To make data access faster, simpler, and more consistent, relational databases use Indexes, Views, and Stored Procedures.

1) Indexes: Speed Up Data Retrieval

- An Index works like an index in a book. It helps find data quickly without scanning the whole table.

- If you often search by City, create an index on that column. Now, rows are grouped and sorted by city for quick lookups.

a) Types of Indexes

- 1. Clustered Index

- Controls the physical row order in the table.

- Data is stored in sorted order (like 001, 002, 003).

- Usually on the Primary Key column.

- Only one clustered index is allowed per table.

- 2. Non-Clustered Index

- A separate structure that stores pointers to actual rows.

- Does not affect the physical order.

- You can create multiple non-clustered indexes for different columns.

- Too many indexes can slow down updates. Each insert, update, or delete must also update all related indexes. The ideal balance is around 6 to 7 indexes per table.

2) Views: Simplify and Secure Data Access

- A View is a virtual table. It shows data from one or more tables but does not store it physically.

- Use Views when you often need the same combination of columns or joins. For example, a SalesSummary View may combine columns from the SalesOrder, Product, and Customer tables.

** Benefits

- Simplifies queries with a predefined data structure; there's no need for repeated joins.

- Protects sensitive data by showing only selected columns.

- Restricts access, allowing apps to access the view instead of the actual tables.

- This is ideal for maintaining data privacy and ease of access.

3) Stored Procedures: Improve Performance and Consistency

- A Stored Procedure is a saved block of SQL commands that runs as one unt. Instead of multiple calls from the app to the server, one call executes all steps, improving speed.

**Benefits

- Better performance with fewer network calls and faster processing.

- Consistency as all applications call the same procedure, ensuring uniform results.

- Centralized logic; if business logic changes, you only need to update the procedure, not every app.

- For example, a procedure named ShipSalesOrder might handle all shipping tasks—updating orders, inventory, and status in one go. If the shipping process changes, you only need to update this procedure, and all apps stay consistent automatically.

Feature	Purpose	Benefits	Notes
Index	Speed up data search	Faster queries	Keep limited indexes
Clustered Index	Physical row order	Quick access via PK	Only one per table
Non-Clustered Index	Logical mapping	Flexible searches	Multiple allowed
View	Virtual table	Simplifies queries, hides data	Improves privacy
Stored Procedure	Predefined SQL set	Faster, consistent, reusable	Centralized logic

Querying and Updating the data

* SQL :

- SQL (Structured Query Language), pronounced either “S-Q-L” or “Sequel,” is the industry-standard language used for:

- Retrieving data from relational databases.

- Updating and modifying existing data.

- Defining and controlling database structures and access.

- The American National Standards Institute (ANSI) manages SQL with guidance from the International Standards Organization (ISO).

- Different database vendors offer their own extended versions:

- Microsoft SQL Server: Transact-SQL (T-SQL)

- Oracle: PL/SQL

- MySQL: MySQL SQL dialect

- The core SQL syntax remains consistent, making it transferable across systems.

* 3 Main Parts of SQL :

- SQL is divided into three main components, each serving a specific purpose:

1. DML (Data Manipulation Language)

This is used for retrieving and modifying data in tables. The main commands include:

- SELECT: Retrieve data from tables.

- UPDATE: Modify existing data in tables.

- DELETE: Remove rows from a table.

- INSERT: Add new rows to a table.

Common clauses are:

- FROM: Specifies the table to work with.

- JOIN: Combines data from multiple tables.

- WHERE: Filters rows based on conditions.

Tip: SQL can get complex with many clauses. Simple tasks are easy, but advanced queries can be challenging.

Examples:

-- SELECT

SELECT CustID, FirstName, LastName

FROM Customers

WHERE CustID = '0001';

-- UPDATE

UPDATE Customers

SET LastName = 'Irvin'

WHERE CustID = '0001';

-- DELETE

DELETE FROM Customers

WHERE CustID = '0001';

-- INSERT

INSERT INTO Customers (CustID, FirstName, LastName)

VALUES ('0003', 'Jason', 'Vandeville');

Always include a WHERE clause in DELETE to avoid removing all rows!

2. DDL (Data Definition Language)

This is used to define and manage database structures like tables, views, and indexes.

Main commands include:

- CREATE: Add new tables, views, or procedures.

- DROP: Delete existing tables or database objects.

- ALTER: Modify existing database structures.

Example:

CREATE TABLE Customers (

CustID VARCHAR(10),

FirstName VARCHAR(50),

LastName VARCHAR(50)

);

3. DCL (Data Control Language)

This is used to control access and permissions for users.

Main commands include:

- GRANT: Give users permission to read or modify data.

- REVOKE: Remove permissions previously granted.

Example:

GRANT SELECT, UPDATE ON Customers TO User1;

REVOKE UPDATE ON Customers FROM User1;

SQL Category	Purpose	Common Commands
DML	Manipulate data	SELECT, INSERT, UPDATE, DELETE
DDL	Define structure	CREATE, ALTER, DROP
DCL	Control access	GRANT, REVOKE

* Key Takeaways

- SQL is the universal language for relational databases.

- DML handles data operations, DDL manages structure, and DCL controls access.

- Knowing standard SQL lets you work across different DBMS like SQL Server, MySQL, PostgreSQL, and others.

- Always use WHERE clauses carefully to avoid unwanted data changes or deletions.

Running SQL Statements

https://diigitaldrafts.blogspot.com/2025/11/dp-900-sql-statements-in-azure-data.html

Transaction Integrity (ACID Properties)

Transaction Integrity, also known as Transactional Integrity or Managing Transactions.

* What Is a Transaction?

- A transaction is a set of operations that must be treated as a single logical unit.

- For example, when you transfer $10 from your Savings Account to your Checking Account, it involves two updates:

1. Subtract $10 from Savings

2. Add $10 to Checking

- Together, these two updates form one transaction.

* What Can Go Wrong?

- If the first update succeeds but the second fails, $10 disappears. If another user looks at the data between those two steps, they see inconsistent information. For instance, they might see $10 missing from savings but not yet added to checking.

- The bank's total amount ($70) and your accounts ($60) don’t match, causing inconsistency!

* Transaction Integrity Ensures:

- Data stays consistent

- All updates succeed or none do

- No half-completed transactions are visible

- Data remains safe even after a system failure

The 4 Rules of Transaction Integrity — ACID

Property	Meaning	Example (Bank Transaction)
Atomicity	All operations in a transaction are treated as one single unit. Either all happen or none happen.	Both updates (savings -$10, checking +$10) occur together or not at all.
Consistency	Transaction must move the database from one valid state to another valid state.	Total balance before and after transaction = $70.
Isolation	Ongoing transactions are invisible to others until complete.	Other users can only see the initial or final state, not in-between.
Durability	Once committed, the changes persist permanently, even if a system crash occurs.	After transaction completes, balances stay $10 and $60 even after reboot.

Atomic + Consistent + Isolated + Durable = ACID

Rollback vs Commit

Commit → Transaction completes successfully → changes are saved.

Rollback → Something fails → all updates are undone → data returns to its original consistent state.

For example, if the transfer fails after deducting from savings, the rollback restores both accounts to:

$20 in Savings

$50 in Checking

Data is safe and consistent again.

Concept	Description
Transaction	Group of SQL operations treated as one unit
Goal	Maintain accuracy, consistency, and reliability
ACID	Key rules ensuring transaction integrity
Rollback	Reverts to previous consistent state if any error occurs
Commit	Confirms all updates and saves final state

In Short:

Transaction Integrity = ACID. It ensures the database stays accurate, consistent, isolated, and durable. This means no lost money, no half updates, and no corrupted data.

Costs & Benefits of Relational Databases

* Costs and benefits of using relational databases and why we sometimes need other storage solutions like unstructured data or data warehouses.

* Benefits of Relational Databases

- Relational databases are popular for their flexibility, accuracy, and consistency.

1) Flexible Design:

- Data is spread across multiple related tables.

- Example: From Customer to Sales Order to Product, we can easily answer questions like “Which customers bought which products?”

2) Supports Many Business Scenarios:

- Because of relationships and normalization, we can combine tables in many ways to answer different business questions.

3) No Duplicate Data:

- Each entity’s data is stored only once.

- Example: All customer info lives in the Customer table, avoiding redundancy and ensuring consistency.

* Costs / Limitations of Relational Databases

- Even though relational databases are powerful, they face performance and structural challenges.

1. Multiple Table Updates

- Most business transactions, like transferring money, involve updating several tables.

- This slows down online transaction processing (OLTP), especially when many users are active at the same time.

2. Fixed Schema (Rigid Structure)

- Every row in a table must have the same columns.

- But in real life, data types differ.

- Example:

Books have ISBN

Stoves have safety warnings

Computers have specifications

Forcing these diverse entities into one rigid structure causes inefficiency and complexity.

3. Reporting Performance

- Reports often require processing thousands of rows.

- To create reports, the system must join multiple tables (Customer, SalesOrder, Product).

- This slows down query performance; large reports can take too long to generate.

Solutions to These Problems

Problem	Best Solution	Description
Too many transactions per second (performance issue)	Unstructured Data Systems	Handle large, fast data streams better.
Different data shapes / flexible formats needed	Unstructured or Semi-Structured Storage	Works better when entities don’t share the same schema.
Reporting is too slow due to joins	Data Warehouse (Denormalized Data)	Combines tables (Customer + Sales + Product) into one big table for faster reporting.

* Note:

- Denormalized data means faster reads and slower updates.

- It’s great for reporting, but not for real-time transactions.

Aspect	Structured Data (Relational DB)	Unstructured / Data Warehouse
Design	Highly structured (tables, rows, columns)	Flexible / combined tables
Speed	Slower with large joins	Faster for bulk reads
Use Case	Daily transactions (OLTP)	Analytics, reports (OLAP)
Example	Bank, E-commerce apps	Business intelligence dashboards

In Short

Relational Databases are consistent, reliable, and flexible.

Yet, they struggle with high-speed transactions, diverse data, and large-scale reporting.

Therefore, we use Unstructured Data for flexibility and Data Warehouses for efficient reporting.

Search This Blog

DigitalDrafts

DP-900-2

DP - 900
Module 2 - Structured Data and Relational Database Solutions

Creating Azure SQL

Normalizing Data

Relationships in Structured Data

Other DBMS Features

Querying and Updating the data

Running SQL Statements

Transaction Integrity (ACID Properties)

Costs & Benefits of Relational Databases

Comments

Post a Comment

Popular posts from this blog

AI-900-3,4

AI-900 12,13

AI-900 10,11

DP-900-2

DP - 900 Module 2 - Structured Data and Relational Database Solutions

Creating Azure SQL

Normalizing Data

Relationships in Structured Data

Other DBMS Features

Querying and Updating the data

Running SQL Statements

Transaction Integrity (ACID Properties)

Costs & Benefits of Relational Databases

Comments

Post a Comment

Popular posts from this blog

AI-900-3,4

AI-900 12,13

AI-900 10,11

DP - 900
Module 2 - Structured Data and Relational Database Solutions