All About Snowflake Database (2)
API Integration
Dora Auto-Grader Tool (Powered by Snowflake)
Dora is a tool used by Snowflake University to auto-grade lab work and issue badges.
Technical Architecture (For Seasoned Professionals)
Dora's grading process involves several components:
Snowflake External Function: This is the main component that students call.
API Integration: The External Function uses API Integration to securely connect to external services.
REST API Call: The API Integration sends a REST API call to an AWS Lambda function.
Lambda Function: This external serverless function receives the data.
It logs the information received and keeps track of the Snowflake Account Locator that initiated the call.
Data Analysis: The Lambda function then loads the data back into a Snowflake account for analysis and assessment against the lab criteria.
Badge Issuance: If successful, a badge is issued.
Learner Setup Requirements
To use Dora, each learner must complete two one-time setup steps for each Snowflake trial account:
Run API Integration Script: A short script must be run once for each Snowflake trial account to create the necessary API integration.
Create GRADER Function: Run a snippet of code to create the External Function in a database.
This function is usually created in a database named GRADER.
How the Learner Submits Work
The auto-grading process has three stages that the learner executes:
Lab Task: The learner completes a task (for example, "set up this table and load three rows of data").
Check Code: The learner runs provided code to verify their work locally ("check to see if you set up that table...").
Grader Function Submission: The learner runs the check code snippet wrapped inside the GRADER function to send the results to Dora ("send this code to the auto grader...").
Create An API Integration :
Step 1 : Create a new SQL file called API Integration. Select role as Account Admin. Select database as Demo db and schema as public.
Run Below given query
use role accountadmin;
create or replace api integration dora_api_integration
api_provider = aws_api_gateway
api_aws_role_arn = 'arn:aws:iam::321463406630:role/snowflakeLearnerAssumedRole'
enabled = true
api_allowed_prefixes = ('https://awy6hshxy4.execute-api.us-west-2.amazonaws.com/dev/edu_dora');Create the Grader Function
The GRADER function is a specific Snowflake External Function used mainly in Snowflake training, workshops, or learning paths. It automatically checks your progress and verifies your results.
What the GRADER Function Does
This function serves as an auto-grader for hands-on exercises:
Validation: When you run a query that calls UTIL_DB.PUBLIC.GRADER(...), the function takes in parameters like your actual result, the expected result, and a step description.
External Call: As an External Function, it doesn't run its code within Snowflake. Instead, it securely sends your results online to an external web service at this URL: https://awy6hshxy4.execute-api.us-west-2.amazonaws.com/dev/edu_dora/grader.
Grading: This external service processes the results by comparing your actual value to the expected value. It logs your progress for the specified step.
Feedback/Badge: The external service often returns a result to Snowflake, indicating whether you passed or failed the step. Completing all steps tracked by the grader usually leads to earning a badge or certificate for the course.
Key Technical Details
External Function: This is a special type of User-Defined Function (UDF) in Snowflake that calls a function hosted outside of Snowflake, specifically on AWS API Gateway and a backend service.
API_INTEGRATION = dora_api_integration: This setting informs Snowflake which secure connection and authentication method to use when communicating with the external web service.
Parameters:
step: The name or ID of the specific exercise step you are currently on.
passed: A boolean (TRUE/FALSE) that shows whether your SQL result matched the correct answer.
actual/expected: The values you achieved compared to the required ones.
description: Additional details about the test.
In summary, you are creating a secure communication link from your Snowflake environment to a Snowflake education server. This setup allows your exercise results to be automatically checked and recorded.
Run below given query :
use role accountadmin;
create or replace external function demo_db.public.grader(
step varchar
, passed boolean
, actual integer
, expected integer
, description varchar)
returns variant
api_integration = dora_api_integration
context_headers = (current_timestamp, current_account, current_statement, current_account_name)
as 'https://awy6hshxy4.execute-api.us-west-2.amazonaws.com/dev/edu_dora/grader'
;
Check if the grader function is working.Run the following Query.use role accountadmin;
use database demo_db;
use schema public;
select grader(step, (actual = expected), actual, expected, description) as graded_results from (SELECT
'DORA_IS_WORKING' as step ,(select 123) as actual ,123 as expected ,'Dora is working!' as description );Checking schemas in GARDEN_PLANTS database.Queries to Alter the database.Typo: ALTER SCHEMA GARDEN_PLANTS.WEGGIES RENAME TO GARDEN_PLANTS.VEGGIES;
Wrong Place: ALTER SCHEMA DEMO_DB.VEGGIES RENAME TO GARDEN_PLANTS.VEGGIES;
SELECT * FROM GARDEN_PLANTS.INFORMATION_SCHEMA.SCHEMATA where schema_name in ('FLOWERS','FRUITS','VEGGIES');select count(*) as schemas_found, '3' as schemas_expected from GARDEN_PLANTS.INFORMATION_SCHEMA.SCHEMATA where schema_name in ('FLOWERS','FRUITS','VEGGIES');Using the INFORMATION_SCHEMA to Query Metadata Metadata Definition: The term metadata means "data about data." Location: The INFORMATION_SCHEMA is a special schema that is automatically created in every Snowflake Database. Purpose: The INFORMATION_SCHEMA holds metadata for the database it is in. Contents: It contains statistics and details about the database's objects, including: - The number of databases, schemas, tables, and views. - Object names and other object-specific details. Practical Use: People often use the INFORMATION_SCHEMA to check their work and confirm that tasks, like creating tables or views, were done correctly by querying the metadata it holds.Run the below Query.Run Each Statement in Your Worksheet to Send a Report to DORA About Your Schemas
--You can run this code, or you can use the drop lists in your worksheet to get the context settings right. use database DEMO_DB;
use schema PUBLIC;
use role ACCOUNTADMIN;
--Do NOT EDIT ANYTHING BELOW THIS LINE
select GRADER(step, (actual = expected), actual, expected, description) as graded_results from (
SELECT 'DWW01' as step ,( select count(*)
from GARDEN_PLANTS.INFORMATION_SCHEMA.SCHEMATA
where schema_name in ('FLOWERS','VEGGIES','FRUITS')) as actual ,3 as expected ,'Created 3 Garden Plant schemas' as description
);Run the below given query.--Remember that every time you run a DORA check, the context needs to be set to the below settings.
use database DEMO_DB;
use schema PUBLIC;
use role ACCOUNTADMIN;
--Do NOT EDIT ANYTHING BELOW THIS LINEselect GRADER(step, (actual = expected), actual, expected, description) as graded_results from (
SELECT 'DWW02' as step
,( select count(*)
from GARDEN_PLANTS.INFORMATION_SCHEMA.SCHEMATA
where schema_name = 'PUBLIC') as actual
, 0 as expected
,'Deleted PUBLIC schema.' as description
);Run Each Statement in Your Worksheet to Send a Report to DORA About Your Tables.
Run the following queries.-- Do NOT EDIT ANYTHING BELOW THIS LINE
-- Remember to set your WORKSHEET context (do not add context to the grader call)select GRADER(step, (actual = expected), actual, expected, description) as graded_results from (
SELECT 'DWW03' as step
,( select count(*)
from GARDEN_PLANTS.INFORMATION_SCHEMA.TABLES
where table_name = 'ROOT_DEPTH') as actual
, 1 as expected
,'ROOT_DEPTH Table Exists' as description
);Use Query History to Check your test results.Go to home page. From sidebar section select monitoring inside that select Query History.Run this in your worksheet to send report to DORA.--Set your worksheet drop list role to ACCOUNTADMIN --Set your worksheet drop list database and schema to the location of your GRADER function
-- DO NOT EDIT ANYTHING BELOW THIS LINE. THE CODE MUST BE RUN EXACTLY AS IT IS WRITTEN select GRADER(step, (actual = expected), actual, expected, description) as graded_results from ( SELECT 'DWW04' as step ,( select count(*) as SCHEMAS_FOUND
from DEMO_DB.INFORMATION_SCHEMA.SCHEMATA) as actual , 2 as expected , 'DEMO_DB Schemas' as description
);--Set your worksheet drop list role to ACCOUNTADMIN --Set your worksheet drop list database and schema to the location of your GRADER function
-- DO NOT EDIT ANYTHING BELOW THIS LINE. THE CODE MUST BE RUN EXACTLY AS IT IS WRITTEN select GRADER(step, (actual = expected), actual, expected, description) as graded_results from (
SELECT 'DWW05' as step
,( select row_count
from GARDEN_PLANTS.INFORMATION_SCHEMA.TABLES
where table_name = 'ROOT_DEPTH') as actual
, 3 as expected
,'ROOT_DEPTH row count' as description
);Loading Data into SnowflakeAlternatives to INSERT Statements You do not need to repeat INSERT statements for each row of data you want to load. There are many ways to load data into Snowflake. One common and efficient method is to load data from CSV files using Snowflake's load wizard. Data Preparation and Checks (Before Loading) Before trying to load data from files like CSVs, you should check the source files for specific features: Delimiters: Check what symbols separate columns (fields). Examples include commas, tabs, or other characters. Header Rows: Determine if the file has a header row, which contains column names. Other Stuff: Check for additional structural details relevant to the file format.Create CSV file :CSV File Structure and InspectionCSV Definition: CSV stands for Comma Separated Values.Delimiter: In a CSV file, values are separated by a comma (,) between each value in a row.Data Example: In Uncle Yer's file, the words "Deep," "Shallow," and "Medium" were shortened to their first letters (D, S, M).Viewing in Spreadsheet Programs: Programs like Excel or Google Sheets automatically interpret the commas as separators. They display the values in different columns, so users do not see the commas.Important Inspection Method: It is important to open the file with a simple text editor, like Notepad or BBEdit. This lets you see the actual commas separating the values. This method is needed because other characters, such as tabs or pipes, are sometimes used as separators in data files.Create a Vegetable Details Table :Query :create table garden_plants.veggies.vegetable_details ( plant_name varchar(25) , root_depth_code varchar(1) );Create Excel File :From the sidebar go to the Catalog. And go to your database and click on load data.Select your excel file and select database, schema and table to upload the data.If you see this kind of error then remove the column name from excel data.Select Generate SQL from table.Run this query.SELECT * FROM "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS" LIMIT 10;Load the second file in database.Run below Query. SELECT * FROM "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS" ORDER BY PLANT_NAME DESC LIMIT 20;Run below command to send report to dora.--Set your worksheet drop list role to ACCOUNTADMIN --Set your worksheet drop list database and schema to the location of your GRADER function
-- DO NOT EDIT ANYTHING BELOW THIS LINE. THE CODE MUST BE RUN EXACTLY AS IT IS WRITTEN select GRADER(step, (actual = expected), actual, expected, description) as graded_results from ( SELECT 'DWW06' as step ,( select count(*)
from GARDEN_PLANTS.INFORMATION_SCHEMA.TABLES
where table_name = 'VEGETABLE_DETAILS') as actual , 1 as expected ,'VEGETABLE_DETAILS Table' as description
);View Your Vagetable details table.View Your Table data.Notice a plant that appears twice in list.Find a way to get rid of it.View data again.Run below Query.select * from "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS"Spinach is two times in table.select * from "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS" where PLANT_NAME = 'Spinach';One of them has an "S" for shallow roots and the other has "D" for deep roots. We need to get rid of the row that says spinach roots are deep. First lets isolate the "D" row.Run below Query.select * from "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS" where PLANT_NAME = 'Spinach' AND ROOT_DEPTH_CODE = 'D';Remove the row.Run this query.delete from "GARDEN_PLANTS"."VEGGIES"."VEGETABLE_DETAILS" where PLANT_NAME = 'Spinach' AND ROOT_DEPTH_CODE = 'D';check the table data again and make sure no plant names are repeated.Run this code to send data to DORA.--Set your worksheet drop list role to ACCOUNTADMIN --Set your worksheet drop list database and schema to the location of your GRADER function
-- DO NOT EDIT ANYTHING BELOW THIS LINE. THE CODE MUST BE RUN EXACTLY AS IT IS WRITTEN select GRADER(step, (actual = expected), actual, expected, description) as graded_results from ( SELECT 'DWW07' as step ,( select row_count
from GARDEN_PLANTS.INFORMATION_SCHEMA.TABLES
where table_name = 'VEGETABLE_DETAILS') as actual , 41 as expected , 'VEG_DETAILS row count' as description
);Other ways to create data rows or interacting with the data.Create the table flower details.Copy the query of Vegetable_details Table.
Comments
Post a Comment