Every now and then I find myself need reminding one of this handy queries. While googling them is easy enough, it is easier to have it all in my own blog.
Here is some of my Teradata Queries that I often need for my work, and perhaps you need for your work.
Create Table with Select
I often forgotten the complete syntax of this very useful query.
--to create one on the database CREATE MULTISET TABLE db.table_name AS ( SELECT * FROM source_data WHERE conditions = value ) WITH DATA PRIMARY INDEX (ID, SNAP_DATE); -- and the volatile table version CREATE MULTISET VOLATILE TABLE vt_table_name AS ( SELECT * FROM source_data WHERE conditions = value ) WITH DATA PRIMARY INDEX (ID, SNAP_DATE) ON COMMIT PRESERVE ROWS; --and to create one without select CREATE VOLATILE TABLE vt_table ( ACCT_ID VARCHAR(10) ,SNAP_D DATE ) ON COMMIT PRESERVE ROWS;
Qualify Rank or Row_number Statement
Qualify Rank() or Row_number() can be used as a filter condition, or as a created column/value inside the select statement. If this select query is created in a table, the values then can be used in a where clause.
--to use it as a filter SELECT * FROM [SOURCE_DATA] WHERE [CONDITION] = 1 QUALIFY RANK() OVER (PARTITION BY [ID] ORDER BY [ORDERED_COLUMN]) = 1 --or QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [ORDERED_COLUMN]) = 1 --to create rank or row_number as value. SELECT * ,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [ORDERED_COLUMN]) AS ROW_NUM FROM [SOURCE_DATA] WHERE [CONDITION] = 1
If you wonder the difference between Rank() and Row_number(), the difference is that the same value can have the same rank number. For example if there are two smallest numbers, then the first two values will both have Rank value 1, then the third number will have rank = 3, and so on. While Row_number will randomly choose and assigned a value for each row, and thus no duplicate value.
Analytical Functions Using Partition By
This is very useful to quickly get a summary or average for a group of data without running the query with multiple where clause or copying to Excel.
--To get total amount per particular value (like customer, or division, etc) SUM(amount) over (PARTITION BY CUST_ID) As Total_amount_per_customer --Same as above, but value will be ordered by Account ID SUM(amount) over (PARTITION BY CUST_ID ORDER BY ACCT_ID) As Total_amount_per_customer --Same as above, but value will be Cumulative SUM(amount) over (PARTITION BY CUST_ID ORDER BY ACCT_ID ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) As Total_amount_per_customer
Preceding and Following
This syntax is useful to get the value of the previous row. However this works only if the previous value is not calculated value. If it is, then recursive query is the way.
This is useful, for example, when trying to compare the difference between previous and current period balances.
SUM(amount) over (PARTITION BY CUST_ID ORDER BY ACCT_ID ROWS BETWEEN UNBOUNDED PRECEDING AND PRECEDING ROW ) As prev_amount SUM(amount) over (PARTITION BY CUST_ID ORDER BY ACCT_ID ROWS BETWEEN UNBOUNDED FOLLOWING AND FOLLOWING ROW ) As next_amount
This is by far the most often googled query I have ever done.
Recursive Query can be used to setup tree-based/node-based hierarchy, or to get the calculated value of previous row.
In the example below, the code used calculated value of previous level in the current level.
WITH RECURSIVE NSEED AS ( SELECT ID ,10+1 AS VAL ,LEVEL --predefined value FROM [SOURCE_TABLE] WHERE LEVEL = 1 UNION ALL SELECT CURR.ID ,PREV.VAL + 1 AS VAL_NEW --new value based on previous level ,CURR.LEVEL FROM NSEED AS PREV JOIN [SOURCE_TABLE] AS CURR ON PREV.ID = CURR.ID AND PREV.LEVEL + 1 = CURR.LEVEL --set the next hierarchy level ) SELECT * FROM NSEED
In the above example, column [level] is already predefined. But what if you don’t have such column? You can simply create them based on a rank and populate them with the qualify statement as mentioned above.
Check Table Skew Factor
In Teradata, tables with high skew factor eats a large chunk of storage. It is a good idea to check the efficiency of your table from time to time.
SELECT TABLENAME, SUM(CURRENTPERM) /(1024*1024) AS CURRENTPERM, (100 - (AVG(CURRENTPERM)/MAX(CURRENTPERM)*100)) AS SKEWFACTOR FROM DBC.TABLESIZE WHERE DATABASENAME= <your database> AND TABLENAME =<your table> GROUP BY 1;
Find the table owner
If you are like me, often don’t remember the tables that you created, or simply want to find out who is the creator of a certain table, then this query is for you.
SELECT DatabaseName, TableName, CreateTimeStamp, LastAlterTimeStamp, CreatorName FROM DBC.TablesV WHERE TableKind = 'T' --T for table, V for View and DatabaseName = '[Change to your DB Name]' ORDER BY TableName;
Simple update query, always forgot the syntax.
--simple basic update query UPDATE <your table> SET <your column> = <your value> WHERE <condition> = 1 --this is most important, otherwise the query will update all records --update query if you need to join UPDATE TGT FROM <your table to update> as TGT ,<source table> as SRC SET <your column> = <your value> WHERE TGT.ID = SRC.ID --join criteria AND TGT.<conditions> = 1
Alter table to Add New Column
Simple enough, code to add new column on existing table.
ALTER TABLE <your table> ADD <new column> DATE DEFAULT NULL;