
Data Integration Made Easy with the Load Table Step


In 2022, my colleague MK Queen wrote a post on the new Load Table step in SAS Studio Flows, which enables you to load a source table into a target table while controlling how data is loaded. Since then, the Load Table step has been updated to support additional data types, include new features and options, and improve the node's appearance and flexibility on the flow canvas. Allow me to reintroduce you to the Load Table step in its newest form!

 

In this post, I'll demonstrate three ways to use the Load Table step: loading data into a new database table, inserting new rows into an existing table, and upserting rows (updating existing rows and inserting new ones) into an existing table. Keep reading to learn how to integrate your data with ease, regardless of where it's stored or where it's going!

 

The Load Table Step

 

With the Load Table step, you can efficiently load data from a source table into a target table while using a variety of options to optimize processing. You can choose one of three load methods: Insert new rows into the target table, Update existing rows in the target table with new values (based on key columns), or Upsert rows (which performs an insert and an update in a single operation). When inserting rows, you can use pre-processing options like truncating the target table, deleting all rows, or replacing the target table entirely. When loading to Oracle, Snowflake, SingleStore, Azure Synapse, or Teradata target tables, you can set output table options to improve your load efficiency.

 

The Load Table step is flexible and customizable for your needs! The only requirement is that the source and target tables have columns with the same names and types before you use this step. Refer to the graphic below for a summary of step features:

 

01_grbarn_load_1.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

What's changed since the Load Table step was first released?

 

  • You can automatically create a physical target table if one does not exist when loading data with the Upsert Rows technique (previously, this capability was limited to the Insert Rows technique only).
  • You can load data into a Snowflake, Azure Synapse, or SingleStore target table (in addition to previously supported table types).
  • You can use macro variables to represent table and library names in both the Table and Load Table steps (see the sketch after this list).
  • If both the source and target tables are located in a SingleStore database, you can use native SQL code for the Upsert Rows technique, which can improve performance.
  • Most recently, as of LTS 2025.09, you can display or expand the output port for the Load Table step (which was previously hidden).
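
As an example of that macro variable support, here's a minimal sketch. The names TGT_LIB and TGT_TBL are hypothetical; you would define them in your session (for example, in an upstream SAS Program step) and then reference them in the step's library and table fields:

/* Hypothetical macro variables for the target library and table names */
%let tgt_lib = SASORA;
%let tgt_tbl = PRDSALE_DB;

/* In the Load Table step, the target library and table fields could then
   reference &tgt_lib. and &tgt_tbl. instead of hard-coded names. */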

 

Visit the documentation for more information on step capabilities. The Load Table step is available with the SAS Studio Engineer license.

 

Scenario

 

In this post, I'll be using tables that I adapted from the PRDSALE, PRDSAL2, and PRDSAL3 tables in the SASHELP library. These tables store furniture sales data, including actual sales, predicted sales, product type, product, quarter, year, month, and more. I updated each table to include a numeric ORDER_NUMBER variable, which assigns an ID number to each row. Additionally, I converted MONTH from a SAS date variable into a character variable.
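
If you'd like to follow along with similar data, here's a minimal sketch of how a table like PRDSALE1 could be derived from SASHELP.PRDSALE (my exact data preparation may have differed slightly):

data work.prdsale1;
   set sashelp.prdsale;
   order_number = _n_;                   /* numeric ID for each row       */
   month_char = put(month, monname3.);   /* month as a character value    */
   drop month;
   rename month_char = month;            /* keep the original column name */
run;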

 

The first table, PRDSALE1, is a SAS table that stores furniture sales data from 1993 to 1994. PRDSALE1 has 11 columns and 1,440 rows.

 

02_grbarn_load_2.png

 

Similarly, PRDSALE2 stores furniture sales data from 1995 to 1998. PRDSALE2 is a larger table with 11 columns and 23,040 rows.

 

03_grbarn_load_3-1024x372.png

 

PRDSALE3 is the final table, storing furniture sales data from 1997 to 1998. This includes some updated rows and some new rows. PRDSALE3 has 11 columns and 11,520 rows.

 

04_grbarn_load_4-1024x374.png

 

I'll use the Load Table step to load these SAS datasets into one Oracle database table.
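
Before any of this will work, an Oracle library must be assigned in the SAS session. I won't show my actual connection, but a SAS/ACCESS to Oracle LIBNAME statement along these lines (with placeholder values for the connection options) is what sits behind the SASORA libref I'll use for the target:

/* Placeholder connection values; substitute your own Oracle details */
libname sasora oracle
   user=ora_user password="ora_pw" path="ora_path" schema=ora_schema;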

 

Loading data into a new database table

 

First, I'll load PRDSALE1 into my Oracle database.

 

05_grbarn_load_5.png

 

After I connect my table to a Load Table node, I need to set the target table options. My target table doesn't exist yet, so I'll manually type in my library name, SASORA, and my table name, PRDSALE_DB.

 

06_grbarn_load_6-1024x306.png

 

This surfaces a warning message: "No columns were found. The table might not exist yet. Select a different target table or use the Column Structure tab to load or copy the table structure." Recall that one of the requirements of the Load Table step is that the source and target tables must have columns with the same name and type. If the target table doesn't exist yet, we'll need to add in the expected table structure on the Column Structure tab.

 

07_grbarn_load_7-1024x335.png

 

08_grbarn_load_8-1024x393.png

 

After I select the option to Copy structure from source, the expected column structure appears (and the warning message goes away). Now, I can configure my load on the Options tab.

 

09_grbarn_load_9.png

 

I'll choose Insert rows as my load technique and I'll check the option box to create a table if a physical table does not exist. I'll leave the default insert method option, which is the PROC SQL INSERT statement, though you can also select PROC APPEND. Lastly, I'll select the preprocess action to Replace the target table if it exists. That way, if I run this step multiple times, I won't be loading duplicate rows into one table.

 

10_grbarn_load_10.png
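
Taken together, these settings correspond roughly to the SAS logic sketched below. This is a simplified illustration, not the code the Load Table step actually generates (the step handles table creation and the insert as separate operations):

%macro load_prdsale1;
   /* Pre-process: replace the target table if it already exists */
   %if %sysfunc(exist(sasora.prdsale_db)) %then %do;
      proc sql;
         drop table sasora.prdsale_db;
      quit;
   %end;

   /* Create the target table and insert the source rows */
   proc sql;
      create table sasora.prdsale_db as
         select * from work.prdsale1;
   quit;
%mend load_prdsale1;
%load_prdsale1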

 

Once my settings are configured, I can run the node. After the step has run successfully, I can right-click the output table and select Expand to view the target table separately from the Load Table node.

 

11_grbarn_load_11-1024x390.png

 

By previewing my data and checking the log, I can confirm that PRDSALE1 was loaded into Oracle.

 

Inserting new rows into a target table

 

Next, I want to insert the new rows from PRDSALE2 into PRDSALE_DB.

 

12_grbarn_load_12.png

 

I've connected PRDSALE2 to a new Load Table node on my flow canvas and added the target table information to the node.

 

13_grbarn_load_13-1024x475.png

 

I don't need to add the column structure, but I will check the column resolution now that I'm working with two different tables. I can see that my columns do not match up perfectly between tables. There are 9 columns with the same name and type that map between both tables, which means I'll be able to insert rows successfully. However, PRDSALE2 has STATE and COUNTY columns, while PRDSALE_DB has REGION and DIVISION columns. This means that any existing STATE and COUNTY values will not be inserted into PRDSALE_DB, and REGION and DIVISION values will be missing for rows inserted from PRDSALE2. I don't need this data for this demo, so the column mismatch is fine, but be mindful of it when working with tables that have different column structures.

 

14_grbarn_load_14.png
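
If you want to check column resolution outside of the step, a quick metadata comparison surfaces the same information. Here's a sketch using PROC SQL's dictionary tables; it assumes the source table is WORK.PRDSALE2 and lists any column whose name or type doesn't line up between the two tables:

proc sql;
   select coalesce(s.name, t.name) as column_name,
          s.type as source_type, t.type as target_type
      from (select upcase(name) as name, type
              from dictionary.columns
              where libname='WORK' and memname='PRDSALE2') as s
           full join
           (select upcase(name) as name, type
              from dictionary.columns
              where libname='SASORA' and memname='PRDSALE_DB') as t
           on s.name = t.name
      where s.name is missing or t.name is missing or s.type ne t.type;
quit;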

 

Next, I'll set the load options. I'll choose Insert rows as my load technique again, but I won't select the options to create a table or to do any preprocessing. Instead, since I'll be loading a table with over 20,000 rows, I'll expand the output table options and select Bulk load. This will use the Oracle bulk-load facility to process the dataset faster.

 

15_grbarn_load_15.png
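
Under the hood, a bulk-load insert to Oracle looks something like the sketch below, which uses the BULKLOAD= data set option on PROC APPEND (the code the step generates may differ). The FORCE option accounts for the column mismatch noted earlier:

/* BULKLOAD=YES invokes the Oracle bulk-load facility (SQL*Loader);
   FORCE allows the append even though some columns don't match */
proc append base=sasora.prdsale_db (bulkload=yes)
            data=work.prdsale2 force;
run;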

 

After running the step, I can check the log to confirm that the rows were inserted with bulk loading. I can also read the SQL*Loader log file for more information on the load.

 

Upserting rows into a target table

 

Lastly, I'll perform an upsert with the PRDSALE3 table, which contains a mixture of new rows and updates to existing rows.

 

16_grbarn_load_16.png

 

As in the previous sections, I've connected PRDSALE3 to a new Load Table node, added the target table information, and double-checked the column resolution.

 

17_grbarn_load_17.png

 

When I select Upsert rows for my load technique, my options change. Now, I need to add a key column to match rows for any row updates; I've selected ORDER_NUMBER as my key column. I can still select an insert method, but the pre-processing options go away now that we're doing more than a simple insert. The output table options are still available, and I'll choose to bulk load again since this table has over 10,000 rows.

 

18_grbarn_load_18.png

 

19_grbarn_load_19.png

 

Note that upserts may take longer to run than inserts or updates because they are performing both actions at once. After running the step, I can check the log to confirm that both inserts and updates occurred. First, 2,000 rows were updated, then 9,520 rows were inserted into the target table.
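
In PROC SQL terms, the upsert amounts to something like the sketch below: update the rows whose ORDER_NUMBER already exists in the target, then insert the rest. This is an illustration rather than the step's generated code; only a couple of columns appear in the update for brevity, and the insert assumes the source and target column structures line up:

proc sql;
   /* Update target rows that have a matching key in the source */
   update sasora.prdsale_db as t
      set actual = (select s.actual from work.prdsale3 as s
                    where s.order_number = t.order_number),
          predict = (select s.predict from work.prdsale3 as s
                     where s.order_number = t.order_number)
      where t.order_number in (select order_number from work.prdsale3);

   /* Insert source rows whose key is not yet in the target */
   insert into sasora.prdsale_db
      select * from work.prdsale3
      where order_number not in
            (select order_number from sasora.prdsale_db);
quit;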

 

Now, I have successfully loaded my three SAS tables PRDSALE1, PRDSALE2, and PRDSALE3 into one target database table, PRDSALE_DB.

 

Summary

 

In this post, I showed you how to use the Load Table step to handle multiple data integration scenarios. If you want to learn more about data integration in SAS Studio, check out these posts:

 

 

Do you use the Load Table step already, or is this the first time you're hearing about it? Did you learn anything new about the Load Table step from this post? Share your thoughts, questions, and feedback below!

 

 

Find more articles from SAS Global Enablement and Learning here.
