BookmarkSubscribeRSS Feed

Manipulating Data in SAS Studio Flows Part 1: Appending Data

Started 2 weeks ago by
Modified 2 weeks ago by
Views 132

SAS Studio Flow functionality has been booming over the last several months. In the last year, dozens of steps have been added and even more have been updated! That means now is better than ever to learn some SAS Studio Flow basics. I’ll be posting a series of blogs covering how to do common data management tasks in SAS Studio Flows, starting with appending data.

 

Insert Rows step

 

The Insert Rows step uses a SQL INSERT statement to add rows to a table from a generated query. Use of this step requires a SAS Studio Analyst license.

 

This step allows you to append rows from the input table to existing rows in the output table, replace the output table data with the new input table data, or populate a new table with rows from the input table. Read the documentation for a full description of step capabilities. 

 

Simply connect your source table to the input port, connect your target table to the output port, check your column resolution, and review the output table options. The options include creating a table if physical table does not exist and deleting all existing rows in the output table.

 

For Insert Rows to run successfully, source and target tables must have at least one mutual column (column with the same name and same type). If not, you can edit your target table’s column structure on the table node’s Published columns tab or by adding a column editing node to your flow.

 

Let’s review an example. I have a target table, PAYROLL, with pay information for fictitious employees. I’d like to append some new rows from PAYROLL2 to this table.

 

grbarn_append_1.png

Select any image to see a larger version.
Mobile users: To view the images, select the "Full" version at the bottom of the page.

 

When I connect my source and target tables, there’s an info symbol on the Insert Rows node. This tells me there’s an issue with the column resolution between these tables.

 

grbarn_append_2.png

 

PAYROLL has an IdNumber column, while PAYROLL2 has an idnum column. I’ll correct this by adding a Manage Columns step and renaming idnum to match.

 

grbarn_append_3.png

 

Now, the column resolution has been resolved and I can run this flow with no issues. The log shows that a SQL INSERT statement was submitted to append this data.

 

grbarn_append_4.png

 

Load Table step

 

The Load Table step can insert, update, or upsert rows from a source table to a target table. Not only does this step boast update capabilities, but the user can select their preferred insert method: PROC SQL INSERT statement or PROC APPEND. (Note that this feature became available in the 2023.09 Viya stable release). Use of this step requires a SAS Studio Engineer license.

 

This step performs three primary functions. First, you can insert new source table rows into the target table, with preprocessing action options to control what happens to existing target table rows. Second, you can update existing rows in your target table based on matching key column values (like a product’s price or an employee’s salary). Lastly, you can do inserts and updates at once with the “upsert” option. Read the documentation for a full description of step capabilities. 

 

The Load Table step configuration is slightly different from Insert Rows. Connect your source table to the input port and select your target table on the Target Table node tab. Then, configure your insert or update settings on the options tab. If you’re creating a new table, you can define the table metadata within the Load Table step. Otherwise, you can review source and target column mapping on the Column Resolution tab.

 

Another plus is that the Load Table step supports bulk loading for certain database tables, which promotes efficiency.

 

Let’s review an example working with the same tables, PAYROLL and PAYROLL2.

 

grbarn_append_5.png

 

PAYROLL2 and the Manage Columns node are connected to the Load Table input port. PAYROLL is selected as the target table.

 

All table load options are set on the options tab. First, I want to insert rows with PROC SQL, but note that I could do an update or upsert as well.

 

grbarn_append_6.png

 

This step and setting produced code similar to our previous results from the Insert Rows step.

 

grbarn_append_7.png

 

I can also change the insert method to PROC APPEND. I’ll force table concatenation for non-matching columns and suppress warnings in the log.

 

grbarn_append_8.png

 

Now, the generated and submitted code has changed to use PROC APPEND.

 

grbarn_append_9.png

 

Merge Table step

 

The Merge Table step uses a SQL MERGE statement to insert, update, or upsert rows from a source table to a target table. This functionality is very similar to the Load Table step, but it’s limited to certain databases that allow SQL merges (Oracle, Teradata, Snowflake, SQL Server, etc). Since this step processes in-database, this makes for very efficient insert/update operations. This step also requires a SAS Studio Engineer License.

 

Read the documentation for a full description of step capabilities. For a full Merge Table step tutorial, check out my post Data Integration with the Merge Table step in SAS Studio Flows.

 

Append Table custom step

 

The Append Table custom step, available through the custom step repository, uses PROC APPEND to append new rows to a target table. Use of SAS custom steps in a flow requires a SAS Studio Analyst license.

 

For this step, connect your source table to the input port and the target table to the output port. If your target table does not exist, your source table will simply be copied as the target table. Like Load Table, you can choose to force append and suppress warnings in the log.

 

grbarn_append_10.png

 

This step uses SAS macros, which can be reviewed in the log. The PROC APPEND operation can be reviewed here.

 

grbarn_append_11.png

 

 

Considerations and Summary

 

Which tool should you use for your needs? Consider factors like your SAS Studio license, your data type, and the amount of rows you want to append. You can use the following table for a quick reference guide:

 

grbarn_append_12.png

 

In this post, I’ve discussed four different point-and-click steps for appending data in SAS Studio Flows: Insert Rows, Load Table, Merge Table, and the Append Table custom step. Stay tuned for future posts on common data manipulation tasks in SAS Studio Flows!

 

Did you learn a new way to append data through this post? Let me know in the comments!

Version history
Last update:
2 weeks ago
Updated by:
Contributors

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Free course: Data Literacy Essentials

Data Literacy is for all, even absolute beginners. Jump on board with this free e-learning  and boost your career prospects.

Get Started

Article Tags