Watch this Ask the Expert session to learn the basics of SQL Joins, the SAS DATA Step Merge and the SAS Studio Query Utility. I’ll also demonstrate, compare and contrast each of these methods for combining data in SAS.
Watch the webinar
You will:
Learn the basics of the SQL Join and the SAS DATA step merge.
Compare and contrast the different methods of combining data.
Learn that combining data is made even easier by using the built-in tools of our interfaces like the Query Utility in SAS Studio.
Leave with examples you can revisit and practice on your own.
The questions from the Q&A segment held at the end of the webinar are listed below and the slides from the webinar are attached.
Q&A
You mentioned Enterprise Guide had something similar to the SAS Studio Query Utility. Where is that found/what is it called?
That is also underneath your tasks for Enterprise Guide. It’s called a task in SAS Studio and it's also a task underneath Enterprise Guide. It’s on your left hand side of the EG screen and it's called the query builder.
Can you merge more than 2 tables?
Yes, if they contain the same common column(s). The BY statement must work for all of those tables that you're combining. In the case where we saw an example where we had a list of students and their health stats, their teachers, and the rooms, we couldn't join those three tables together because Name did not appear in all three tables. If I did have three different data sets that all contain the same student names, then I could list those tables on my merge statement. I can have as many tables as I want on my merge statement if they have the same merging criteria.
Do the formats for columns have to be the same in each table?
The data step is going to assign the attributes to your columns based on how they first appear in your DATA Step, and that goes into compilation, which is covered more heavily in SAS Programming I (I highly recommend that class). If your columns have different formats in each table it will adopt the formats of the first table. So, let's say I had some financial data and in one table, the first table that I mentioned on my merge statement, I've got the dollar format applied to them and then the second table I don't have any format applied to it, it will use that format that is first mentioned. Formats are just a way to change how our data is displayed, so it doesn't alter the underlying data. Formats can easily be changed in SAS so I wouldn't worry too much about that. The main thing that you want to be knowledgeable on is whether or not your columns have the same name, type, and length. Those are the three attributes that are really important.
What is the difference between outer union and using only union?
The difference is going to be the behavior of the columns and the rows. Union by default returns only unique rows. The Union set operator combines the results of your queries aligning columns based on their positions in the SELECT clauses of each query, then removes duplicate rows. It's another set operator that you can use to combine data, but its behavior is different behind the scenes. If you want a true concatenation, use outer union and use CORR option next to it.
Here is the documentation for set operators: https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/sqlproc/n0vo2lglyrnexwn14emi8m0jqvrj.htm
Does it matter which data set to call first when you do the inner join? For example from mylib.class_update inner join mylib.class_teachers vs from mylib.calss_teachers inner join mylib.class_update?
Whichever table you list first is going to be your left table. Second will be your right table. So the first one, it's on the left, second one is on the right, and so when you perform a left join or a right join, that's how you're referencing whether or not you want to include matches from either table. It's the order in which they are stated. You could either just flip the order of your tables if you wanted to get the opposite effect of including matching or non-matching rows. You can also just change the type of join from left to right.
Is there any difference to do: on f1.name=f2.name vs. on (f1.name=f2.name)?
There is no difference. You can use parentheses to organize your join criteria if you like.
I'm interested in taking the Base SAS licensing exam this year. Is everything in the two courses highlighted included in the exam? What other resources might be helpful to successfully become certified?
The Base Programming Specialist Certification Exam covers content from the SAS Programming 1 and SAS Programming 2 courses. A full list of content for this exam can be found here: https://www.sas.com/content/dam/SAS/documents/technical/certification/content-guide/specialist-base-programming.pdf
Visit https://www.sas.com/en_us/certification.html to see other SAS Certifications and to see what content they cover. You can view the recommended courses, Certification Prep Guides, sample questions, and practice exams as well.
Is there a link to the YouTube channel?
SAS How To Tutorials
Are the classes you teach free?
We offer many free resources to make our courses accessible. While live instructor-based classes often have fees, we do have the same courses available in our self-paced eLearning format. These are available through the SAS Learning Subscription which has a free 30-day trial, enough time to complete several courses including SAS Programming 1, SAS Programming 2, and SAS SQL Essentials which were mentioned in this webinar. We also have free tutorials available on the SAS Users page on YouTube. For other training options visit www.sas.com/training
Could you show us another example using point and click way to do SQL?
In our course Exploring SAS Studio for SAS Programmers We show an example of how to use the SAS Studio table viewer to generate PROC SQL code. This is another way, aside from the Query utility, that SAS Studio can be used in a point-and-click way to write SQL code.
How do I NOT get the two names column in the point and click option for the inner join?
The SAS Studio Query utility allows you to add/remove columns in the Columns section of the Settings pane. In our example, we added all columns from both tables, resulting in two columns named Name. Instead, you could remove one of the Name columns from the selected columns.
How do I use SAS utility to make join and filter?
The SAS Studio Query utility includes a filter option in the Settings pane under Columns > Filter.
Could you please show an example of Left/Right joins using data step?
You can find an example of a left join using the DATA Step Merge here: https://youtu.be/ksZH_LoXnKk?t=588s
How do you generate a pre filter before you import the huge data?
The PROC IMPORT step does not have a WHERE statement that can be used to filter data before it’s imported to the specified file type. However, if you’re working with a database, you can use the PROC SQL pass-through facility to push queries to your database that can include WHERE clauses filtering your results before they are returned to SAS. Here is a link to our documentation with syntax for the PROC SQL pass-through facility https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/lrcon/n1kbstf7vw0qcjn1ibfc8c78a9lc.htm
Is the SAS Studio session used in the SAS Certification review course exactly the same as the one used for the Pearson exam? The virtual session seemed different.
The SAS Base Programming Specialist exam includes SAS Studio 3.8 as a programming interface.
Would it be possible for you to show a quick example of how you would replicate a left join using the data step merge?
You can find an example of a left join using the DATA Step Merge here: https://youtu.be/ksZH_LoXnKk?t=588s
Recommended Resources
SAS Tutorial: Combining Data in SAS: DATA Step Versus SQL
SAS On Demand for Academics
SAS Documentation: Joining two tables using PROC SQL
SAS Documentation: DATA Step MERGE statement
Want more tips? Be sure to subscribe to the Ask the Expert board to receive follow up Q&A, slides and recordings from other SAS Ask the Expert webinars.
... View more