BookmarkSubscribeRSS Feed
Sandeep77
Lapis Lazuli | Level 10

Hi Experts,

I am running the below code where I am added tran_code details to a imported file (June_exception). The codes looks alright and simple but it is running since the last 3 hours. Is anything I am missing in the code which is taking long to show the results?


proc sql;
create table initiative_release as
select a.*,
b.tran_code
from June_Exceptions as a
Inner join
p2scflow.debt_trans as b on a.debt_code=b.debt_code;
quit;

 

Please check and reply. Thank you

3 REPLIES 3
Reeza
Super User
How big are the files? Are they both SAS data sets or is one on a server?
ballardw
Super User

I have a strong suspicion that your are getting way more output records than you expect. Consider this small example doing an inner join on two data sets with 5 records each.

data a;
   input code value1;
datalines;
1  1
1  2
1  3
2  11
2  22
;

data b;
   input code value2;
datalines;
1  15
1  25
1  35
2  111
2  222
;

proc sql;
   create table example as
   select a.*, b.value2
   from a 
        inner join
        b
        on a.code=b.code
   ;
quit;

There are not 5 output records there are 13 because of multiple values of the code variable in each set. 3 values of 1 in each means there are 9 output records for code=1. So if you have any duplicates of debt_code in any set you get multiple output records for each match. With n duplicates of the the code value in one set and m duplicates in the second set you get n * m records for each code.

 

You may want to run proc freq on the debt_code variable in your two sets and consider just how many matches that will be.

 

Kurt_Bremser
Super User

Get to know your data (Maxim 3).

  • How many observations are in both datasets
  • Are there multiples of the keys in both datasets (leading to a cartesian join)
  • What are the variable attributes (should COMPRESS be used)

Since you use single level dataset names, all I/O concentrates on WORK, which can cause lots of latencies on spinning metal disks. So the physical structure of your available storage may allow spreading the load.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 567 views
  • 0 likes
  • 4 in conversation