SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
UshaLatha
Obsidian | Level 7

Hi All,

 

I need some help in optimizing my code.

I have two datasets, sample and history. Where the sample contains randomly selected rows from the history file.

Later I want to inner join the sample with history based on some condition.

 

Sample dataset:

loan_id month var1 var2 var3

1111  3   a1  b1 c1

2222  8   a2  b2 c2

3333  5   a3  b3 c3

4444  2   a4  b4 c4

 

History dataset:

loan_id month var1 var2 var3

1111  3   a1  b1 c1

1111  4   

1111  5  

2222  8   a2  b2 c2

2222  9  

2222  10  

2222  11  

3333  5   a3  b3 c3

3333  6 

3333  7 

3333  8 

3333  9 

4444  2   a4  b4 c4

4444  3  

4444  4   

4444  5  

4444  6  

 

output dataset:

1111  3  a1  b1 c1

1111  4  a1  b1 c1 

1111  5  a1  b1 c1 

2222  8  a2  b2 c2

2222  9  a2  b2 c2

2222  10 a2  b2 c2

2222  11 a2  b2 c2

3333  5  a3  b3 c3

3333  6  a3  b3 c3

3333  7  a3  b3 c3

3333  8  a3  b3 c3

3333  9  a3  b3 c3

4444  2  a4  b4 c4

4444  3  a4  b4 c4

4444  4  a4  b4 c4 

4444  5  a4  b4 c4

4444  6  a4  b4 c4

 

The main aim is to get all the month records from history dataset, but retain the values of var1, var2, var3 from the sample dataset itself. I have achieved this through proc sql. my code is below:

 

proc sql;

create table outds as

select a.loan_id, a.var1, a.var2, a.var3, b.month

from sample a, history b

where a.loan_id = b.loan_id

and a.month <= b.month

quit;

 

I want to optimize this, if there is any alternative way to do this in sas datastep or any other procedure in sas, as I am planning to do further analysis on the sample.

Please help me with this.

 

Thanks

 

 

 

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

 

data Sample;
input loan_id$ month var1$ var2$ var3$;
datalines;
1111 3 a1 b1 c1
2222 8 a2 b2 c2
3333 5 a3 b3 c3
4444 2 a4 b4 c4
;

data History;
input loan_id$ month var1$ var2$ var3$;
infile datalines truncover;
datalines;
1111 3 a1 b1 c1
1111 4
1111 5
2222 8 a2 b2 c2
2222 9 
2222 10
2222 11
3333 5 a3 b3 c3
3333 6
3333 7
3333 8
3333 9
4444 2 a4 b4 c4
4444 3
4444 4
4444 5
4444 6
;

proc sort data=sample; by loan_id; run;
proc sort data=History; by loan_id; run;

data outds;
	merge History(keep=loan_id month) Sample;
	by loan_id;
run;
UshaLatha
Obsidian | Level 7

Hi,

 

I forgot to add scenario where the starting month may not be same in history and sample. That was the reason I was joining on month (a.month <= b.month)

 

Also the history may have many more variables... but I want to retain only var1, var2, var3

Eg:

History dataset:

loan_id month var1 var2 var3 var4 var5 var6

1111  1 l r y u d a f

1111  2 f y e t s g

1111  3   a1  b1 c1 x y z

1111  4   p q r s t u

1111  5   e f g h i j

2222  7   k l m n o p

2222  8   a2  b2 c2 x y z

2222  9  

2222  10  

2222  11  

3333  5   a3  b3 c3 x y z

3333  6 

3333  7 

3333  8 

3333  9 

4444  1

4444  2   a4  b4 c4 x y z

4444  3  

4444  4   

4444  5  

4444  6  

 

The output should be:

loan_id month var1 var2 var3 var4 var5 var6

1111  3   a1  b1 c1 x y z

1111  4   a1  b1 c1 s t u

1111  5   a1  b1 c1 h i j

2222  7   a1  b1 c1 n o p

2222  8   a2  b2 c2 x y z

2222  9   a2  b2 c2

2222  10  a2  b2 c2

2222  11  a2  b2 c2

so on....

 

In your merge statement contains only loan_id. This may not give same results...

Kurt_Bremser
Super User

@PeterClemmensen's data step needs a slight modification:

data outds;
  merge
    History (in=a keep=loan_id month)
    Sample (in=b rename=(month=smonth))
  ;
  by loan_id;
  if a and b;
  if smonth <= month;
  drop smonth;
run;
UshaLatha
Obsidian | Level 7

Hi,

 

Since we are not using month in the by statement, few records are being dropped.

The counts are not matching with proc sql results.

Kurt_Bremser
Super User

Post your exact data in a data step (see @PeterClemmensen's example) and the sql you run against it. Use code posting windows, as described here, so that the code can easily be copy/pasted.

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1517 views
  • 1 like
  • 3 in conversation