BookmarkSubscribeRSS Feed
elsfy
Quartz | Level 8

Hi,

 

I did this with proc sql and it works fine.

 


proc sql ;
update TABLE as a 

set VAR = case when a.VAR ^= . then a.VAR else (select VAR from TABLE2 as b where a.year = b.year and a.id = b.id) end
;
quit

how can i do it in a data step ? Especially specifying condition with VAR from TABLE,..

proc sql is very slow

2 REPLIES 2
HB
Barite | Level 11 HB
Barite | Level 11

Without more information:


data a;
  input year id myvar $1.;
  datalines;
1999 1 a
2000 1 b
1962 2 c
2001 2 d
2002 3 .
2002 3 f
2003 4 .
2003 4 g
2204 5 h
2205 6 i
;
run;

data b;
  input year id myvar $1.;
  datalines;
2002 3 x
2002 3 f
2003 4 y
2003 4 g
2206 8 j
2206 8 k
;
run;

proc sort data=a;
	by year id;
run;

proc sort data=b;
	by year id;
run;
 
data combined;
	merge a (in=z) b;
	by year id;
	if z;
run;

Yields

 

year id myvar
1962 2 c
1999 1 a
2000 1 b
2001 2 d
2002 3 x
2002 3 f
2003 4 y
2003 4 g
2204 5 h
2205 6 i

 

But you may really want coalesce or something else.

 

 

ballardw
Super User

In a data step if want related records, such as a matching ID variable, to be used then you would MERGE by the matching variable(s). Which will typically involve sorting both data sets by those variables first.

 

The behavior of Merge for same named variable that are not the by variables is that one value will overwrite others. Which exactly depends on order sets on the MERGE statement. Generally the rightmost ( or last) data set mentioned on the merge statement will contribute the value. Merge however does not work well if the matching variables have multiples in both data sets.

If need both of the same named variable values for other than replacement then you RENAME the variable in a data set option for one (or more) of the data sets. Then use appropriate named variable for your purpose.

 

If the records in the 'base' data set, the one you are updating, are uniquely identified (meaning no duplicates) of the key variables you can use the UPDATE instead of Merge which has some options that behave differently than merge would. Such as all of the values of 'transaction' set, the one with the new values, all update the same record. Merge with multiple records for the By variable would have multiple output records.

 

But example of what you actually are working with and desired output may be needed as the behavior of subqueries is not directly available with merge and may require Hash.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 278 views
  • 0 likes
  • 3 in conversation