BookmarkSubscribeRSS Feed
elsfy
Quartz | Level 8

I have this code : 

data want ; update have result ; /* updates table have with table result values */ 
by var ; 
run ; 


i would like to know the equivalent way to do this with proc SQL in order to avoid the proc sort step before the data step. 

2 REPLIES 2
PaigeMiller
Diamond | Level 26

Surely you realize that SQL still has to do a sort in order to do the update. You just don't have to program it in SAS.

--
Paige Miller
mkeintz
PROC Star

I agree with @PaigeMiller that SQL needs to do a sort under the hood.  It may be a bit more efficient than PROC SORT because the sorting only generates utility files (or maybe even just memory objects) rather than making a complete SAS dataset file.

 

But consider the far more wordy code needed to replicate the UPDATE statement, which only replaces a value from HAVE with a value from RESULT when the new value is not missing.  As far as I know, this means coding explicit COALESCE (or COALESCEC) functions for each variable that is present in both datasets.  For example, when you have a one-to-one join, you would need to do something like this:

 

data HAVE;
input id  a b c;
datalines;
1  11 111 1111
2  .  222 2222
3  33  .  3333
run;
data RESULT;
input id a b c ;
datalines;
1 -11 -111 -1111 
2 -22 -222 -2222
3  .  -333 -3333
run;

proc sql;
  create table ljoin as
  select coalesce(h.a,r.a) as a, coalesce(h.b,r.b) as b, coalesce(h.c,r.c) as c
  from HAVE as h left join RESULT as r on h.id=r.id;
quit;

 

And the above wouldn't replicate UPDATE for a one-to-many join (UPDATE would generate only 1 record per by-group).  I leave it to those more expert in SQL to show how that would be done.

 

The point is there is far more coding required to avoid a preliminary PROC SORT when replicating UPDATE.  I'm not even sure it can be done, because the UPDATE statement depends on the within-bygroup order of incoming RESULT records.  SQL makes no commitment to honor such record order.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 328 views
  • 2 likes
  • 3 in conversation