DATA Step, Macro, Functions and more

proc sql update

Reply
Contributor
Posts: 37

proc sql update

I can not understand why the next proc sql update values missing "var1" at c1 = 2 and c2 = 2 (update key). The resulting value is missing. I expected it to remain unchanged at 20

 

data A;
input c1 c2 var1 var2;
datalines;
1 1 12 32
1 2 32 21
2 1 12 98
2 2 20 30
2 3 11 11
3 1 21 22
4 1 31 32
2 4 22 22
6 1 77 77

;

data tmp;
input c1 c2 var1 var2;
datalines;
1 1 999 999
1 2 888 999
2 1 999 999
;
 
proc sql;
update  a 
   set var1=(select var1 from tmp as b
		         	where a.c1=b.c1 and a.c2=b.c2)

	where a.c1 in (select c1 from  tmp) and
          a.c2 in (select c2 from  tmp) 
		  ;

quit;
Super User
Posts: 7,868

Re: proc sql update

Posted in reply to mario_pellegrini

update in SQL does not behave like update in a data step. Since your subquery produces a missing value, that missing value is set. Use the coalesce() function to keep the original value if the subquery yields no result.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 37

Re: proc sql update

Posted in reply to mario_pellegrini

I do not understand, which missing value produces the subquery? at c1 = 1, c2 = 1 --- c1 = 1, c2 = 2 --- c1 = 2, c2 = 1 there are 3 lines corresponding to the values 12,32 - 32,21 - 12 98

Super User
Posts: 7,868

Re: proc sql update

Posted in reply to mario_pellegrini

There is no observation in dataset tmp with c1=2 AND c2=2, so the subquery can't find a value for that combination, which is present in observation 4 of dataset A, where the value 20 is overwritten with the missing value from the subquery.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 37

Re: proc sql update

Posted in reply to KurtBremser

ok, but all other combinations do not have values in the sub query (example for c1 = 3, c2 = 1). In these cases, however, the values of dataset A remain unchanged. Only for c1 = 2, c2 = 2 is updated to missing the value of var1

Super User
Posts: 7,868

Re: proc sql update

Posted in reply to mario_pellegrini

Because your where condition at the end of the query:

where a.c1 in (select c1 from  tmp) and
          a.c2 in (select c2 from  tmp) 

prevents the update from even touching those. But you do have at least one c1=2 and one c2=2 in tmp, so the combination of c1=2 and c2=2 is updated.

If you want to prevent this from happening, you best create a compound key (ie c3 = c1*100 + c2) in both datasets and join on that.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Contributor
Posts: 37

Re: proc sql update

Posted in reply to KurtBremser

ok I understand, thank you. but this would slow down the processing if it were large databases. Is not there a different method?

Super User
Posts: 7,868

Re: proc sql update

Posted in reply to mario_pellegrini

When I know that I will need a specific derived value later, I set it when such a dataset is created.

 

Intelligent data makes for intelligent programs.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 7 replies
  • 106 views
  • 0 likes
  • 2 in conversation