BookmarkSubscribeRSS Feed
Pandu2
Obsidian | Level 7
Thankyou it worked out really well.
Tom
Super User Tom
Super User

You will not get missing AMOUNT if you use either SUM(AMOUNT,0) or COALESE(AMOUNT,0).  Cannot happen.

If you used select * in PROC SQL then the first variable named AMOUNT that it sees will be the one saved into the dataset.  So if you used

select x.*,coalesce(y.amount,0) as amount

And there was already a variable named AMOUNT in the X dataset then that is the one you see and the result of the COALESCE() function call will not make it into the dataset.

 

Show the code you used.

Make a simplified example that demonstrates the issue.  Share it with us using simple data steps to create the two datasets.

 

Pandu2
Obsidian | Level 7
Infact the amount column should have 0 value only if the values of x table doesn't exist in y table. Otherwise the amount column values shouldn't be changed. Thanks.
Tom
Super User Tom
Super User

@Pandu2 wrote:
Infact the amount column should have 0 value only if the values of x table doesn't exist in y table. Otherwise the amount column values shouldn't be changed. Thanks.

So another reason to not wasting your time trying to shoehorn this into SQL code.

data want;
  merge x y(in=iny);
  by a b c ;
  if not iny then amount=0;
run;
Sajid01
Meteorite | Level 14

Hello @Pandu2 

You have the two following requirements.

These two tables should be joined based on those 3 fields named a,b,c and it should have a condition like
If the values of the 3 fields in x doesn't exist in the columns of y table then for those values the amount should be equal to 0 instead of excluding them those
who doesn't exist.

While there could be many approaches to solve this problem,
one solution for this is to use a full outer join.
Proc SQL or data step merge cam be used.
As you are using three columns to base your join upon,
one approach is to create a composite key. I have used a simple approach of concatenating the three column values.
More sophisticated approaches can be used.
Your second table i.e., y has missing values for amount, it is better to have them as zero.
(you may need them in arithmetic operations. Using missing values in an expression generates missing values. This can be avoided by having them as zeros rather than keep them missing).
I suggest the following code. Modify to suit your needs.

data x;
input (a b c) (:$4.);
datalines;
1 5 15
2 6 26
3 7 37
4 8 48
;

data y;
infile datalines truncover;
input (a b c) (:$4.) amount :10.;
datalines;
1 5 15 4
2 6 26 5
3 7 36 9
4 8 48
;
data left;
retain key a b c;
set x;
key=catx('-',a,b,c);
run;

data right;
retain key a b c;
set y;
key=catx('-',a,b,c);
run;
proc sort data=left;
by key;
run;
proc sort data=right;
by key;
run;
data want (drop=key);
merge left right;
by key;
amount=coalesce(amount,0);
run;

The output will be like this

Sajid01_0-1650898350787.png

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 19 replies
  • 1146 views
  • 0 likes
  • 6 in conversation