BookmarkSubscribeRSS Feed
deleted_user
Not applicable
Hi All,

I have a bit of a difficult problem that I thought someone else might be able to suggest some possible alternatives and post some code that might work for me.

Some background, I have 2 datasets. "A" is a list of purchases, "B" is a list of credits for purchases (ie product returns).

What I would like to be able to do is remove items from dataset "A" where it has a matching record in dataset "B".

Dataset A
Cust Prod Cost
ABC1 123X $57
ABC1 998C $107
ABC1 998C $107
ABC1 567D $500

Dataset B
Cust Prod Cost
ABC1 998C $107

Resulting Dataset
Cust Prod Cost
ABC1 123X $57
ABC1 998C $107
ABC1 567D $500

As you can see there are 2 records that could have been matched between "A" and "B", however only one was removed because there was only one record in "B".

If anyone has some thoughts how this can be approached in SAS 9.1 that would be greatly appreciated.

Thanks in advance
DW Changed the subject to indicate that this had been solved.


Message was edited by: DWW
5 REPLIES 5
andreas_lds
Jade | Level 19
Hi DWW,

try using the IN-option when merging the dataset. Both datasets have to be sorted by Cust, Prod and Cost.

Example
[pre]
data work.c;
merge work.a
work.b(in= inB)
;
by Cust Prod Cost;

if inB then delete;
run;
[/pre]
ieva
Pyrite | Level 9
Hi,

This code could help, but only if in the dataset B there are no repeated observations.


proc sort data=a ;
by cust prod cost;
run;

proc sort data=b;
by cust prod cost;
run;

data c;
merge a b (in=b);
by cust prod cost;
if first.prod and first.cost and b then delete;
run;
ieva
Pyrite | Level 9
Changed the code a bit: you can add one extra variable that counts how many times the same situation is repeated. So even if it is repeated also in b, this should delete only those observations that match and not all of them:

proc sort data=a ;
by cust prod cost;
run;

proc sort data=b;
by cust prod cost;
run;

data a1;
set a;
by cust prod cost;
if first.cost then count=0;
count+1;
run;

data b1;
set b;
by cust prod cost;
if first.cost then count=0;
count+1;
run;


data c (drop = count);
merge a1 b1 (in=b);
by cust prod cost count;
if b then delete;
run;
deleted_user
Not applicable
The SQL set operator EXCEPT comes to mind.
However, a recommendation would be based on more information like: SAS platform, data volumes and frequency of update, the use/re-use of the information,
HTH
PeterC
deleted_user
Not applicable
Hi all, thanks for the great and fast responses. Ieva your second solution with the counts, worked an absolute treat. Thanks so much.

Regards
DW

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 617 views
  • 0 likes
  • 3 in conversation