Hello SAS Masters,
I have a question how to remove records from sas dataset like this:
ID DATE REASON AMOUNT
A123 01302012 REJECT 100000
A123 01302012 REJECT 100000
A123 01302012 APPROVE 100000
A123 01312012 APPROVE 100000
B345 01312012 REJECT 400000
B345 01312012 APPROVE 400000
B345 01312012 APPROVE 300000
C567 01302012 REJECT 300000
C567 01302012 REJECT 500000
C567 01312012 REJECT 300000
D667 01302012 REJECT 300000
D667 01302012 REJECT 500000
D667 01302012 REJECT 300000
I need the output :
ID DATE REASON AMOUNT
A123 01302012 APPROVE 100000
A123 01312012 APPROVE 100000
B345 01312012 APPROVE 400000
B345 01312012 APPROVE 300000
C567 01302012 REJECT 500000
C567 01312012 REJECT 300000
D667 01302012 REJECT 300000
how to do this ?
Thanks in advance for your help.
RHS
What's the logic?
Since you do not list the selection criter, I have to guess from your output dataset.
I got close to your output dataset. The following logic will have the one more record for D667 of 500000 amount. Hope this help.
data appr reject;
set file1; /*** your inpurt data set ***/
if reason='APPROVE' then output appr;
else output reject;
run;
proc sort data=reject out=reject2x(drop=date) nodupkey;
by id amount;
run;
proc sort data=appr;
by id amount;
run;
data reject2x(keep=id amount);
merge reject2x(in=a) appr(in=b);
by id amount;
if a and b then delete;
run;
proc sort data=reject;
by id amount;
run;
data reject;
merge reject(in=a) reject2x(in=b);
by id amount;
if a and b;
if last.amount;
run;
data all;
set appr reject;
run;
The way I understand the logic is as follows:
For a given ID and Date:
If there is no APPROVE record then output the last REJECT record.
Else, if there is one or more APPROVE records only output APPROVE records.
Zafer
Ex:
data x;
input ID $ DATE anydtdte. REASON $ AMOUNT ;
format Date Date9.;
datalines;
A123 01302012 REJECT 100000
A123 01302012 REJECT 100000
A123 01302012 APPROVE 100000
A123 01312012 APPROVE 100000
B345 01312012 REJECT 400000
B345 01312012 APPROVE 400000
B345 01312012 APPROVE 300000
C567 01302012 REJECT 300000
C567 01302012 REJECT 500000
C567 01312012 REJECT 300000
D667 01302012 REJECT 300000
D667 01302012 REJECT 500000
D667 01302012 REJECT 300000
;
run;
data want;
Approved = 0;
do until(last.Date);
set x end=last;
by ID Date;
Approved + (REASON = 'APPROVE');
if last.Date and NOT(Approved) then output;
else if REASON = 'APPROVE' then Output;
end;
drop Approved;
run;
proc print; run;
Hi All,
Thanks for the response and help.
The logic :
if the same id and date have more than 1 reject than the output only 1 reject,
if the same id and date have more than 1 approve than the output all approve,
if the same id and date have more than 1 reject and more than 1 approve then the output all approve.
RHS
See below:
data have;
input ID $ DATE anydtdte. REASON $ AMOUNT ;
format Date Date9.;
datalines;
A123 01302012 REJECT 100000
A123 01302012 REJECT 100000
A123 01302012 APPROVE 100000
A123 01312012 APPROVE 100000
B345 01312012 REJECT 400000
B345 01312012 APPROVE 400000
B345 01312012 APPROVE 300000
C567 01302012 REJECT 300000
C567 01302012 REJECT 500000
C567 01312012 REJECT 300000
D667 01302012 REJECT 300000
D667 01302012 REJECT 500000
D667 01302012 REJECT 300000
;
run;
proc sort data=have;
by ID Date Reason;
run;
data want;
set have;
by ID Date;
if Reason='APPROVE' then output;
else if first.date and Reason='REJECT' then output;
run;
Just from input and output data sets, the implied rules can be many. It would help if OP can spell out the rules explicitly, as ultimately it is OP's call.
Haikuo
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Use this tutorial as a handy guide to weigh the pros and cons of these commonly used machine learning algorithms.
Find more tutorials on the SAS Users YouTube channel.