Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Remove records

Reply
New Contributor
Posts: 2

Remove records

Hello SAS Masters,

I have a question how to remove records from sas dataset like this:

ID DATE REASON AMOUNT 

A123 01302012 REJECT 100000

A123 01302012 REJECT 100000

A123 01302012 APPROVE 100000

A123 01312012 APPROVE 100000

B345 01312012 REJECT 400000

B345 01312012 APPROVE 400000

B345 01312012 APPROVE 300000

C567 01302012 REJECT 300000

C567 01302012 REJECT 500000

C567 01312012 REJECT 300000

D667 01302012 REJECT 300000

D667 01302012 REJECT 500000

D667 01302012 REJECT 300000

I need the output :

ID DATE REASON AMOUNT

A123 01302012 APPROVE 100000

A123 01312012 APPROVE 100000

B345 01312012 APPROVE 400000

B345 01312012 APPROVE 300000

C567 01302012 REJECT 500000

C567 01312012 REJECT 300000

D667 01302012 REJECT 300000

how to do this ?

Thanks in advance for your help.

RHS

Super User
Posts: 19,772

Re: Remove records

Posted in reply to rhsantoso

What's the logic?

New Contributor
Posts: 3

Re: Remove records

Posted in reply to rhsantoso

Since you do not list the selection criter, I have to guess from your output dataset. 

I got close to your output dataset. The following logic will have the one more record for D667 of 500000 amount.   Hope this help.

data appr reject;

   set file1;  /*** your inpurt data set ***/

   if reason='APPROVE'  then output appr;

   else output reject;

run;

proc sort data=reject out=reject2x(drop=date) nodupkey;

   by id amount;

run;

proc sort data=appr;

   by id amount;

run;

data reject2x(keep=id amount);

   merge reject2x(in=a) appr(in=b);

   by id amount;

   if a and b then delete;

run;
proc sort data=reject;

   by id amount;

run;

data reject;

    merge reject(in=a) reject2x(in=b);

   by id amount;

   if  a and b;

   if last.amount;

run;

data all;

       set appr reject;

run;

Frequent Contributor
Posts: 95

Re: Remove records

Posted in reply to rhsantoso

The way I understand the logic is as follows:

For a given ID and Date:

If there is no APPROVE record then output the last REJECT record.

Else, if there is one or more APPROVE records only output APPROVE records.

Zafer

Ex:

data x;

    input ID $ DATE anydtdte. REASON $ AMOUNT ;

    format Date Date9.;

datalines;

A123 01302012 REJECT 100000

A123 01302012 REJECT 100000

A123 01302012 APPROVE 100000

A123 01312012 APPROVE 100000

B345 01312012 REJECT 400000

B345 01312012 APPROVE 400000

B345 01312012 APPROVE 300000

C567 01302012 REJECT 300000

C567 01302012 REJECT 500000

C567 01312012 REJECT 300000

D667 01302012 REJECT 300000

D667 01302012 REJECT 500000

D667 01302012 REJECT 300000

;

run;

data want;

    Approved = 0;

    do until(last.Date);

        set x end=last;

        by ID Date;

        Approved + (REASON = 'APPROVE');

        if last.Date and NOT(Approved) then output;

        else if REASON = 'APPROVE' then Output;

    end;

    drop Approved;

run;

proc print; run;

New Contributor
Posts: 2

Re: Remove records

Hi All,

Thanks for the response and help.

The logic :

if the same id and date have more than 1 reject than the output only 1 reject,

if the same id and date have more than 1 approve than the output all approve,

if the same id and date have more than 1 reject and more than 1 approve then the output all approve.

RHS

Respected Advisor
Posts: 4,173

Re: Remove records

Posted in reply to rhsantoso

See below:

data have;
  input ID $ DATE anydtdte. REASON $ AMOUNT ;
  format Date Date9.;
datalines;
A123 01302012 REJECT 100000
A123 01302012 REJECT 100000
A123 01302012 APPROVE 100000
A123 01312012 APPROVE 100000
B345 01312012 REJECT 400000
B345 01312012 APPROVE 400000
B345 01312012 APPROVE 300000
C567 01302012 REJECT 300000
C567 01302012 REJECT 500000
C567 01312012 REJECT 300000
D667 01302012 REJECT 300000
D667 01302012 REJECT 500000
D667 01302012 REJECT 300000
;
run;

proc sort data=have;
  by ID Date Reason;
run;

data want;
  set have;
  by ID Date;
  if Reason='APPROVE' then output;
  else if first.date and Reason='REJECT' then output;
run;

Respected Advisor
Posts: 3,156

Re: Remove records

Posted in reply to rhsantoso

Just from input and output data sets, the implied rules can be many. It would help if  OP can spell out the rules explicitly, as ultimately it is OP's call.

Haikuo

Ask a Question
Discussion stats
  • 6 replies
  • 476 views
  • 0 likes
  • 6 in conversation