DATA Step, Macro, Functions and more

Extract all Observations after Latest Amount Spent=X (inclusive)

Reply
Frequent Contributor
Posts: 75

Extract all Observations after Latest Amount Spent=X (inclusive)

Using SAS 7.

 

I am quite new to SAS.

 

Please see attached file. I have a similar dateset.

 

For each unique name I want to extract all the records from the latest Amount Spent=X onwards, inclusive of the observation with the latest Amount Spent=X

Eg

John 20160521 6
John 20160101 50
John 20150526 X

 

How do I do so?

Thank you

 

 

Super User
Super User
Posts: 7,955

Re: Extract all Observations after Latest Amount Spent=X (inclusive)

Hi,

 

Its a good idea to put test data as a datastep rather than attaching files.  So this is just off the top of my head:

data want;

  set have;

  retain o;

  by name;

  if first.name then o=0;  /* o is a flag 0=not output, 1 to output */

  if value="X" then o=1;

  if o then output;

run;

Frequent Contributor
Posts: 75

Re: Extract all Observations after Latest Amount Spent=X (inclusive)

Hi

 

Thank you for the reply.

 

But I don't think you looked at the file.

 

Example Dataset

Name Date Amount Spent
John 20160521 6
John 20160101 50
John 20150526 X
John 20150329 23
John 20150215 X
John 20141212 10
Peter 20151203 150
Peter 20150321 X
Peter 20141123 20
Albert 20150408 X

 

The output that I want

Name Date Amount Spent
John 20160521 6
John 20160101 50
John 20150526 X
Peter 20151203 150
Peter 20150321 X
Albert 20150408 X

 

The code you provided  will only extract the observations with X, and after X.

 

Thank you

Super User
Super User
Posts: 7,955

Re: Extract all Observations after Latest Amount Spent=X (inclusive)

Well, minor change to the code then (and no, I dont look at attachments - post test data in the form of a datastep):

data want;
  set have;
  retain o;
  by name;
  if first.name then o=1; 
  if value="X" then o=0;  /* If X is encountered, stop outputting */
  if o then output;
run;
Trusted Advisor
Posts: 1,117

Re: Extract all Observations after Latest Amount Spent=X (inclusive)

[ Edited ]

Hi @apple,

 

Here is another approach:

/* Create test data, incl. a name without Amount_Spent=.X */

data have;
missing x;
input Name $ Date :yymmdd. Amount_Spent;
format date yymmddn8.;
cards;
John   20160521   6
John   20160101  50
John   20150526   X
John   20150329  23
John   20150215   X
John   20141212  10
Peter  20151203 150
Peter  20150321   X
Peter  20141123  20
Albert 20150408   X
Jack   20141024  50
;

/* Add sequential number in case that sort order is to be preserved */

data temp;
set have;
seqno=_n_;
run;

proc sort data=temp;
by name descending date;
run;

/* Create target dataset up to sort order */

data want0;
do until(last.name);
  set temp;
  by name;
  if amount_spent=.X then flagX=1;
end;
do until(last.name);
  set temp;
  by name descending date;
  if first.name then flag=1;
  if flagX & flag then output;
  if amount_spent=.X then flag=0;
end;
drop flag:;
run;

proc sort data=want0 out=want(drop=seqno);
by seqno;
run;

proc print data=want;
run;

Edit: Please note that I interpreted your "X" values as special missing values. If I were you, I would strongly prefer to define Amount_Spent as a numeric variable, because it contains numbers, not "codes" consisting of digits. Character variables are not suitable for doing calculations.

 

Is there a possibility that one person has spent two amounts on the same date, maybe one .X and one non-missing? What would be the selection criterion in this case?

 

Edit 2: Simplified definition of variable FLAG (if first.name ...).

Valued Guide
Posts: 860

Re: Extract all Observations after Latest Amount Spent=X (inclusive)

[ Edited ]

data have;
infile cards dsd;
informat name $6. date yymmdd8. amount_spent 8.;
format name $6. date yymmddn8. amount_spent 8.;
input Name Date Amount_Spent;
cards;
John,20160521,6
John,20160101,50
John,20150526,.
John,20150329,23
John,20150215,.
John,20141212,10
Peter,20151203,150
Peter,20150321,.
Peter,20141123,20
Albert,20150408,.
;run;

proc sql;
create table want as
select *
from have where date >= (select max(date)
from have
where missing(amount_spent));

Ask a Question
Discussion stats
  • 5 replies
  • 279 views
  • 1 like
  • 4 in conversation