Calcite | Level 5

## How to subract the last date from the first date for each row

• I have a dataset similar to the one shown below, where I have transposed the data in chrongical order based on date. My dates will be different every time though.. since I will be running a query everyday so the dates will update. For each ID, I need to subtract the latest date( amount)- First date (amount)/ First date to find the percentage of change over that time period. SO for ID1 ( 138-0)/0 but I will need to set this to 100% for ID2 (149-145)/145.

data test;

input ID sasdate date9. amount;

format sasdate date9.;

datalines;

1 01FEB2014 123

1 01MAR2014 138

2 01JAN2014 145

2 01FEB2014 155

2 01MAR2014 149

3 01JAN2014 133

3 01FEB2014 132

3 01MAR2014 129

;

run;

proc sort data=test;

by id sasdate;

run;

proc sql;

select distinct cats('_',put(sasdate,date9.))

into :alldates separated by ' '

from test

order by sasdate;

quit;

%put &alldates;

proc transpose data=test out=outx1 (drop=_name_);

by id;

var amount;

id sasdate;

idlabel sasdate;

run;

data outx1;

retain id &alldates;

set outx1;

run;

proc print data=outx1 label noobs ;

title 'Transposed dates in chronological order';

run;

1 ACCEPTED SOLUTION

Accepted Solutions
PROC Star

## Re: How to subract the last date from the first date for each row

@Melika2 wrote:
This works! Thank you so much... what if I want to group by two class variables?

Say you use CLASS A B: .  Then proc summary would generate 4 types of observations (indicated by the _TYPE_ variable):

1. Each non-empty A*B cell.   _TYPE_=3

2. Each Class A value (summed over all B values)  - think row marginals - there will be N(a) of these, with _TYPE=1 or 2, I forget which.

3. Each Class B value (summed over all A values)  - think column marginals - there will be N(b) of these with  _TYPE_=2 or 1

4. Global  (one cell)   _TYPE_=0

The specific PROC SUMMARY feature used for this problem by @PaigeMiller was the MINID option, as in

``minid(sasdate(amount))=min_date_amount ``

which records the amount corresponding to the minimum sasdate  (within each CLASSification).  I had been unaware of MINID (or maybe I forgot - I can't remember) .

So not only did proc summary provide the minimum date via

``min(sasdate)=min_date``

allowing you to trivially determine whether a given ID has a starting date equivalent to the overall starting date, but you have the corresponding amount available when needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
10 REPLIES 10
Diamond | Level 26

## Re: How to subract the last date from the first date for each row

... SO for ID1 ( 138-0)/0 ...

I don't understand this part. Shouldn't it be (138-123)/123 for ID1?

--
Paige Miller
Calcite | Level 5

## Re: How to subract the last date from the first date for each row

it should be the earliest date which is 01JAN2014 and since there was no amount for that.. it should be set to 0. The earliest date associated for the whole dataset not the particular group.

Diamond | Level 26

## Re: How to subract the last date from the first date for each row

``````proc summary data=test;
class id;
var sasdate amount;
output out=summary maxid(sasdate(amount))=max_date_amount minid(sasdate(amount))=min_date_amount
max(sasdate)=max_date min(sasdate)=min_date;
run;
data want;
if _n_=1 then set summary(where=(_type_=0) rename=(min_date=global_min_date) keep=min_date _type_);
set summary(where=(_type_=1));
if min_date=global_min_date then percent=(max_date_amount-min_date_amount)/min_date_amount;
else percent=1;
format percent percentn8.2;
drop _type_;
run;``````
--
Paige Miller
PROC Star

## Re: How to subract the last date from the first date for each row

Very nice. No pre-sorting required, and getting the global min date is integrated in use of the proc summary.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Diamond | Level 26

## Re: How to subract the last date from the first date for each row

Thanks!

More people should grasp the power of PROC SUMMARY. Hardly a day goes by when I don't use it (except weekends and holidays)

--
Paige Miller
Calcite | Level 5

## Re: How to subract the last date from the first date for each row

This works! Thank you so much... what if I want to group by two class variables?
PROC Star

## Re: How to subract the last date from the first date for each row

@Melika2 wrote:
This works! Thank you so much... what if I want to group by two class variables?

Say you use CLASS A B: .  Then proc summary would generate 4 types of observations (indicated by the _TYPE_ variable):

1. Each non-empty A*B cell.   _TYPE_=3

2. Each Class A value (summed over all B values)  - think row marginals - there will be N(a) of these, with _TYPE=1 or 2, I forget which.

3. Each Class B value (summed over all A values)  - think column marginals - there will be N(b) of these with  _TYPE_=2 or 1

4. Global  (one cell)   _TYPE_=0

The specific PROC SUMMARY feature used for this problem by @PaigeMiller was the MINID option, as in

``minid(sasdate(amount))=min_date_amount ``

which records the amount corresponding to the minimum sasdate  (within each CLASSification).  I had been unaware of MINID (or maybe I forgot - I can't remember) .

So not only did proc summary provide the minimum date via

``min(sasdate)=min_date``

allowing you to trivially determine whether a given ID has a starting date equivalent to the overall starting date, but you have the corresponding amount available when needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Diamond | Level 26

## Re: How to subract the last date from the first date for each row

To do this for 2 ID variables, I think only two changes are needed.

``````proc summary data=test;
class id1 id2;
var sasdate amount;
output out=summary maxid(sasdate(amount))=max_date_amount minid(sasdate(amount))=min_date_amount
max(sasdate)=max_date min(sasdate)=min_date;
run;
data want;
if _n_=1 then set summary(where=(_type_=0) rename=(min_date=global_min_date) keep=min_date _type_);
set summary(where=(_type_=3));
if min_date=global_min_date then percent=(max_date_amount-min_date_amount)/min_date_amount;
else percent=1;
format percent percentn8.2;
drop _type_;
run;``````

I think if you have a large data set, you can make PROC SUMMARY more efficient by adding in the WAYS statement.

``ways 0 2;``

--
Paige Miller
Community Manager

## Re: How to subract the last date from the first date for each row

I think that @PaigeMiller and @mkeintz have forgotten more about PROC SUMMARY than I ever knew (or probably ever will know).  Talk about achieving the goal with the fewest PROCs / passes through the data!

SAS Hackathon registration is open! Build your skills. Make connections. Enjoy creative freedom. Maybe change the world.
PROC Star

## Re: How to subract the last date from the first date for each row

This is an example of why I try to convince programming colleagues to think of SAS as a tool first, a programming language second (and really it's the DATA step that I find is most like a language).

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Discussion stats
• 10 replies
• 902 views
• 3 likes
• 4 in conversation