BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Melika2
Calcite | Level 5
  • I have a dataset similar to the one shown below, where I have transposed the data in chrongical order based on date. My dates will be different every time though.. since I will be running a query everyday so the dates will update. For each ID, I need to subtract the latest date( amount)- First date (amount)/ First date to find the percentage of change over that time period. SO for ID1 ( 138-0)/0 but I will need to set this to 100% for ID2 (149-145)/145.

 

 

data test;

input ID sasdate date9. amount;

format sasdate date9.;

datalines;

 

1 01FEB2014 123

1 01MAR2014 138

2 01JAN2014 145

2 01FEB2014 155

2 01MAR2014 149

3 01JAN2014 133

3 01FEB2014 132

3 01MAR2014 129

;

run;

 

 

proc sort data=test;

by id sasdate;

run;

 

 

proc sql;

select distinct cats('_',put(sasdate,date9.))

into :alldates separated by ' '

from test

order by sasdate;

quit;

 

 

%put &alldates;

 

 

proc transpose data=test out=outx1 (drop=_name_);

by id;

var amount;

id sasdate;

idlabel sasdate;

run;

 

 

data outx1;

retain id &alldates;

set outx1;

run;

 

 

proc print data=outx1 label noobs ;

title 'Transposed dates in chronological order';

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

@Melika2 wrote:
This works! Thank you so much... what if I want to group by two class variables?

Say you use CLASS A B: .  Then proc summary would generate 4 types of observations (indicated by the _TYPE_ variable):

  1. Each non-empty A*B cell.   _TYPE_=3

  2. Each Class A value (summed over all B values)  - think row marginals - there will be N(a) of these, with _TYPE=1 or 2, I forget which.

  3. Each Class B value (summed over all A values)  - think column marginals - there will be N(b) of these with  _TYPE_=2 or 1

  4. Global  (one cell)   _TYPE_=0

 

The specific PROC SUMMARY feature used for this problem by @PaigeMiller was the MINID option, as in

minid(sasdate(amount))=min_date_amount 

which records the amount corresponding to the minimum sasdate  (within each CLASSification).  I had been unaware of MINID (or maybe I forgot - I can't remember) .

 

So not only did proc summary provide the minimum date via

min(sasdate)=min_date

allowing you to trivially determine whether a given ID has a starting date equivalent to the overall starting date, but you have the corresponding amount available when needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

... SO for ID1 ( 138-0)/0 ...

 

I don't understand this part. Shouldn't it be (138-123)/123 for ID1?

--
Paige Miller
Melika2
Calcite | Level 5

it should be the earliest date which is 01JAN2014 and since there was no amount for that.. it should be set to 0. The earliest date associated for the whole dataset not the particular group.

PaigeMiller
Diamond | Level 26
proc summary data=test;
    class id;
    var sasdate amount;
    output out=summary maxid(sasdate(amount))=max_date_amount minid(sasdate(amount))=min_date_amount 
        max(sasdate)=max_date min(sasdate)=min_date;
run;
data want;
    if _n_=1 then set summary(where=(_type_=0) rename=(min_date=global_min_date) keep=min_date _type_);
    set summary(where=(_type_=1));
    if min_date=global_min_date then percent=(max_date_amount-min_date_amount)/min_date_amount;
    else percent=1;
    format percent percentn8.2;
    drop _type_;
run;
--
Paige Miller
mkeintz
PROC Star
Very nice. No pre-sorting required, and getting the global min date is integrated in use of the proc summary.
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

Thanks!

 

More people should grasp the power of PROC SUMMARY. Hardly a day goes by when I don't use it (except weekends and holidays)

--
Paige Miller
Melika2
Calcite | Level 5
This works! Thank you so much... what if I want to group by two class variables?
mkeintz
PROC Star

@Melika2 wrote:
This works! Thank you so much... what if I want to group by two class variables?

Say you use CLASS A B: .  Then proc summary would generate 4 types of observations (indicated by the _TYPE_ variable):

  1. Each non-empty A*B cell.   _TYPE_=3

  2. Each Class A value (summed over all B values)  - think row marginals - there will be N(a) of these, with _TYPE=1 or 2, I forget which.

  3. Each Class B value (summed over all A values)  - think column marginals - there will be N(b) of these with  _TYPE_=2 or 1

  4. Global  (one cell)   _TYPE_=0

 

The specific PROC SUMMARY feature used for this problem by @PaigeMiller was the MINID option, as in

minid(sasdate(amount))=min_date_amount 

which records the amount corresponding to the minimum sasdate  (within each CLASSification).  I had been unaware of MINID (or maybe I forgot - I can't remember) .

 

So not only did proc summary provide the minimum date via

min(sasdate)=min_date

allowing you to trivially determine whether a given ID has a starting date equivalent to the overall starting date, but you have the corresponding amount available when needed.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
PaigeMiller
Diamond | Level 26

To do this for 2 ID variables, I think only two changes are needed.

 

proc summary data=test;
    class id1 id2;
    var sasdate amount;
    output out=summary maxid(sasdate(amount))=max_date_amount minid(sasdate(amount))=min_date_amount 
        max(sasdate)=max_date min(sasdate)=min_date;
run;
data want;
    if _n_=1 then set summary(where=(_type_=0) rename=(min_date=global_min_date) keep=min_date _type_);
    set summary(where=(_type_=3));
    if min_date=global_min_date then percent=(max_date_amount-min_date_amount)/min_date_amount;
    else percent=1;
    format percent percentn8.2;
    drop _type_;
run;

I think if you have a large data set, you can make PROC SUMMARY more efficient by adding in the WAYS statement.

ways 0 2;

 

--
Paige Miller
ChrisHemedinger
Community Manager

I think that @PaigeMiller and @mkeintz have forgotten more about PROC SUMMARY than I ever knew (or probably ever will know).  Talk about achieving the goal with the fewest PROCs / passes through the data!

Learn from the Experts! Check out the huge catalog of free sessions in the Ask the Expert webinar series.
mkeintz
PROC Star

This is an example of why I try to convince programming colleagues to think of SAS as a tool first, a programming language second (and really it's the DATA step that I find is most like a language).   

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1109 views
  • 3 likes
  • 4 in conversation