Programming the statistical procedures from SAS

Collapsing data using (arrays)

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 117
Accepted Solution

Collapsing data using (arrays)

Hi all

Could anyone please help me with this:

I have a data set that looks like this:

                 

Person EarningDate
1122001-01-23
1342001-02-03
1232001-02-25
2452001-03-04
2322001-03-27
372001-04-04
3342001-05-05
3652001-05-26
3332001-06-06
3232001-07-02
4332001-02-03
4122001-03-05

What I am trying to do is the following

PersonSum EarningEarning Period
1total sum for person 1Last earning date - First earning day person 1
2total sum for person 2Last earning date - First earning day person 2
3total sum for person 3Last earning date - First earning day person 3
4total sum for person 4Last earning date - First earning day person 4

I a guess this would be done through arrays but I am really clueless....

Could anyone please help me with the code??

Kind regards

ammarhm


Accepted Solutions
Solution
‎09-08-2014 04:52 PM
Regular Contributor
Posts: 233

Re: Collapsing data using (arrays)

I hope this is what you need as the output.

Capture.JPG

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want1 as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

proc sql;
create table want2 as
select distinct person, earning as last_earning
from have
group by person
having date=max(date);
quit;

data want;
merge want1(in=a) want2(in=b);
by person;
if a=b;
run;

View solution in original post


All Replies
Respected Advisor
Posts: 3,773

Re: Collapsing data using (arrays)

You don't need arrays.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary sum(earning)=earning min(date)=min max(date)=max;
  
run;
data summary;
   set summary;
   period = max-min;
  
run;
proc print;
  
run;
Frequent Contributor
Posts: 117

Re: Collapsing data using (arrays)

Great one data_null

Much appreciated

Frequent Contributor
Posts: 117

Re: Collapsing data using (arrays)

Just another question

Would you be so kind and tell me how to add another column to the output dataset showing the last earning per person?

Kind regards

Regular Contributor
Posts: 233

Re: Collapsing data using (arrays)

To add the last earning date column


data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

Edited:

proc sql;

create table want1 as

select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period,

max(date) as max_date format yymmdd10.

from have

group by person;

quit;

Attachment
Frequent Contributor
Posts: 117

Re: Collapsing data using (arrays)

Thanks Hima

I was looking into having the last earning (not the last earning date)

Any suggestions? The proc sql is a great option

Kind regards

Regular Contributor
Posts: 233

Re: Collapsing data using (arrays)

You mean the last earning for person 1 is  23 person 2 is 32 etc...

Frequent Contributor
Posts: 117

Re: Collapsing data using (arrays)

Yes exactly, thanks in advance

Frequent Contributor
Posts: 117

Re: Collapsing data using (arrays)

The real reason is that I want to modify the table so I would get the sum of all earnings per person excluding the last one, so I thought I might do a sum(earning)-last earning

you get the point, i am sure

Solution
‎09-08-2014 04:52 PM
Regular Contributor
Posts: 233

Re: Collapsing data using (arrays)

I hope this is what you need as the output.

Capture.JPG

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want1 as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

proc sql;
create table want2 as
select distinct person, earning as last_earning
from have
group by person
having date=max(date);
quit;

data want;
merge want1(in=a) want2(in=b);
by person;
if a=b;
run;

Respected Advisor
Posts: 3,773

Re: Collapsing data using (arrays)

Yes that can be done with PROC SUMMARY IDGROUP.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary(drop=_Smiley Happy sum(earning)=earning min(date)=min
     
idgroup(max(date) out(date earning)=max lastearning);
   run;
data summary;
   set summary;
   period = max-min;
  
run;
proc print;
  
run;

9-8-2014 9-00-58 PM.png
Grand Advisor
Posts: 9,463

Re: Collapsing data using (arrays)

NULL ,

You can use RANGE as well.No need to overwrite the table .

  proc summary data=person nway; 
   class person;
   output out=summary sum(earning)= range(date)= /autoname; 
   run; 

Xia Keshan

Respected Advisor
Posts: 3,773

Re: Collapsing data using (arrays)

Good point.  I have modified.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary(drop=_Smiley Happy sum(earning)=earning min(date)=min range(date)=period
     
idgroup(max(date) out(date earning)=max lastearning);
   format period: ;
   run;
proc print;
  
run;
9-9-2014 8-21-50 AM.png


Trusted Advisor
Posts: 1,195

Re: Collapsing data using (arrays)

proc sort data=have;
by person date;
run;

data want(drop=date earning);
set have;
by person date;
retain first_earning_date;
format first_earning_date last_earning_date ddmmyy10.;
if first.person then do;
sum_earning=0;
first_earning_date=date;
end;
sum_earning+earning;
if last.person then do;
last_earning_date=date;
output;
end;
run;

proc print data=want;
run;

Regular Contributor
Posts: 233

Re: Collapsing data using (arrays)

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

Capture.JPG

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 15 replies
  • 411 views
  • 7 likes
  • 5 in conversation