BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ammarhm
Lapis Lazuli | Level 10

Hi all

Could anyone please help me with this:

I have a data set that looks like this:

                 

Person EarningDate
1122001-01-23
1342001-02-03
1232001-02-25
2452001-03-04
2322001-03-27
372001-04-04
3342001-05-05
3652001-05-26
3332001-06-06
3232001-07-02
4332001-02-03
4122001-03-05

What I am trying to do is the following

PersonSum EarningEarning Period
1total sum for person 1Last earning date - First earning day person 1
2total sum for person 2Last earning date - First earning day person 2
3total sum for person 3Last earning date - First earning day person 3
4total sum for person 4Last earning date - First earning day person 4

I a guess this would be done through arrays but I am really clueless....

Could anyone please help me with the code??

Kind regards

ammarhm

1 ACCEPTED SOLUTION

Accepted Solutions
Hima
Obsidian | Level 7

I hope this is what you need as the output.

Capture.JPG

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want1 as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

proc sql;
create table want2 as
select distinct person, earning as last_earning
from have
group by person
having date=max(date);
quit;

data want;
merge want1(in=a) want2(in=b);
by person;
if a=b;
run;

View solution in original post

15 REPLIES 15
data_null__
Jade | Level 19

You don't need arrays.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary sum(earning)=earning min(date)=min max(date)=max;
  
run;
data summary;
   set summary;
   period = max-min;
  
run;
proc print;
  
run;
ammarhm
Lapis Lazuli | Level 10

Great one data_null

Much appreciated

ammarhm
Lapis Lazuli | Level 10

Just another question

Would you be so kind and tell me how to add another column to the output dataset showing the last earning per person?

Kind regards

Hima
Obsidian | Level 7

To add the last earning date column


data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

Edited:

proc sql;

create table want1 as

select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period,

max(date) as max_date format yymmdd10.

from have

group by person;

quit;


Capture.JPG
ammarhm
Lapis Lazuli | Level 10

Thanks Hima

I was looking into having the last earning (not the last earning date)

Any suggestions? The proc sql is a great option

Kind regards

Hima
Obsidian | Level 7

You mean the last earning for person 1 is  23 person 2 is 32 etc...

ammarhm
Lapis Lazuli | Level 10

Yes exactly, thanks in advance

ammarhm
Lapis Lazuli | Level 10

The real reason is that I want to modify the table so I would get the sum of all earnings per person excluding the last one, so I thought I might do a sum(earning)-last earning

you get the point, i am sure

Hima
Obsidian | Level 7

I hope this is what you need as the output.

Capture.JPG

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want1 as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

proc sql;
create table want2 as
select distinct person, earning as last_earning
from have
group by person
having date=max(date);
quit;

data want;
merge want1(in=a) want2(in=b);
by person;
if a=b;
run;

data_null__
Jade | Level 19

Yes that can be done with PROC SUMMARY IDGROUP.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary(drop=_:) sum(earning)=earning min(date)=min
     
idgroup(max(date) out(date earning)=max lastearning);
   run;
data summary;
   set summary;
   period = max-min;
  
run;
proc print;
  
run;

9-8-2014 9-00-58 PM.png
Ksharp
Super User

NULL ,

You can use RANGE as well.No need to overwrite the table .

  proc summary data=person nway; 
   class person;
   output out=summary sum(earning)= range(date)= /autoname; 
   run; 

Xia Keshan

data_null__
Jade | Level 19

Good point.  I have modified.

data person;
   infile cards4 dsd firstobs=2;
  
input person earning date :mmddyy.;
  
format date mmddyy.;
  
cards;
Person,Earning,Date
1,12,1/23/2001
1,34,2/3/2001
1,23,2/25/2001
2,45,3/4/2001
2,32,3/27/2001
3,7,4/4/2001
3,34,5/5/2001
3,65,5/26/2001
3,33,6/6/2001
3,23,7/2/2001
4,33,2/3/2001
4,12,3/5/2001
;;;;
   run;
proc print;
  
run;
proc summary data=person nway;
  
class person;
   output out=summary(drop=_:) sum(earning)=earning min(date)=min range(date)=period
     
idgroup(max(date) out(date earning)=max lastearning);
   format period: ;
   run;
proc print;
  
run;
9-9-2014 8-21-50 AM.png


stat_sas
Ammonite | Level 13

proc sort data=have;
by person date;
run;

data want(drop=date earning);
set have;
by person date;
retain first_earning_date;
format first_earning_date last_earning_date ddmmyy10.;
if first.person then do;
sum_earning=0;
first_earning_date=date;
end;
sum_earning+earning;
if last.person then do;
last_earning_date=date;
output;
end;
run;

proc print data=want;
run;

Hima
Obsidian | Level 7

data have;
input person earning Date;
format Date yymmdd10.;
informat Date yymmdd10.;
datalines;
1 12 2001-01-23
1 34 2001-02-03
1 23 2001-02-25
2 45 2001-03-04
2 32 2001-03-27
3  7 2001-04-04
3 34 2001-05-05
3 65 2001-05-26
3 33 2001-06-06
3 23 2001-07-02
4 33 2001-02-03
4 12 2001-03-05
;
run;

proc sql;
create table want as
select distinct person, sum(earning) as sum_earning,max(date)-min(date) as earning_period
from have
group by person;
quit;

Capture.JPG

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 15 replies
  • 2151 views
  • 7 likes
  • 5 in conversation