| gvkey | fyear | TA | Sum (want) | 
| 1700 | 1995 | 3311.662 | 6691.929 | 
| 1700 | 1995 | 1599.234 | 6691.929 | 
| 1700 | 1995 | 656.226 | 6691.929 | 
| 1700 | 1995 | 564.76 | 6691.929 | 
| 1700 | 1995 | 560.047 | 6691.929 | 
| 1700 | 1996 | 4182.832 | 10756.85 | 
| 1700 | 1996 | 2068.554 | 10756.85 | 
| 1700 | 1996 | 1727.069 | 10756.85 | 
| 1700 | 1996 | 1696.431 | 10756.85 | 
| 1700 | 1996 | 1081.963 | 10756.85 | 
| 1700 | 1996 | 841.204 | 10756.85 | 
| 1700 | 1996 | 577.841 | 10756.85 | 
| 1700 | 1996 | . | 10756.85 | 
| 84578 | 1995 | 3700.925 | . | 
| 84578 | 1995 | 1312.852 | . | 
| 84578 | 1996 | 1131.176 | . | 
| 84578 | 1996 | 1099.786 | . | 
| 84578 | 1996 | 1067.566 | . | 
| 84578 | 1996 | 848.657 | . | 
Here, I want to add TA for the first 5 highest TA. if TA for each year is less than 5 observations,
I want them to be shown as missing sum as in for gvkey 84578.
Can anyone help me?
data have;
input gvkey	fyear	TA;*	Sum (want);
cards;
1700	1995	3311.662	6691.929
1700	1995	1599.234	6691.929
1700	1995	656.226	6691.929
1700	1995	564.76	6691.929
1700	1995	560.047	6691.929
1700	1996	4182.832	10756.85
1700	1996	2068.554	10756.85
1700	1996	1727.069	10756.85
1700	1996	1696.431	10756.85
1700	1996	1081.963	10756.85
1700	1996	841.204	10756.85
1700	1996	577.841	10756.85
1700	1996	.	10756.85
84578	1995	3700.925	.
84578	1995	1312.852	.
84578	1996	1131.176	.
84578	1996	1099.786	.
84578	1996	1067.566	.
84578	1996	848.657	.
;
proc sort data=have out=_have;
 by gvkey fyear descending ta;
run;
data want;
 do _n_=1 by 1 until(last.fyear);
  set _have;
  by gvkey fyear;
  if _n_<=5 then sum=sum(sum,ta);
 end;
 if _n_<5 then sum=.;
 do _n_=1 to _n_;
  set _have;
  output;
 end;
run;
data have;
input gvkey	fyear	TA;*	Sum (want);
cards;
1700	1995	3311.662	6691.929
1700	1995	1599.234	6691.929
1700	1995	656.226	6691.929
1700	1995	564.76	6691.929
1700	1995	560.047	6691.929
1700	1996	4182.832	10756.85
1700	1996	2068.554	10756.85
1700	1996	1727.069	10756.85
1700	1996	1696.431	10756.85
1700	1996	1081.963	10756.85
1700	1996	841.204	10756.85
1700	1996	577.841	10756.85
1700	1996	.	10756.85
84578	1995	3700.925	.
84578	1995	1312.852	.
84578	1996	1131.176	.
84578	1996	1099.786	.
84578	1996	1067.566	.
84578	1996	848.657	.
;
proc sort data=have out=_have;
 by gvkey fyear descending ta;
run;
data want;
 do _n_=1 by 1 until(last.fyear);
  set _have;
  by gvkey fyear;
  if _n_<=5 then sum=sum(sum,ta);
 end;
 if _n_<5 then sum=.;
 do _n_=1 to _n_;
  set _have;
  output;
 end;
run;
I have some observations where TA is missing but there more than 5 TA observations. I want them to be missing too. As long as I have 5 given TA observations then I want their sum even if the 6th observation is missing
gvkey fyear ta sum
1000 1996 21.21 .
1000 1996 . .
1000 1996 . .
1000 1996 . .
1000 1996 . .
1000 1996 . .
@abdulla try-
data have;
input gvkey	fyear	TA;*	Sum (want);
cards;
1700	1995	3311.662	6691.929
1700	1995	1599.234	6691.929
1700	1995	656.226	6691.929
1700	1995	564.76	6691.929
1700	1995	560.047	6691.929
1700	1996	4182.832	10756.85
1700	1996	2068.554	10756.85
1700	1996	1727.069	10756.85
1700	1996	1696.431	10756.85
1700	1996	1081.963	10756.85
1700	1996	841.204	10756.85
1700	1996	577.841	10756.85
1700	1996	.	10756.85
84578	1995	3700.925	.
84578	1995	1312.852	.
84578	1996	1131.176	.
84578	1996	1099.786	.
84578	1996	1067.566	.
84578	1996	848.657	.
1000 1996 21.21 .
1000 1996 . .
1000 1996 . .
1000 1996 . .
1000 1996 . .
1000 1996 . .
;
proc sort data=have out=_have;
 by gvkey fyear descending ta;
run;
data want;
 do _n_=1 by 1 until(last.fyear);
  set _have;
  by gvkey fyear;
  if _n_<=5 then do;
    _n=sum(_n,n(ta));
    sum=sum(sum,ta);
  end;
 end;
 if _n_<5 or _n<5 then sum=.;
 do _n_=1 to _n_;
  set _have;
  output;
 end;
 drop _n;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
