dear all,
i have the data of in the following format
company_name | year | percent_shares |
20 Microns Ltd. | 2010 | 0.07 |
20 Microns Ltd. | 2010 | 0.77 |
20 Microns Ltd. | 2010 | 2.5 |
20 Microns Ltd. | 2010 | 0.57 |
20 Microns Ltd. | 2010 | 0.35 |
20 Microns Ltd. | 2010 | 1.95 |
20 Microns Ltd. | 2010 | 1.85 |
20 Microns Ltd. | 2010 | 0.1 |
20 Microns Ltd. | 2010 | 0.5 |
20 Microns Ltd. | 2010 | 10.76 |
20 Microns Ltd. | 2010 | 0.37 |
20 Microns Ltd. | 2010 | 0.01 |
20 Microns Ltd. | 2010 | 9.26 |
20 Microns Ltd. | 2010 | 0.61 |
20 Microns Ltd. | 2010 | 0.13 |
20 Microns Ltd. | 2010 | 0.14 |
20 Microns Ltd. | 2010 | 0.08 |
20 Microns Ltd. | 2010 | 0.12 |
20 Microns Ltd. | 2010 | 0.03 |
20 Microns Ltd. | 2010 | 1.65 |
20 Microns Ltd. | 2010 | 0.21 |
20 Microns Ltd. | 2010 | 0.03 |
20 Microns Ltd. | 2010 | 0.04 |
20 Microns Ltd. | 2010 | 0.03 |
20 Microns Ltd. | 2010 | 1.95 |
20 Microns Ltd. | 2010 | 0.1 |
20 Microns Ltd. | 2010 | 0 |
20 Microns Ltd. | 2010 | 6.81 |
20 Microns Ltd. | 2010 | 1.25 |
20 Microns Ltd. | 2010 | 0.58 |
20 Microns Ltd. | 2010 | 0.18 |
20 Microns Ltd. | 2010 | 0.18 |
20 Microns Ltd. | 2010 | 1.05 |
20 Microns Ltd. | 2010 | 0.36 |
20 Microns Ltd. | 2010 | 1.39 |
20 Microns Ltd. | 2010 | 0.07 |
20 Microns Ltd. | 2010 | 0.64 |
20 Microns Ltd. | 2010 | 0.07 |
20 Microns Ltd. | 2010 | 0.33 |
20 Microns Ltd. | 2010 | 0.17 |
20 Microns Ltd. | 2010 | 18.67 |
20 Microns Ltd. | 2010 | 1.65 |
20 Microns Ltd. | 2010 | 0.77 |
20 Microns Ltd. | 2010 | 2.5 |
20 Microns Ltd. | 2010 | 0.57 |
20 Microns Ltd. | 2010 | 0.35 |
20 Microns Ltd. | 2010 | 1.95 |
20 Microns Ltd. | 2010 | 1.85 |
20 Microns Ltd. | 2010 | 10.8 |
20 Microns Ltd. | 2010 | 18.94 |
20 Microns Ltd. | 2010 | 0.14 |
20 Microns Ltd. | 2010 | 0.12 |
20 Microns Ltd. | 2010 | 0.08 |
20 Microns Ltd. | 2010 | 0.12 |
20 Microns Ltd. | 2010 | 0.21 |
20 Microns Ltd. | 2010 | 1.95 |
20 Microns Ltd. | 2010 | 6.81 |
20 Microns Ltd. | 2010 | 0.06 |
20 Microns Ltd. | 2010 | 1.19 |
20 Microns Ltd. | 2010 | 0.58 |
20 Microns Ltd. | 2010 | 1.05 |
20 Microns Ltd. | 2010 | 1.39 |
20 Microns Ltd. | 2010 | 0.33 |
3I Infotech Ltd. | 2010 | 2.73 |
3I Infotech Ltd. | 2010 | 1.39 |
3I Infotech Ltd. | 2010 | 1.59 |
3I Infotech Ltd. | 2010 | 8.12 |
3I Infotech Ltd. | 2010 | 2.57 |
3I Infotech Ltd. | 2010 | 1.75 |
3I Infotech Ltd. | 2010 | 2.07 |
3I Infotech Ltd. | 2010 | 3.5 |
3I Infotech Ltd. | 2010 | 1.9 |
3I Infotech Ltd. | 2010 | 2.07 |
3I Infotech Ltd. | 2010 | 23.13 |
3M India Ltd. | 2010 | 1.2 |
3M India Ltd. | 2010 | 1.93 |
3M India Ltd. | 2010 | 1.48 |
3M India Ltd. | 2010 | 2.9 |
3M India Ltd. | 2010 | 1.93 |
3M India Ltd. | 2010 | 76 |
3P Land Holdings Ltd. | 2010 | 4.87 |
3P Land Holdings Ltd. | 2010 | 1.2 |
3P Land Holdings Ltd. | 2010 | 5.56 |
3P Land Holdings Ltd. | 2010 | 2.11 |
3P Land Holdings Ltd. | 2010 | 4.58 |
3P Land Holdings Ltd. | 2010 | 2.24 |
3P Land Holdings Ltd. | 2010 | 0.34 |
3P Land Holdings Ltd. | 2010 | 2.74 |
3P Land Holdings Ltd. | 2010 | 3.47 |
3P Land Holdings Ltd. | 2010 | 4.59 |
3P Land Holdings Ltd. | 2010 | 21.92 |
3P Land Holdings Ltd. | 2010 | 2.04 |
3P Land Holdings Ltd. | 2010 | 4.65 |
3P Land Holdings Ltd. | 2010 | 3.58 |
3P Land Holdings Ltd. | 2010 | 13.94 |
it represents the company name, year and shareholding percentage. it the panel data of nearly 5000 companies from 2010 to 2019.
i have to find out the sum of first three largest shareholdings for each company for every year from 2010 to 2019.
my output should have the following information
company_name | year | sum of three largest shareholdigns |
please suggest the appropriate SAS CODE
thanks in advance
Compute the sum for ALL companies using PROC SUMMARY, then sort so the largest are first, then take the first three each year.
proc summary data=have nway;
class company_name year;
var percent_shares;
output out=_sums_ sum=;
run;
proc sort data=_sums_;
by year descending percent_shares;
run;
Compute the sum for ALL companies using PROC SUMMARY, then sort so the largest are first, then take the first three each year.
proc summary data=have nway;
class company_name year;
var percent_shares;
output out=_sums_ sum=;
run;
proc sort data=_sums_;
by year descending percent_shares;
run;
And then I thought of a better approach
proc summary data=have nway;
class year company_name;
var percent_shares;
output out=_sums_ sum=;
run;
proc rank data=_sums_ descending out=ranks;
by year;
var percent_shares;
ranks percent_shares_rank;
run;
The output of PROC RANK, variable named PERCENT_SHARES_RANK has values 1 2 3 for the top 3 each year.
You have a much better chance of getting an answer if you present your data as data step code, like this:
data have;
length company_name $40 year 8 percent_shares 8;
infile cards dsd delimiter=',';
input company_name year percent_shares;
cards;
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.77
20 Microns Ltd.,2010,2.5
20 Microns Ltd.,2010,0.57
20 Microns Ltd.,2010,0.35
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,1.85
20 Microns Ltd.,2010,0.1
20 Microns Ltd.,2010,0.5
20 Microns Ltd.,2010,10.76
20 Microns Ltd.,2010,0.37
20 Microns Ltd.,2010,0.01
20 Microns Ltd.,2010,9.26
20 Microns Ltd.,2010,0.61
20 Microns Ltd.,2010,0.13
20 Microns Ltd.,2010,0.14
20 Microns Ltd.,2010,0.08
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,1.65
20 Microns Ltd.,2010,0.21
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,0.04
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,0.1
20 Microns Ltd.,2010,0
20 Microns Ltd.,2010,6.81
20 Microns Ltd.,2010,1.25
20 Microns Ltd.,2010,0.58
20 Microns Ltd.,2010,0.18
20 Microns Ltd.,2010,0.18
20 Microns Ltd.,2010,1.05
20 Microns Ltd.,2010,0.36
20 Microns Ltd.,2010,1.39
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.64
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.33
20 Microns Ltd.,2010,0.17
20 Microns Ltd.,2010,18.67
20 Microns Ltd.,2010,1.65
20 Microns Ltd.,2010,0.77
20 Microns Ltd.,2010,2.5
20 Microns Ltd.,2010,0.57
20 Microns Ltd.,2010,0.35
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,1.85
20 Microns Ltd.,2010,10.8
20 Microns Ltd.,2010,18.94
20 Microns Ltd.,2010,0.14
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.08
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.21
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,6.81
20 Microns Ltd.,2010,0.06
20 Microns Ltd.,2010,1.19
20 Microns Ltd.,2010,0.58
20 Microns Ltd.,2010,1.05
20 Microns Ltd.,2010,1.39
20 Microns Ltd.,2010,0.33
3I Infotech Ltd.,2010,2.73
3I Infotech Ltd.,2010,1.39
3I Infotech Ltd.,2010,1.59
3I Infotech Ltd.,2010,8.12
3I Infotech Ltd.,2010,2.57
3I Infotech Ltd.,2010,1.75
3I Infotech Ltd.,2010,2.07
3I Infotech Ltd.,2010,3.5
3I Infotech Ltd.,2010,1.9
3I Infotech Ltd.,2010,2.07
3I Infotech Ltd.,2010,23.13
3M India Ltd.,2010,1.2
3M India Ltd.,2010,1.93
3M India Ltd.,2010,1.48
3M India Ltd.,2010,2.9
3M India Ltd.,2010,1.93
3M India Ltd.,2010,76
3P Land Holdings Ltd.,2010,4.87
3P Land Holdings Ltd.,2010,1.2
3P Land Holdings Ltd.,2010,5.56
3P Land Holdings Ltd.,2010,2.11
3P Land Holdings Ltd.,2010,4.58
3P Land Holdings Ltd.,2010,2.24
3P Land Holdings Ltd.,2010,0.34
3P Land Holdings Ltd.,2010,2.74
3P Land Holdings Ltd.,2010,3.47
3P Land Holdings Ltd.,2010,4.59
3P Land Holdings Ltd.,2010,21.92
3P Land Holdings Ltd.,2010,2.04
3P Land Holdings Ltd.,2010,4.65
3P Land Holdings Ltd.,2010,3.58
3P Land Holdings Ltd.,2010,13.94
;run;
Here is a solution:
proc sort;
by company_name year descending percent_shares;
run;
data want;
Three_largest=0;
do _N_=1 by 1 until(last.year);
set have;
by company_name year;
if _N_<=3 then
Three_largest+percent_shares;
end;
drop percent_shares;
run;
If your data are already grouped by company name and year, then you could keep an updated array of the 3 largest percentages in a data step: At the end of the year calculate the sum:
data have;
length company_name $40 year 8 percent_shares 8;
infile cards dsd delimiter=',';
input company_name year percent_shares;
cards;
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.77
20 Microns Ltd.,2010,2.5
20 Microns Ltd.,2010,0.57
20 Microns Ltd.,2010,0.35
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,1.85
20 Microns Ltd.,2010,0.1
20 Microns Ltd.,2010,0.5
20 Microns Ltd.,2010,10.76
20 Microns Ltd.,2010,0.37
20 Microns Ltd.,2010,0.01
20 Microns Ltd.,2010,9.26
20 Microns Ltd.,2010,0.61
20 Microns Ltd.,2010,0.13
20 Microns Ltd.,2010,0.14
20 Microns Ltd.,2010,0.08
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,1.65
20 Microns Ltd.,2010,0.21
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,0.04
20 Microns Ltd.,2010,0.03
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,0.1
20 Microns Ltd.,2010,0
20 Microns Ltd.,2010,6.81
20 Microns Ltd.,2010,1.25
20 Microns Ltd.,2010,0.58
20 Microns Ltd.,2010,0.18
20 Microns Ltd.,2010,0.18
20 Microns Ltd.,2010,1.05
20 Microns Ltd.,2010,0.36
20 Microns Ltd.,2010,1.39
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.64
20 Microns Ltd.,2010,0.07
20 Microns Ltd.,2010,0.33
20 Microns Ltd.,2010,0.17
20 Microns Ltd.,2010,18.67
20 Microns Ltd.,2010,1.65
20 Microns Ltd.,2010,0.77
20 Microns Ltd.,2010,2.5
20 Microns Ltd.,2010,0.57
20 Microns Ltd.,2010,0.35
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,1.85
20 Microns Ltd.,2010,10.8
20 Microns Ltd.,2010,18.94
20 Microns Ltd.,2010,0.14
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.08
20 Microns Ltd.,2010,0.12
20 Microns Ltd.,2010,0.21
20 Microns Ltd.,2010,1.95
20 Microns Ltd.,2010,6.81
20 Microns Ltd.,2010,0.06
20 Microns Ltd.,2010,1.19
20 Microns Ltd.,2010,0.58
20 Microns Ltd.,2010,1.05
20 Microns Ltd.,2010,1.39
20 Microns Ltd.,2010,0.33
3I Infotech Ltd.,2010,2.73
3I Infotech Ltd.,2010,1.39
3I Infotech Ltd.,2010,1.59
3I Infotech Ltd.,2010,8.12
3I Infotech Ltd.,2010,2.57
3I Infotech Ltd.,2010,1.75
3I Infotech Ltd.,2010,2.07
3I Infotech Ltd.,2010,3.5
3I Infotech Ltd.,2010,1.9
3I Infotech Ltd.,2010,2.07
3I Infotech Ltd.,2010,23.13
3M India Ltd.,2010,1.2
3M India Ltd.,2010,1.93
3M India Ltd.,2010,1.48
3M India Ltd.,2010,2.9
3M India Ltd.,2010,1.93
3M India Ltd.,2010,76
3P Land Holdings Ltd.,2010,4.87
3P Land Holdings Ltd.,2010,1.2
3P Land Holdings Ltd.,2010,5.56
3P Land Holdings Ltd.,2010,2.11
3P Land Holdings Ltd.,2010,4.58
3P Land Holdings Ltd.,2010,2.24
3P Land Holdings Ltd.,2010,0.34
3P Land Holdings Ltd.,2010,2.74
3P Land Holdings Ltd.,2010,3.47
3P Land Holdings Ltd.,2010,4.59
3P Land Holdings Ltd.,2010,21.92
3P Land Holdings Ltd.,2010,2.04
3P Land Holdings Ltd.,2010,4.65
3P Land Holdings Ltd.,2010,3.58
3P Land Holdings Ltd.,2010,13.94
;run;
data want (drop=_: percent_shares);
set have;
by company_name year notsorted;
array pct {0:3} _newpct _pct1-_pct3;
retain _pct: ;
if first.year then call missing(of _pct:);
if percent_shares>_pct1 then do;
_newpct=percent_shares;
call sortn(of pct{*});
end;
if last.year;
sum_of_3_largest=sum(of _pct:);
run;
In future posts, please post data in usable form, as shown by @s_lassen .
Proc Summary can do most of the task, unfortunately there seems to be no way avoiding the transpose that takes place:
proc summary data=have nway;
class company_name year;
var percent_shares;
output out=top3(drop= _:) idgroup(max(percent_shares) out[3] (percent_shares)=) / autolabel autoname;
run;
data want;
set top3;
sum_top3 = sum(of percent_shares:);
drop percent_shares:;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.