Dear all,
I have my data in the following format:
investor |
3A Capital Services Limited |
3A Capital Services Ltd |
A M M Vellayan Sons P Ltd |
Aadi Financial Advisors Llp |
Aagam Holdings Private Limited |
Abakkus Emerging Opportunities Fund-1 |
Abakkus Growth Fund-1 |
Aberdeen Global Indian Equity Fund Mauritius Ltd |
I need to generate a frequency table of the words that occur in the table above.
I need the output in the following format:
word | frequency |
3A | 2 |
capital | 2 |
services | 2 |
Limited | 2 |
Ltd | 3 |
A | 1 |
M | 1 |
M | 1 |
Vellayan | 1 |
sons | 1 |
p | 1 |
Aadi | 1 |
Financial | 1 |
Advisors | 1 |
Llp | 1 |
Aagam | 1 |
Holdings | 1 |
Private | 1 |
Abakkus | 2 |
Emerging | 1 |
Opportunities | 1 |
Fund-1 | 2 |
Growth | 1 |
Aberdeen | 1 |
Global | 1 |
Indian | 1 |
Equity | 1 |
fund | 1 |
Mauritius | 1 |
I am attaching a sample file in .csv format.
Please suggest SAS code for this.
Thank you in advance.
Please post your data as a data step (and not a .csv file), like this:
data have;
  /* truncover: a short line does not make INPUT read on into the next line */
  infile cards truncover;
  input investor $200.;
  /* data rows only; a header row here would be read as an observation */
  datalines;
3A Capital Services Limited
3A Capital Services Ltd
A M M Vellayan Sons P Ltd
Aadi Financial Advisors Llp
Aagam Holdings Private Limited
Abakkus Emerging Opportunities Fund-1
Abakkus Growth Fund-1
Aberdeen Global Indian Equity Fund Mauritius Ltd
;
run;
If you just need to count the occurrences of words, create a table with all the words (in upper case, as you do not care about case):
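data words;
  length word $30;
  set have;
  investor = upcase(investor);   /* ignore case when counting */
  /* _N_ serves as the loop counter; it is never written to the output */
  do _N_ = 1 to countw(investor, ' ');
    word = scan(investor, _N_, ' ');
    output;                      /* one observation per word */
  end;
  keep word;
run;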
You can then use e.g. PROC SUMMARY to get the total counts:
proc summary data=words nway;
class word;
output out=want(drop=_type_ rename=(_freq_=frequency));
run;
Why is "M" twice in the result? Are the words "Ltd" and "ltd" the same? What have you tried?
First idea: a loop with scan to get the words, a hash-object to count.
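A minimal sketch of that idea, assuming the HAVE data set and INVESTOR variable from the data step posted in this thread. The names counts, i, done and want_hash are made up for illustration, and words are upper-cased on the assumption that case should be ignored:

```sas
/* Sketch only: assumes data set HAVE with a character variable INVESTOR. */
data _null_;
  length word $30 frequency 8;
  call missing(word, frequency);

  /* Hash keyed by word; ordered: 'a' writes the result sorted by key. */
  declare hash counts(ordered: 'a');
  counts.defineKey('word');
  counts.defineData('word', 'frequency');
  counts.defineDone();

  /* Read all observations inside a single iteration of the data step. */
  do until (done);
    set have end=done;
    do i = 1 to countw(investor, ' ');
      word = upcase(scan(investor, i, ' '));
      /* find() returns 0 if the word is already stored and loads its frequency. */
      if counts.find() = 0 then frequency = frequency + 1;
      else frequency = 1;
      counts.replace();
    end;
  end;

  /* Write one observation per distinct word. */
  counts.output(dataset: 'want_hash');
  stop;
run;
```

Drop the upcase() call if "Ltd" and "LTD" should be counted separately.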
Ltd and ltd are different strings. If you want these to be considered identical, you need to convert them to a common upper or lower case.
can you please write the SAS CODE
Thanking you in advance
@srikanthyadav44 wrote:
can you please write the SAS CODE
Thanking you in advance
We have given you descriptions of the code. You were equipped at birth with a brain, so please use it.
If you need someone to do your whole work, hire someone.
First, expand the dataset so that you have one word only in your variable (DO loop, COUNTW, SCAN, OUTPUT).
Then run PROC FREQ on that.
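As a rough sketch of the second step, assuming the first step produced a data set called WORDS with one word per observation in a variable WORD (want_freq is a name chosen here for illustration):

```sas
/* Sketch only: assumes data set WORDS with one word per observation. */
proc freq data=words noprint;
  /* OUT= stores the counts; COUNT is renamed to match the requested layout. */
  tables word / out=want_freq(keep=word count rename=(count=frequency));
run;
```

NOPRINT suppresses the printed table, so only the output data set is created.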