
Dear all,

I have my data in the following format:

investor
3A Capital Services Limited
3A Capital Services Ltd
A M M Vellayan Sons P Ltd
Aadi Financial Advisors Llp
Aagam Holdings Private Limited
Abakkus Emerging Opportunities Fund-1
Abakkus Growth Fund-1
Aberdeen Global Indian Equity Fund Mauritius Ltd

 

I have to generate a frequency table of the words that appear in the above table.

I need the output in the following format:

word            frequency
3A              2
capital         2
services        2
Limited         2
Ltd             3
A               1
M               1
M               1
Vellayan        1
sons            1
p               1
Aadi            1
Financial       1
Advisors        1
Llp             1
Aagam           1
Holdings        1
Private         1
Abakkus         2
Emerging        1
Opportunities   1
Fund-1          2
Growth          1
Aberdeen        1
Global          1
Indian
Equity          1
fund            1
Mauritius       1

I am attaching a sample file in .csv format.

Please suggest SAS code for this.

Thanking you in advance.

 

 

1 ACCEPTED SOLUTION

s_lassen
Meteorite | Level 14

Please post your data as a data step (and not a .csv file), like this:

data have;
  /* truncover stops short lines from running into the next record */
  infile cards truncover;
  input investor $200.;
  /* data rows only: a header line here would be read as an investor name */
  datalines;
3A Capital Services Limited
3A Capital Services Ltd
A M M Vellayan Sons P Ltd
Aadi Financial Advisors Llp
Aagam Holdings Private Limited
Abakkus Emerging Opportunities Fund-1
Abakkus Growth Fund-1
Aberdeen Global Indian Equity Fund Mauritius Ltd
;
run;

If you just need to count the occurrences of words, create a table with all the words (in upper case, as you do not care about case):

data words;
  length word $30;
  set have;
  /* upper-case first so that "Ltd" and "LTD" count as one word */
  investor=upcase(investor);
  /* one output row per blank-delimited word; _N_ is reused as the loop counter */
  do _N_=1 to countw(investor,' ');
    word=scan(investor,_N_,' ');
    output;
  end;
  keep word;
run;

You can then use e.g. PROC SUMMARY to get the total counts:

proc summary data=words nway;
  class word;
  output out=want(drop=_type_ rename=(_freq_=frequency));
run;
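
PROC SUMMARY is only one option here. As a rough sketch of an alternative, PROC FREQ can write the same counts to a dataset, and a PROC SORT afterwards puts the most frequent words first. The dataset name want_freq is a hypothetical name chosen for this illustration; words is the table built in the previous step.

```sas
/* Count each distinct word; NOPRINT suppresses the printed table */
proc freq data=words noprint;
  /* OUT= holds WORD, COUNT and PERCENT; keep only the count, renamed */
  tables word / out=want_freq(drop=percent rename=(count=frequency));
run;

/* Most frequent words first, ties broken alphabetically */
proc sort data=want_freq;
  by descending frequency word;
run;
```

Either procedure should give the same counts; the choice is mostly a matter of which output layout you prefer.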


8 REPLIES
andreas_lds
Jade | Level 19

Why is "M" twice in the result? Are the words "Ltd" and "ltd" the same? What have you tried?

First idea: a loop with scan to get the words, a hash-object to count.
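
A minimal sketch of that idea, assuming the input dataset is called have and holds a character variable investor. The output name want_hash and the variable names word, frequency and i are assumptions made for this illustration, and words are upper-cased so that "Ltd" and "ltd" are counted together.

```sas
data _null_;
  length word $30 frequency 8;
  /* Build the hash once: key = word, data = word and its running count */
  if _n_ = 1 then do;
    declare hash counts(ordered: 'a');
    counts.defineKey('word');
    counts.defineData('word', 'frequency');
    counts.defineDone();
  end;
  set have end=last;
  do i = 1 to countw(investor, ' ');
    word = upcase(scan(investor, i, ' '));
    /* find() loads the stored count when the word is already known */
    if counts.find() ne 0 then frequency = 0;
    frequency = frequency + 1;
    /* replace() adds the key if it is new, otherwise updates it */
    counts.replace();
  end;
  /* After the last row, write the hash contents to a dataset */
  if last then counts.output(dataset: 'want_hash');
run;
```

This does the splitting and counting in a single pass, without an intermediate table of words.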

srikanthyadav44
Quartz | Level 8
I have shown it just as an example. Ltd and ltd are the same word and should be counted together.
srikanthyadav44
Quartz | Level 8

Can you please write the SAS code?

Thanking you in advance.

Kurt_Bremser
Super User

@srikanthyadav44 wrote:

Can you please write the SAS code?

Thanking you in advance.


DO Statement: Iterative 

COUNTW Function 

SCAN Function 

OUTPUT Statement 
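
As a small illustration of how these four pieces fit together, here is a sketch that splits one literal string into words. The dataset name demo and the variables name, n_words, i and word are hypothetical names used only for this example.

```sas
data demo;
  length word $30;
  name = 'A M M Vellayan Sons P Ltd';
  /* COUNTW: number of blank-delimited words in the string (7 here) */
  n_words = countw(name, ' ');
  /* Iterative DO: one pass per word */
  do i = 1 to n_words;
    /* SCAN: extract the i-th blank-delimited word */
    word = scan(name, i, ' ');
    /* OUTPUT: write one row per word instead of one row per step iteration */
    output;
  end;
run;
```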

 

We have given you descriptions of the code. You were equipped at birth with a brain, so please use it.

If you need someone to do your whole work, hire someone.

srikanthyadav44
Quartz | Level 8
Can you please show me the SAS code?
I am not able to do it.

Thanking you in advance.