Hi All,
Here is the original dataset:
data m01;
a="CGA AMZN.COM CA";output;
a="CHA AMZN.COM HK";output;
run;
The desired output dataset looks like this:
Word
CGA
AMZN.COM
CA
CHA
AMZN.COM
HK
Any idea?
Here is mine:
data m02;
set m01;
an=translate(a,"@"," ");
run;
data m02_1;
set m02;
an=tranwrd(an,"@@","@");
run;
%macro REEE;
%do i=1 %to 10;
data m02_1;
set m02_1;
an=tranwrd(an,"@@","@");
run;
%end;
%mend;
%REEE;
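data m03;
set m02_1;
bn=an;
No=_N_;
do until(j=0);
j=find(bn,"@"); /* position of the next delimiter, 0 if none is left */
if j>0 then do;
Word=substrn(bn,1,j-1); /* text before the delimiter */
bn=substrn(bn,j+1); /* keep the remainder for the next pass */
end;
else Word=bn; /* no delimiter left, so the remainder is the last word */
if Word ne " " then output;
end;
run;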
In fact, the original dataset contains thousands of merchant descriptions, and I want to break them down into words.
What would you suggest?
Thanks in advance.
You can do this in a single step with the COUNTW and SCAN functions, specifying the blank as the word separator.
data want ;
length word $ 20;
set m01;
do i = 1 to countw(a,' ');
word=(scan(a,i,' '));
output;
end;
run;
Regards,
The paper "more _infile_ magic" at http://www2.sas.com/proceedings/sugi28/086-28.pdf
shows a way to apply the parsing of the INPUT statement to data that comes from a table or data set variable.
You can then populate a hash table of word counters in a single pass.
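A minimal sketch of the hash-table word counter could look like the step below. Note that it splits the text with COUNTW and SCAN rather than with the _INFILE_ technique from the paper, and that the names word_counts, word and count are only illustrative assumptions, not something from the original post.

```sas
/* Sketch: count how often each word occurs in m01, in a single pass. */
/* word_counts, word and count are hypothetical names for illustration. */
data _null_;
  length word $ 20 count 8;
  if _n_ = 1 then do;
    declare hash h(ordered: 'a');   /* keep keys in ascending order */
    h.defineKey('word');
    h.defineData('word', 'count');
    h.defineDone();
  end;
  set m01 end=last;
  do i = 1 to countw(a, ' ');
    word = scan(a, i, ' ');
    if h.find() = 0 then count = count + 1;  /* word already in the table */
    else count = 1;                          /* first occurrence */
    h.replace();                             /* add or update the entry */
  end;
  if last then h.output(dataset: 'word_counts');  /* write the table once */
run;
```

With the two sample rows from m01, AMZN.COM should end up with a count of 2 and every other word with a count of 1. Words longer than 20 characters would be truncated, so the length of word may need to be increased for real merchant descriptions.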