Krishnaid
Calcite | Level 5

I am trying to create two-letter bigrams using arrays, followed by several further steps (PROC TRANSPOSE and STDZ), to arrive at the desired result. I would like to achieve the same result with fewer steps, ideally using just arrays.

I have heard that this is doable in Enterprise Miner / the text mining module, but is there a better way to do it efficiently with Base SAS or macros?

Here is what I have tried:

data test;
input x$1-14 ;
datalines;
test one
test two
test three
;
run;

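data bigram(drop=i);
  set test;
  n + 1;                  /* running row number */
  length v $2;            /* each bigram is exactly two characters */
  do i = 1 to lengthn(x) - 1;
    v = substr(x, i, 2);  /* characters i and i+1 of x */
    output;               /* one output row per bigram */
  end;
run;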


x	n	v
test one	1	te
test one	1	es
test one	1	st
test one	1	t 
test one	1	 o
test one	1	on
test one	1	ne
test two	2	te
test two	2	es
test two	2	st
test two	2	t 
test two	2	 t
test two	2	tw
test two	2	wo
test three	3	te
test three	3	es
test three	3	st
test three	3	t 
test three	3	 t
test three	3	th
test three	3	hr
test three	3	re
test three	3	ee

I am looking for a way to minimize the number of intermediate steps that is also computationally efficient when dealing with a huge number of observations.

The idea is to take the unique two-letter bigrams across the three rows and, for each row, code a bigram as 1 if it occurs in that string and 0 otherwise.

Desired result

x	te	es	st	t	o	on	ne	tw	wo	th	hr	re	ee
test one	1	1	1	1	1	1	1	0	0	0	0	0	0
test two	1	1	1	1	0	0	0	1	1	0	0	0	0
test three	1	1	1	1	0	0	0	0	0	1	1	1	1
Reeza
Super User

I don't think arrays will be any more efficient here, because the n-grams are your variable names. If they were part of the data, then yes, an array could work.

 

One other possible method:

 

There is a fixed set of possible combinations: 26 × 26 = 676 ordered two-letter bigrams, or 27 × 27 = 729 if a blank counts as a character, as it does in your output. Create a variable for every one of them, initialize them to 0, and set the indicator to 1 as you find each bigram. But if your data is small, it may be overkill to have that many variables.
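
The "create every combination up front" idea could be sketched in a single DATA step, roughly as below. This is only a sketch under some assumptions: the alphabet is the 26 lowercase letters plus a blank, and bigrams are ordered pairs, so it creates 27 × 27 = 729 indicator variables. The macro name bigram_names, the output data set want, the bg_ prefix, and the use of "_" to stand in for a blank in variable names are all made up for illustration. Characters outside that alphabet are ignored.

/* Hypothetical helper: emits 729 variable names (bg___ ... bg_zz). */
/* "_" stands in for a blank, e.g. bg_t_ is "t " and bg__o is " o". */
%macro bigram_names;
  %local i j alphabet name;
  %let alphabet = _abcdefghijklmnopqrstuvwxyz;
  %do i = 1 %to 27;
    %do j = 1 %to 27;
      %let name = bg_%substr(&alphabet, &i, 1)%substr(&alphabet, &j, 1);
      &name
    %end;
  %end;
%mend bigram_names;

data want(drop=i r c s);
  set test;
  /* Row = first character, column = second character of the bigram. */
  array bg{27,27} 3 %bigram_names;
  length s $14;
  /* Lowercase copy of x with blanks replaced by "_" to match the names. */
  s = translate(lowcase(x), '_', ' ');
  /* Array variables are reset to missing on every row, so set them to 0. */
  do r = 1 to 27;
    do c = 1 to 27;
      bg{r,c} = 0;
    end;
  end;
  /* Flag every bigram that occurs in this row. */
  do i = 1 to lengthn(x) - 1;
    r = index('_abcdefghijklmnopqrstuvwxyz', char(s, i));
    c = index('_abcdefghijklmnopqrstuvwxyz', char(s, i + 1));
    if r > 0 and c > 0 then bg{r,c} = 1;
  end;
run;

This makes one pass over the data with no transpose, at the cost of carrying every possible column. Bigrams that never occur stay 0 in every row, so those columns would still need to be dropped afterwards if only the observed bigrams are wanted.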
