Making all possible combinations of all words from a given string (N-Grams Concept)

Hi Experts,

I am trying to get every two-word combination from a given string, keeping the words in their original order. For example, suppose I have a string like "The cow jumps over the moon" and want combinations like:

The cow
The jumps
The over
The the
The moon
cow jumps
cow over
cow the
cow moon
jumps over
jumps the
jumps moon
over the
over moon
the moon

Please help me achieve this. I am trying the following code:

data test;
  sen = "The cow jumps over the moon";
run;

data test1;
  set test;
  length sen1 $100.;
  sen1="";
  retain sen1;
  do i=0 to count(compbl(TRIM(sen))," ");
    do j=1 to 2;
      sen1=compbl(sen1||" "||compbl(trim(scan(compbl(TRIM(sen)),i+j," ","MO"))));
      ngram= j;
      r=count(TRIM(sen)," ");
      x=count(compbl(TRIM(sen))," ");
      output;
    end;
    sen1="";
  end;
  if ngram=2;
run;

 

Thanks,

Rahul


Accepted Solutions
Solution
‎01-03-2017 08:57 AM
Super User
Posts: 5,497

Re: making all possible combinations of all words from a given string(N-Grams Concept)

Posted in reply to Rahul_SAS

Using the data you already have:

 

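data want;
  set test;
  nitems = countw(sen);              /* number of words in the sentence */
  length combo $ 100;
  if nitems > 1;                     /* a pair needs at least two words */
  do i=1 to nitems - 1;              /* position of the first word */
    do j=i+1 to nitems;              /* every later word, so order is preserved */
      combo = catx(' ', scan(sen, i), scan(sen, j));
      output;
    end;
  end;
run;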
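
A side note on delimiters: COUNTW and SCAN, when called without a delimiter argument, split on a default list that includes punctuation such as periods, commas and hyphens as well as blanks. The sketch below is one possible variant that passes a blank as the only delimiter, so a token like "well-known" would stay in one piece. It assumes the TEST data set from the question; the output name WANT_PAIRS is made up for illustration.

```sas
/* Input from the original question */
data test;
  sen = "The cow jumps over the moon";
run;

/* Assumed variant: blank is the only word delimiter */
data want_pairs;                     /* hypothetical output name */
  set test;
  length combo $ 100;
  nitems = countw(sen, ' ');
  /* With fewer than two words the outer loop never runs, so nothing is output */
  do i = 1 to nitems - 1;
    do j = i + 1 to nitems;
      combo = catx(' ', scan(sen, i, ' '), scan(sen, j, ' '));
      output;
    end;
  end;
  drop i j nitems;
run;
```

For the six-word example sentence this writes one row per ordered pair, with each pair in the COMBO variable.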


All Replies
Super Contributor
Posts: 340

Re: making all possible combinations of all words from a given string(N-Grams Concept)

Posted in reply to Rahul_SAS
Data A;
  Input Text : $ Nr @@;
  Datalines;
The 1 cow 2 jumps 3 over 4 the 5 moon 6
;

*** 2nd select did not really make sense ..*;
Proc SQL;
  Create Table B As
  Select a1.Text As Text1, a2.Text As Text2
  From A a1, A a2
  Where a1.Nr lt a2.Nr;
Quit;
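
The reply above types the words and their positions in by hand. As a rough sketch, the same table A could instead be built from the TEST data set in the question, with SCAN supplying each word and the loop index supplying Nr. The length of 32 for Text and the single-sentence input are assumptions; with several sentences in TEST, a sentence identifier would also be needed in the join condition.

```sas
/* Input from the original question */
data test;
  sen = "The cow jumps over the moon";
run;

/* One row per word, numbered by position; blank is the only delimiter */
data A;
  set test;
  length Text $ 32;                  /* assumed maximum word length */
  do Nr = 1 to countw(sen, ' ');
    Text = scan(sen, Nr, ' ');
    output;
  end;
  keep Text Nr;
run;

/* Same self-join as in the reply: pair each word with every later word */
proc sql;
  create table B as
  select a1.Text as Text1, a2.Text as Text2
  from A a1, A a2
  where a1.Nr lt a2.Nr;
quit;
```

PROC SQL does not promise any particular row order here, so an ORDER BY a1.Nr, a2.Nr clause could be added if the pairs should appear in sentence order.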