Anagrams Within Words

Trusted Advisor
Posts: 1,301

It has been almost forever since I shared a programming puzzle here.  Let's see how much interest there is with a new one today.

Given two words, determine if the first word, or any anagram of it, appears in consecutive characters of the second word. For instance, cat appears as an anagram in the first three letters of actor, but car does not appear as an anagram in actor even though all the letters of car appear in actor.

Your task is to write a macro, function, DATA step, or DS2 package that determines whether an anagram of one word is present in another.

This puzzle is from the following source:

Programming Praxis

http://programmingpraxis.com/2014/02/21/anagrams-within-words/

Regards,

FriedEgg

Regular Contributor
Posts: 217

Re: Anagrams Within Words

   

data _null_;
  yes = 0;
  no  = 0;
  word1='cat';
  word2='actor';
  first3=substr(word2,1,3);
  /* count the letters of word1 that occur in the first three characters of word2 */
  do i = 1 to 3;
    if i=1 then counter=0;
    if index(first3,substr(word1,i,1))> 0 then counter+1; else counter+0;
  end;
  if counter = 3 then yes = 1;
  else if counter < 3 then no  = 1;
  put counter= yes= no=;
run;

counter=3 yes=1 no=0

data _null_;
  yes = 0;
  no  = 0;
  word1='car';
  word2='actor';
  first3=substr(word2,1,3);
  /* count the letters of word1 that occur in the first three characters of word2 */
  do i = 1 to 3;
    if i=1 then counter=0;
    if index(first3,substr(word1,i,1))> 0 then counter+1; else counter+0;
  end;
  if counter = 3 then yes = 1;
  else if counter < 3 then no  = 1;
  put counter= yes= no=;
run;

counter=2 yes=0 no=1

 

   

Trusted Advisor
Posts: 1,301

Re: Anagrams Within Words

Nicely done, jwillis.  Might I suggest making your code more flexible so that it handles search terms of varying length?

Regular Contributor
Posts: 217

Re: Anagrams Within Words

Absolutely!  I was thinking quick and nimble, down and dirty.  I can also place the logic in a macro; place the code in a stored process (I am learning this); adapt the code to look at the middle(?) three(?) characters; adapt the code to search the full length of a word; make the DO loop start and end values variables instead of one and three; and so on.  I was also testing myself on Enterprise Guide, since I'm trying to go from a Base SAS lifer to "new age" processing.
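
For anyone following along, here is a rough sketch of what that generalization might look like: it slides a window the length of the search word across the whole second word, and "uses up" each matched letter so that repeated letters are counted properly. This is only one possible approach, not code posted by anyone in this thread. The variable names (window, start, matched, found) are made up for illustration, and it assumes the words never contain an asterisk, which is used as the "already matched" placeholder.

```sas
data _null_;
  length word container window $ 100;
  input word container;
  n = lengthn(word);
  found = 0;
  /* slide a window of n characters across container */
  do start = 1 to lengthn(container) - n + 1 while (not found);
    window = substr(container, start, n);
    matched = 1;
    /* look up each letter of word in the window and blank it out with '*' */
    /* so that a repeated letter cannot be matched twice                   */
    do k = 1 to n while (matched);
      p = index(window, substr(word, k, 1));
      if p = 0 then matched = 0;
      else substr(window, p, 1) = '*';
    end;
    found = matched;
  end;
  put word= container= found=;
  datalines;
cat actor
car actor
dinner thundering
;
run;
```

With the examples from the puzzle statement, cat should be reported as found in actor and car as not found. If the search word is longer than the container, the outer loop never runs and found stays 0.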

Trusted Advisor
Posts: 1,301

Re: Anagrams Within Words

I am glad you found it useful.  I used to post these more frequently; you can also look back at some of the previous ones:

Trusted Advisor
Posts: 1,301

Re: Anagrams Within Words

data anagram;
input ( word container ) ( : $100. );
cards;
cat     actor
dinner  thundering
cab     actor
num     immunoglobulin
;
run;

data foobar;
    length word container _word_ _compare_ $ 100;
    set anagram;

    * prepare search word: copy it into the array, sort the letters, read them back;
    array w[100] $ 1 _temporary_;
    call pokelong( word || repeat( '00'x , 99 - length( word ) ) , addrlong( w[1] ) , 100 );
    call sortc( of w[*] );
    _word_ = peekclong( addrlong( w[101 - length(word)] ) , length( word ) );

    * parse substring from container for comparison;
    array c[100] $ 1 _temporary_;
    * the + 1 makes sure the last possible window is checked as well;
    do i = 1 to length( container ) - length( word ) + 1;
        call pokelong( substr( container , i , length( word ) ) || repeat( '00'x , 99 - length( word ) )
                       , addrlong( c[1] )
                       , 100
          );
        call sortc( of c[*] );
        _compare_ = peekclong( addrlong( c[101 - length(word)] ) , length( word ) );
        if _word_ ne _compare_ then continue;
        else output;
        leave;
    end;
run;


Respected Advisor
Posts: 4,919

Re: Anagrams Within Words

Nice to get a new challenge from you Matt!

data want;
input str :$10. source &:$50.;
array l{10} $ l1-l10;
do i = 1 to dim(l);
    l{i} = char(str, i);
end;
do i = 1 by 1;
    rc = lexperm(i, of l{*});
    if rc < 0 then leave;
    if index(source, cats(of l{*})) > 0 then leave;
end;
found = rc >= 0;
keep str source found;
datalines;
cat actor
car actor
matt owns a tamtam
;

proc print data=want noobs; run;

PG

Respected Advisor
Posts: 3,156

Re: Anagrams Within Words

I echo PG. Matt should post more of this cool stuff. Here is another approach using hash objects, which is more dynamic in terms of array dimensions:

data anagram;
input ( word container ) ( : $100. );
cards;
cat     actor
dinner  thundering
cab     actor
num     immunoglobulin
;
run;

data want;
  if _n_=1 then do;
    declare hash h1(ordered:'y', multidata:'y');
    h1.definekey('letter');
    h1.definedone();
    declare hash h2(ordered:'y', multidata:'y');
    h2.definekey('letter');
    h2.definedone();
  end;

  set anagram;

  _rc=h1.clear();
  do _i=1 to lengthn(word);
    letter=substr(word,_i,1);
    _rc=h1.add();
  end;

  do _j=0 to lengthn(container)-lengthn(word);
    _rc=h2.clear();
    do _i=1 to lengthn(word);
      letter=substr(container,_j+_i,1);
      _rc=h2.add();
    end;
    _rc=h1.equals(hash: 'h2', result: _eq);
    if _eq then do; output; return; end;
  end;

  drop _: letter;
run;

Haikuo


Respected Advisor
Posts: 4,919

Re: Anagrams Within Words

In case anybody wondered if pattern matching could be helpful to do this (I did):

data want;
input str :$10. source &:$50.;
array s{10} $1 _temporary_;
length ss tt $10;
do i = 1 to dim(s);
    s{i} = char(str,i);
end;
call sortc(of s{*});
ss = cats(of s{*});
rid = prxparse(cats("/[",str,"]{",length(str),"}/"));
start=1;
call prxnext(rid, start, -1, source, pos, len);
do while (pos > 0);
    tt = substr(source, pos, len);
    do i = 1 to dim(s);
        s{i} = char(tt,i);
    end;
    call sortc(of s{*});
    if cats(of s{*}) = ss then leave;
    start = pos + 1;
    call prxnext(rid, start, -1, source, pos, len);
end;
found = pos > 0;
keep str source found;
datalines;
cat        actor
car        actor
dinner  thundering
num      immunoglobulin
matt      owns a tamtam
;

proc print data=want noobs; run;

PG

Trusted Advisor
Posts: 1,301

Re: Anagrams Within Words

/* Input Data */
data ana;
input (base comp) (:$32.);
cards;
cat    actor
car    actor
dinner thundering
num    immunoglobulin
;
run;

/* Let's compile and store a package for in-word anagram search */
proc ds2;
package work.anagram /sas_encrypt=yes overwrite=yes;
    declare double ana;

    method init( nchar(32) base , nchar(32) comp );
        declare double _x _y _z;
        declare nchar(32) _comp;
        this.ana=0;
        do _x = 1 to length( comp )-length( base )+1.0 until( this.ana );
            _comp = substr( comp , _x , length( base ) );
            do _y = 1 to length( base );
                _z = index( _comp , substr( base , _y , 1.0 ) );
                if _z then do; this.ana+1; substr( _comp , _z , 1.0 ) = ''; end;
            end;
            this.ana=(this.ana=length( base ));
        end;
    end;

    method check() returns double;
        return this.ana;
    end;
endpackage;
run;
quit;

/* Run our data using the anagram package */
proc ds2;
data foo (overwrite=yes);
    declare double ana;

    method run();
        declare package work.anagram anagram();
        set ana;
        anagram.init( base , comp );
        ana = anagram.check();
    end;
enddata;
run;
quit;
