Create a program that takes two words as input, for example:
play table
hair ball
The program should find linking words between the two given words:
play/table (playtimes, timestable)
hair/ball (hairpin, pinball)
On Unix, use the dictionary file ( /usr/dict/words or /usr/share/dict/words ) as your seed data.
I have a solution. I wouldn't call it efficient, but the code is pretty simple. I will share it later so that everyone gets a blank starting point.
EDIT: I have attached a gunzipped version of the dictionary file to this thread for the non-unix folks...
I think it would be easy using a hash table.
But without a word dictionary such as /usr/dict/words or /usr/share/dict/words, it is hard to code anything.
Ksharp
Ksharp,
SAS has a built in dictionary. You can find it at sashelp.base.master.dictnary
Really?
How do I use it? Is it a table or view or a file?
Ksharp
I attached a copy of the dictionary file to my original posting
OK. It is very interesting.
Thanks to FriedEgg for providing the dictionary. I have saved it; maybe it will be useful in the future.
data dictionary;
  infile 'c:\unix-words';
  input words : $100.;
run;

%let first=play;
%let last=table;

data _null_;
  length key _key word1 word2 $ 100;
  declare hash ha(hashexp : 20,dataset : 'work.dictionary(rename=(words=key))');
  declare hiter hi('ha');
  ha.definekey('key');
  ha.definedata('key');
  ha.definedone();
  rc=hi.first();
  do while(rc=0);
    _key=key;
    word1=cats("&first",_key); key=word1; r1=ha.check();
    word2=cats(_key,"&last"); key=word2; r2=ha.check();
    if r1=0 and r2=0 then do;
      put 'Found:' word1 word2;
      found=1;
    end;
    rc=hi.next();
  end;
  if not found then put 'Search over. Not Found.';
  stop;
run;
Ksharp
Thanks, Ksharp, for the solution utilizing the hash object. It works well. Here is another solution:
%let v1=play;
%let v2=table;
data words(keep=word) word1(keep=word link) word2(keep=word link);
  infile '/usr/share/dict/words' truncover;
  input word : $45.;
  output words;
  array v[2] $ 45 _temporary_ ("&v1" "&v2");
  do i=1 to dim(v);
    if index(word,trim(v[i]))>0 then
    do;
      link=tranwrd(word,trim(v[i]),'');
      if count(trim(link),' ')<2 then
      do;
        array l[2] $ 45 _temporary_;
        do j=1 to dim(l);
          /* assumed: split the remainder into its blank-separated pieces */
          l[j]=scan(link,j,' ');
        end;
        if length(l[1])>length(l[2]) then link=l[1]; else link=l[2];
        if i=1 then output word1; else output word2;
      end;
      else delete;
    end;
  end;
run;
proc sql;
  create table want as
  select a.link ,a.word as word1 ,b.word as word2
  from ( select strip(link) as link ,word
         from word1
         where link in ( select word from words ) ) a,
       ( select strip(link) as link ,word
         from word2
         where link in ( select word from words ) ) b
  where a.link=b.link;
quit;
My example will not produce the same results. My original sample output made it seem as though the words should follow the pattern a/b -> ac, cb, but that was unintentional, and my solution above allows more linkages to be found. The hash solution can be modified to produce the same results; I will do it later if I find the time.
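For anyone who wants to try that modification in the meantime, here is a minimal sketch of one way the hash approach might be loosened. It is not the code of either poster above. It assumes the dictionary is at /usr/share/dict/words, and it only tries the link word attached before or after each input word, so it will likely not match the SQL output exactly (the SQL version also picks up words where the input word sits in the middle). The variable names are made up for illustration.

```sas
%let first=play;
%let last=table;

/* Load the word list; the path is an assumption, adjust as needed. */
data dictionary;
  infile '/usr/share/dict/words' truncover;
  input words : $100.;
run;

data _null_;
  length key word1 word2 $ 100;
  declare hash ha(hashexp: 20, dataset: 'work.dictionary(rename=(words=key))');
  declare hiter hi('ha');
  ha.definekey('key');
  ha.definedata('key');
  ha.definedone();
  call missing(key, word1, word2);

  found = 0;
  rc = hi.first();
  do while (rc = 0);
    /* key holds the candidate link word for this pass */
    do i = 1 to 2;
      /* try the link after the first word, then before it */
      if i = 1 then word1 = cats("&first", key);
      else word1 = cats(key, "&first");
      if ha.check(key: word1) = 0 then do j = 1 to 2;
        /* same two placements for the second word */
        if j = 1 then word2 = cats("&last", key);
        else word2 = cats(key, "&last");
        if ha.check(key: word2) = 0 then do;
          put 'Found: ' key= word1= word2=;
          found = 1;
        end;
      end;
    end;
    rc = hi.next();
  end;
  if not found then put 'Search over. Not Found.';
  stop;
run;
```

Passing the lookup value with check(key: ...) leaves the iterator's own key variable untouched, so there is no need to save and restore it inside the loop.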