About FriedEgg

FriedEgg · ‎10-28-2011

Here is what I have at this point. It is not exactly what Peter Norvig does, but it is where I am so far.

    * Based on : http://norvig.com/spell-correct.html;
    %let word=speling;
    filename big '/temp/big.txt'; * http://norvig.com/big.txt;
    filename words '/usr/share/dict/words'; * unix default dictionary (provided in my other word puzzle related posts);

    data words;
      infile words truncover;
      input word $upcase48.;
    run;

    data big;
      length word $48;
      infile big lrecl=1024 truncover;
      input @;
      _infile_=compbl(prxchange('s/[^A-Z]/ /i',-1,_infile_));
      if _infile_ ne '' then do i=1 to countw(_infile_,' ');
        word=upcase(scan(_infile_,i,' '));
        if word ne '' then output;
      end;
      drop i;
    run;

    data words;
      set words big;
      word=strip(word);
    run;

    proc freq data=words;
      tables word /list out=wfreq(drop=percent) noprint;
    run;

    %macro wf_find;
      *to avoid repeating this code block below for each correction type;
      if wf.find()=0 then do;
        clev=complev(orig_word,word);
        if clev<=2 then output;
      end;
    %mend;

    data corrections;
      length word a b c $48;
      orig_word=upcase("&word");
      alphabet='ABCDEFGHIJKLMNOPQRSTUVWXYZ';
      if 0 then set wfreq;
      declare hash wf(hashexp:10,dataset:'wfreq');
      declare hiter wfi('wf');
      wf.definekey('word');
      wf.definedata(all:'Y');
      wf.definedone();

      *replaces;
      do i=1 to length(orig_word);
        do ii=1 to 26;
          word=orig_word;
          substr(word,i,1)=substr(alphabet,ii,1);
          %wf_find
        end;
      end;

      *deletes;
      do i=1 to length(orig_word);
        word=orig_word;
        substr(word,i,1)='';
        word=compress(word);
        %wf_find
      end;

      *transposes;
      do i=1 to length(orig_word)-1;
        word=orig_word;
        a=substr(word,i,1);
        b=substr(word,i+1,1);
        substr(word,i,1)=b;
        substr(word,i+1,1)=a;
        %wf_find
      end;

      *inserts;
      do i=0 to length(orig_word);
        word=orig_word;
        a=subpad(word,1,i);
        b=subpad(word,i+1,length(word)-i);
        do ii=1 to 26;
          c=substr(alphabet,ii,1);
          word=cats(of a c b);
          %wf_find
        end;
      end;

      *brute - find all words in 'dictionary' that have an edit distance of <= 2, this step should not be necessary because previous method should find all instances, however this is just to be sure;
      do while(wfi.next()=0);
        clev=complev(orig_word,word);
        if clev<=2 then output;
      end;

      keep orig_word word count clev;
      stop;
    run;

    proc sql;
      select distinct 'Did you mean: ' || strip(word)
      from corrections
      where clev=( select min(clev) from corrections )
        and count=( select max(count) from corrections
                    where clev=( select min(clev) from corrections ));
    quit;
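For comparison, the four candidate generators above (replaces, deletes, transposes, inserts) can be sketched in Python, in the spirit of the linked Norvig article. This is only an illustration, not a port of the SAS step: the names `edits1`, `suggest` and the toy `WORD_COUNTS` table are made up here, and the real frequencies would come from the `wfreq` dataset.

```python
import string

ALPHABET = string.ascii_uppercase

# Hypothetical toy frequency table standing in for the wfreq dataset.
WORD_COUNTS = {"SPELLING": 5, "SPEWING": 2, "SPELL": 9}


def edits1(word):
    """Return every string exactly one edit operation away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in ALPHABET]
    inserts = [a + c + b for a, b in splits for c in ALPHABET]
    return set(deletes + transposes + replaces + inserts)


def suggest(word, counts):
    """Pick the most frequent known word among the one-edit candidates."""
    word = word.upper()
    if word in counts:
        return word  # already a known word, edit distance 0
    known = [w for w in edits1(word) if w in counts]
    if not known:
        return None
    return max(known, key=counts.get)


print("Did you mean:", suggest("speling", WORD_COUNTS))
```

Like the SAS step before its brute-force loop, this sketch only looks one edit away, so a word such as SPELL (more than one edit from SPELING) is never offered even though it is more frequent in the toy table.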

FriedEgg · ‎10-27-2011

    data foo;
      infile cards dsd dlm='~';
      input (city_state type_combined) (:$256.) population;
      type=prxchange('s/_num$|_den$//i',-1,strip(type_combined));
      if not(prxmatch('/_pop$/i',strip(type))) then type=strip(type)||'_Pop';
    cards;
    Collin County, TX~Hispanic_Num~100
    Collin County, TX~Hispanic_Den~1000
    Collin County, TX~Uninsured_Pop_Num~500
    Collin County, TX~Uninsured_Pop_Den~15000
    Plano, TX~Hispanic_Num~200
    Plano, TX~Hispanic_Den~10000
    ;
    run;

    data bar;
      set foo;
      by city_state type;
      retain type numerator denominator;
      if upcase(scan(type_combined,-1,'_'))='NUM' then numerator=population;
      else denominator=population;
      if last.type then output;
      drop type_combined population;
    run;
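The same reshaping idea (strip the _Num/_Den suffix, normalise the measure name to end in _Pop, then collect numerator and denominator on one record) can be sketched in Python. This is an illustrative equivalent only; the function name `combine` and the dictionary layout are assumptions made for the example, not part of the SAS code.

```python
import re

rows = [
    ("Collin County, TX", "Hispanic_Num", 100),
    ("Collin County, TX", "Hispanic_Den", 1000),
    ("Collin County, TX", "Uninsured_Pop_Num", 500),
    ("Collin County, TX", "Uninsured_Pop_Den", 15000),
    ("Plano, TX", "Hispanic_Num", 200),
    ("Plano, TX", "Hispanic_Den", 10000),
]


def combine(records):
    """Group numerator and denominator values under (city_state, type)."""
    result = {}
    for city_state, type_combined, population in records:
        name = type_combined.strip()
        # Drop the trailing _Num or _Den, ignoring case.
        measure = re.sub(r"_(num|den)$", "", name, flags=re.IGNORECASE)
        # Make sure the measure name ends in _Pop.
        if not re.search(r"_pop$", measure, flags=re.IGNORECASE):
            measure += "_Pop"
        part = "numerator" if name.upper().endswith("_NUM") else "denominator"
        result.setdefault((city_state, measure), {})[part] = population
    return result


combined = combine(rows)
for key, value in combined.items():
    print(key, value)
```

Unlike the BY-group version, this sketch does not need the input to be sorted, because it keys on the (city_state, type) pair directly.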

FriedEgg · ‎10-27-2011

I still feel that a more standard technology for cross-system virtualization is your proper solution, such as VNC, which is available for all flavors of *nix and is relatively problem free.

FriedEgg · ‎10-27-2011

    data foo;
      infile '/temp/foo.txt' truncover length=len;
      input @;
      if prxmatch('/^1/',_infile_)>0 then row=_infile_;
      else delete;
    run;

FriedEgg · ‎10-26-2011

In order to generate a key in SAS that uses sha256 or sha1 you need to license SAS/Secure software. There are ways around this by using other programs.

The link you provided is for the FPS (Flexible Payments Service), which is not what you are actually trying to use, based on the code above. You are trying to interact with the AWSECommerceService. This is the proper documentation for the API you are attempting to interact with:
http://docs.amazonwebservices.com/AWSECommerceService/latest/DG/index.html?RequestAuthenticationArticle.html

You should make sure you are using the proper secret keys first:
http://docs.amazonwebservices.com/AWSECommerceService/latest/DG/index.html?ViewingCredentials.html

The process you are missing is outlined here:
http://docs.amazonwebservices.com/AWSECommerceService/latest/DG/index.html?rest-signature.html

You need to outsource this key generation to a different technology, for instance, perl...

Download perl here: http://www.activestate.com/activeperl/downloads

Now you need to install the Digest::SHA::PurePerl module. (There is another version that is faster but requires C/C++ compilers; this one should present you with fewer problems, unless you know what you are doing.) In a command prompt window you want to type:

    ppm install 'Digest::SHA::PurePerl'

If you encounter problems try this site: http://www.activestate.com/blog/2010/10/how-install-cpan-modules-activeperl

Now that you have perl and the digest module installed you will want to create a perl script to call from SAS that will return the key you want. It should look something like this (note everything below is written quickly and not tested, at all):

    #!/usr/bin/perl
    use FileHandle;
    # read the whole string-to-sign file named in the first argument
    open INFILE, '<', $ARGV[0] or die "error opening $ARGV[0]: $!";
    my $data = do { local $/; <INFILE> };
    use Digest::SHA::PurePerl qw(hmac_sha256_base64);
    # the second argument is the secret key
    $digest=hmac_sha256_base64($data, $ARGV[1]);
    print $digest;
    close INFILE;

From SAS you would want to call this:

    data _null_;
      file 'C:\mytempfile' lrecl=5000 termstr=cr;
      put 'GET';
      put 'ecs.amazonaws.com';
      put '/onca/xml';
      put 'AWSAccessKeyId=00000000000000000000&ItemId=0679722769&Operation=ItemLookup&ResponseGroup=ItemAttributes%2COffers%2CImages%2CReviews&Service=AWSECommerceService&Timestamp=20090101T12%3A00%3A00Z&Version=2009-01-06';
    run;

    filename key pipe 'C:\path\to\perlscript.pl C:\mytempfile myaws_secretkey';

    data foo;
      infile key;
      input key $;
    run;

Now you have what you need to build the full url including the HMAC-SHA Signature... Of course a full and proper implementation of this will be much more complex.
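As a cross-check on the signing step itself, here is a minimal Python sketch of the same HMAC-SHA256 plus base64 computation using only the standard library. It is an illustration under assumptions, not a tested client: the function name `sign` is made up, the key and query string are the placeholders from the post above, and the string to sign is assumed to be the four parts joined with newline characters as the rest-signature page describes. Note that Python's base64 output includes the trailing `=` padding, which Digest::SHA's base64 functions leave off.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign(string_to_sign, secret_key):
    """Return the base64-encoded HMAC-SHA256 of string_to_sign."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values copied from the example in the post.
query = (
    "AWSAccessKeyId=00000000000000000000&ItemId=0679722769"
    "&Operation=ItemLookup"
    "&ResponseGroup=ItemAttributes%2COffers%2CImages%2CReviews"
    "&Service=AWSECommerceService"
    "&Timestamp=20090101T12%3A00%3A00Z&Version=2009-01-06"
)
string_to_sign = "\n".join(["GET", "ecs.amazonaws.com", "/onca/xml", query])

signature = sign(string_to_sign, "myaws_secretkey")

# The signature must itself be URL-encoded before it is appended.
signed_url = (
    "http://ecs.amazonaws.com/onca/xml?" + query
    + "&Signature=" + quote(signature, safe="")
)
print(signed_url)
```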

FriedEgg · ‎10-26-2011

What are you trying to make this call from? What SAS software do you license? What OS is this running on?

FriedEgg · ‎10-26-2011

Yes, the bestw.d informat is a direct alias of the w.d informat, an important distinction... Outputs from the formats w.d, bestw.d, and bestdw.p for integer values should be virtually identical until the integer values reach extremes and the formats start converting to scientific notation at different points, and sometimes with different precision than each other. The output of bestw.d and bestdw.p (when the same w is used, with the default p value) should be identical for all integers.

FriedEgg · ‎10-26-2011

Art,

This is a totally fine directory naming convention. Give this a try...

    data _null_;
      call execute('filename foo pipe "C:\Program Files\sysinternals.com\handle.exe" -u "' || "%sysfunc(pathname(work))" || '";');
    run;

I do not use Windows so I cannot test this, but I have used something similar previously when I did use Windows.

FriedEgg · ‎10-25-2011

In this specific case these act as substitutes; however, there are definitely significant differences between the bestw. and w.d formats.

FriedEgg · ‎10-25-2011

Here is another way to do this. It will only work on *nix machines, or under Windows with cygwin, but per my testing it is a slightly faster method.

    filename cntr04p pipe "wc -l /usr/share/dict/words";

    data _null_;
      infile cntr04p;
      input cntr04p filename $;
      call symputx('cntr04p',cntr04p);
    run;

FriedEgg · ‎10-25-2011

If the recCount number being kept in your dataset is not important, then you could instead use the automatic record number counter variable _n_:

    data x;
      infile prange_t end=last;
      input;
      if last then call symputx('cntr04p',put(_n_,best.));
    run;

    %put &cntr04p;

FriedEgg · ‎10-25-2011

The method I am currently trying uses compged, because I am looking only for words with a maximum edit distance of 2 (according to the article).

FriedEgg · ‎10-25-2011

I will explain the expression '/^[A-Z0-9]\S.+$/':

    ^[A-Z0-9]   the start of the string is a letter A-Z or a digit 0-9
    \S          the second character in the string is not a whitespace character
    .+$         any character, 1 or more times, until the end of the string

Because we are not stripping the padding from the inbound variable, this will match any string of more than 1 character, even though it seems to imply an expected length of three or more and not two or more.

In order to accept these cases with embedded spaces I would modify the expression to the following:

    prxmatch('/^[A-Z0-9].{1,}$/',strip(id))

    ^[A-Z0-9]   the start of the string is a letter A-Z or a digit 0-9
    .{1,}$      followed by any character (except new line) 1 or more times

Without the strip function on the inbound variable id, the strings will be padded and will match for length even though they may not be more than 1 character long.

    data foo(where=(prxmatch('/^[A-Z0-9].{1,}$/',strip(id))));
      infile cards truncover;
      input @1 id $10.;
    cards;
    1234-A35
    1234567-d45
    e453768
    1
    A123
    A 123
    A 123
    *123456789
    *12345
    ;
    run;

Will output:

    1234-A35
    1234567-d4
    A123
    A 123
    A 123

As before, making the expression case insensitive, '/^[A-Z0-9].{1,}$/i', will capture the additional id e453768.

This expression is also the same as /^[A-Z0-9].+$/

    +       means repeat the previous element 1 or more times
    {n}     means repeat exactly n times
    {n,m}   means repeat n to m times
    {n,}    means repeat n or more times

so + and {1,} are the same...
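The points above (the effect of \S, the equivalence of + and {1,}, the case-insensitive flag, and what padding does to the match) can be checked quickly outside SAS. The following is a small Python sketch using the `re` module, whose syntax agrees with the Perl-style expressions for these patterns; the helper name `matches` and the use of `ljust(10)` to imitate a $10. padded variable are assumptions made for the illustration.

```python
import re

ids = ["1234-A35", "1234567-d4", "e453768", "1", "A123", "A 123",
       "*123456789", "*12345"]


def matches(pattern, values, flags=0):
    """Return the values that the pattern matches, in input order."""
    rx = re.compile(pattern, flags)
    return [v for v in values if rx.search(v)]


# Original expression: the second character must be non-whitespace.
strict = matches(r"^[A-Z0-9]\S.+$", ids)

# Relaxed expression, written with {1,} and with +.
relaxed = matches(r"^[A-Z0-9].{1,}$", ids)
plus_form = matches(r"^[A-Z0-9].+$", ids)

# Case-insensitive variant, like the trailing /i in the SAS call.
ignore_case = matches(r"^[A-Z0-9].{1,}$", ids, re.IGNORECASE)

# Imitate unstripped $10. values: trailing blanks count as characters.
padded = matches(r"^[A-Z0-9].{1,}$", [v.ljust(10) for v in ids])

print(strict)
print(relaxed)
print(ignore_case)
print(padded)
```

The padded run is the one to look at: the single-character id "1" is accepted there only because its trailing blanks satisfy `.{1,}`, which is why the strip call matters.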

FriedEgg · ‎10-24-2011

Here is a solution with regex...

    data foo;
      infile cards truncover;
      input @1 id $10.;
    cards;
    1234-A35
    1234567-d45
    e453768
    1
    A123
    *123456789
    *12345
    ;
    run;

    data bar;
      set foo;
      if prxmatch('/^[A-Z0-9]\S.+$/',id)>0;
    run;

Will output:

    1234-A35
    1234567-d4
    A123

To add the additional case of e453768, change the expression to '/^[A-Z0-9]\S.+$/i'

FriedEgg · ‎10-24-2011

If all you are doing here is gathering the hostname, why not just use the automatic macro variable syshostname?
