02-09-2017 12:07 PM
There is a credit history column that gives information about the customer. I want to look at records whose credit is described by certain key terms like
poor, bad, low, delinquent, in their various forms. A LIKE operator doesn't serve the purpose, because the representative sometimes shortens the words. Can this be done with regex, so that as I look through the cases I can add new words to the search?
02-09-2017 01:31 PM
/* Search terms, one word each */
data terms;
  input term : $8. @@;
  cards;
poor bad low negative neg lw
;
run;

/* Number of terms, used to size the array below */
%let terms=&sysnobs;
%put &=terms;

/* One row holding term1-termN */
proc transpose data=terms out=terms_array(drop=_name_) prefix=term;
  var term;
run;

data have;
  str='Lady shows poor judgement. Has good credit.'; output;
  str='Good Credit'; output;
  str='This guy has terrible credit!'; output;
  str='Poor credit'; output;
run;

/* Flag each record in which any single word matches a term */
data want;
  set terms_array;
  array term[&terms];
  drop term:;
  do until (done);
    set have end=done;
    do _n_=1 to countw(str);
      flag=(whichc(lowcase(scan(str,_n_)),of term[*])>0);
      if flag>0 then leave;
    end;
    output;
  end;
  stop;
run;
02-15-2017 02:02 PM
What if there are phrases or compound words like "End of period" or "Introductory APR"?
These would be passed as individual words and not as the whole phrase that I'm interested in.
02-15-2017 05:25 PM
Maybe something like the code below could do the job for you.
/* Search terms and phrases, one per line */
data search_terms;
  infile datalines truncover;
  input search_term $100.;
  search_term=lowcase(search_term);
  datalines;
End of period
Introductory APR
Good
Poor
;
run;

data have;
  infile datalines truncover;
  input sentence $200.;
  datalines;
Lady shows poor judgement. Has good credit.
Good Credit
This guy has terrible credit!
Poor credit
blah end of period blah
;
run;

/* For each sentence, keep the first term found (case-insensitive); */
/* search_term is left missing when nothing matches */
data want;
  set have end=last;
  do _i=1 to nobs;
    set search_terms nobs=nobs point=_i;
    if find(sentence,strip(search_term),'it') then leave;
    call missing(search_term);
  end;
run;
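Since the original question asked about regex, here is a minimal sketch of the same lookup done with PRXPARSE/PRXMATCH. It is only an illustration under some assumptions: the dataset names rx_terms, rx_have and rx_want and the macro variable rx_list are made up for this example, each term is treated as a regex fragment (so it must not contain unintended metacharacters), and the combined pattern has to fit in a macro variable. The \b word boundaries keep "low" from matching inside "follow", which a plain FIND would not do.

```sas
/* Hypothetical term list; add new spellings here as they turn up. */
/* Each line is a regex fragment, e.g. delinq\w* covers delinquent. */
data rx_terms;
  infile datalines truncover;
  input term $100.;
  datalines;
poor
bad
low
lw
delinq\w*
end of period
;
run;

/* Join the terms into one alternation: poor|bad|low|... */
proc sql noprint;
  select strip(term) into :rx_list separated by '|'
  from rx_terms;
quit;

/* Made-up sample text for the sketch */
data rx_have;
  infile datalines truncover;
  input sentence $200.;
  datalines;
Customer is delinquent on two accounts
Good credit, follow up next year
Balance due at end of period
;
run;

data rx_want;
  set rx_have;
  retain rx;
  /* Compile the pattern once; the trailing i makes it case-insensitive */
  if _n_ = 1 then rx = prxparse("/\b(&rx_list)\b/i");
  flag = (prxmatch(rx, sentence) > 0);
  drop rx;
run;
```

With this sample data the first and third sentences should be flagged and the second should not. Adding a new abbreviation is then just one more line in rx_terms.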
02-09-2017 02:19 PM
I might be oversimplifying. But if you look at the data and find that each value starts with a distinct letter that can be associated with a status, we can map it and achieve what you are looking for. As I say this, though, I have no idea what your data looks like.
data test;
  input id status $;
  datalines;
12 Poor
13 bad
15 Del
16 bd
19 lw
;
run;

/* Map each status to a category by its first letter */
proc sql;
  select id,
         case
           when upcase(substr(status,1,1)) = 'P' then 'POOR'
           when upcase(substr(status,1,1)) = 'B' then 'BAD'
           when upcase(substr(status,1,1)) = 'G' then 'GOOD'
           when upcase(substr(status,1,1)) = 'D' then 'DELINQUENT'
           when upcase(substr(status,1,1)) = 'L' then 'LOW'
           else 'OTHER'
         end as status1
  from test;
quit;
02-14-2017 10:19 AM
It is not a simple search on one-word values. It is a text column with 5000-plus characters, and I need to identify whether any of the expected words are present in the text.
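For a long free-text column like this, it may also help to see which word triggered the match while reviewing cases. The sketch below is one possible way to do that with CALL PRXSUBSTR. It rests on assumptions: the column name credit_hist, the dataset names long_text and hits, and the hard-coded term list are all invented for the illustration, not taken from the real data.

```sas
/* Made-up sample rows with a long text column */
data long_text;
  length credit_hist $5000;
  credit_hist = 'Acct opened 2012. Rep notes: pmt history lw, follow up.'; output;
  credit_hist = 'No issues reported.'; output;
run;

data hits;
  set long_text;
  length matched $50;
  retain rx;
  /* Hypothetical term list; \b keeps low from matching inside follow */
  if _n_ = 1 then rx = prxparse('/\b(poor|bad|low|lw|delinq\w*)\b/i');
  /* pos and len receive the position and length of the first match */
  call prxsubstr(rx, credit_hist, pos, len);
  if pos > 0 then matched = substr(credit_hist, pos, len);
  drop rx;
run;
```

Here matched holds the first matching term found in each record and stays blank when no term is present, so unmatched records can be scanned by eye for new abbreviations to add.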
02-15-2017 06:27 PM