deleted_user
Not applicable
Hi All,

I have two columns of data with very noisy information in the second column.

It looks like this:

Name title
A ceo&chair
A Chief executive officer
A former ceo
B vp
B senior vp
B executive vp
.........................................
Z cfo
Z chief finance officer
Z chairman

I know I can use IF-THEN to create dummy variables, but I also need something like "contains" to classify the data. From what I have read, "contains" can be used only in a WHERE statement.

My classification would be something like this:
if title contains ' ceo' or title contains 'chief executive' then _title=1;
if title contains 'cfo' or title contains 'chief finance' then _title=2;
if title contains 'vice president' or title contains 'vp' then _title=3;
if title contains 'chairman' or title contains 'chrm' then _title=4;
and the rest will be categorized as _title=others;

Based on my example, the outcome would look like this:

Name title _title
A ceo&chair 1
A Chief executive officer 1
A former ceo 1
B vp 3
B senior vp 3
B executive vp 3
...........................................................
Z cfo 2
Z chief finance officer 2
Z chairman 4

So my question is: how can I create dummies while applying a "contains"-style test at the same time?

Thanks very much for sharing your views!
3 REPLIES
Rambo
Calcite | Level 5
I think you can use the FIND function here.

You could change your conditional logic to something like:

if find(lowcase(title),'ceo') > 0 or find(lowcase(title),'chief executive') > 0 then _title=1;
else if find(lowcase(title),'cfo') > 0 or find(lowcase(title),'chief finance') > 0 then _title=2;


SAS Documentation: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002267763.htm
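
To round out the idea above, here is a minimal sketch of the full IF / ELSE IF chain. It uses the 'i' (ignore case) modifier of FIND in place of LOWCASE. The data set names HAVE and WANT, the small sample, and the choice of 5 as the code for "others" are assumptions made for illustration.

```sas
/* Small sample in the same layout as the question (placeholder data set HAVE) */
data have;
  input @1 name $1. @3 title $50.;
  datalines;
A ceo&chair
A Chief executive officer
B senior vp
Z chief finance officer
Z chairman
Z janitor
;
run;

/* FIND returns the position of the substring, or 0 when it is absent. */
/* The third argument 'i' makes the search case-insensitive.           */
data want;
  set have;
  if      find(title, 'ceo', 'i')      or find(title, 'chief executive', 'i') then _title = 1;
  else if find(title, 'cfo', 'i')      or find(title, 'chief finance', 'i')   then _title = 2;
  else if find(title, 'vp', 'i')       or find(title, 'vice president', 'i')  then _title = 3;
  else if find(title, 'chairman', 'i') or find(title, 'chrm', 'i')            then _title = 4;
  else _title = 5;  /* assumed code for "others" */
run;

proc print data=want;
run;
```

Because the branches are chained with ELSE, each row receives the code of the first group it matches, so the order of the tests acts as a priority order.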
deleted_user
Not applicable
I will try the FIND function. I also just figured out that I can probably use ISNUMBER in Excel.
MikeZdeb
Rhodochrosite | Level 12
Hi, building on the suggestion to use FIND, you can also try this:

data x;
input
@1 group $1.
@3 title $50.
;
datalines;
A secretary
A ceo&chair
A Chief executive officer
A former ceo
B vp
B senior vp
B executive vp
C CEO
D VP
Z cfo
Z chief finance officer
Z chairman
Z janitor
;
run;

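data xplus;
set x;
/* compare in lower case so that CEO and ceo both match */
t = lowcase(title);
/* each (... or ...) evaluates to 1 or 0, so a title matching one group gets that group's code */
_title = (find(t,'ceo') or find(t,'chief executive')) * 1 +
(find(t,'cfo') or find(t,'chief finance')) * 2 +
(find(t,'vp') or find(t,'vice president')) * 3 +
(find(t,'chrm') or find(t,'chairman')) * 4;
/* no match leaves 0, which is recoded to 5 (others) */
_title = ifn(_title eq 0,_title+5,_title);
drop t;
run;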
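
One thing to watch with the additive expression above: a title that matches two groups gets the sum of their codes. A hypothetical title such as "chairman and ceo" would score 1 + 4 = 5, the same value as "others". If that matters for your data, a first-match-wins chain avoids it. The sketch below uses PRXMATCH with a case-insensitive pattern per group and reads the data set X created above. The output name XPRIO and the priority order (CEO, then CFO, then VP, then chairman) are assumptions.

```sas
/* First-match-wins classification: the order of the tests sets the priority. */
data xprio;                      /* XPRIO is a placeholder name */
  set x;                         /* X is the sample data set created earlier in this post */
  /* PRXMATCH returns the position of the first match, or 0 when there is none. */
  /* The trailing i in each pattern makes the match case-insensitive.           */
  if      prxmatch('/ceo|chief executive/i', title) then _title = 1;
  else if prxmatch('/cfo|chief finance/i',   title) then _title = 2;
  else if prxmatch('/vp|vice president/i',   title) then _title = 3;
  else if prxmatch('/chrm|chairman/i',       title) then _title = 4;
  else _title = 5;               /* others */
run;

/* Quick check of how many rows landed in each group */
proc freq data=xprio;
  tables _title;
run;
```

For titles that match only one group, this should give the same codes as the arithmetic version. The two differ only on titles that match more than one group.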
