Solved
New Contributor
Posts: 4

need help creating a binary variable from two categorical variables with if-else and wildcards

[ Edited ]

Hi all,

I'm trying to create a binary (dummy) variable with information from two categorical variables, but I'm having trouble. I need to use wildcards because var1 has multiple values that start with the same letters, each of which matches the same value of var2. If I need to, I will type them all out completely, but I feel sure there is a way to do this (and the code for this one bit is already over 100 lines).

``````data sample;
format var1 $11. var2 $4.;
input var1 $ var2 $;
datalines;
aa1 aaa
aa2 aaa
aa bbb
abdfgdfh bbb
abdfttyrty bbb
abwww bbb
abc aaa
abd aaa
;
data sample;
set sample;
if (var1 = aa1 or aa2) and (var2 = aaa) then match=1;
else if (var1 = ab:) and (var2 = bbb) then match=1;
else if (var1 = ad:) and (var2 = ddd or dd) then match=1;
else match=0;
run;``````

which gives me

ERROR 388-185: Expecting an arithmetic operator.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

I've also tried it like this

``````if var1= 'aa1' or 'aa2' and var2 = 'aaa' then match=1;
else if var1 =: 'ab%' and var2 ='bbb' then match=1;
else if var1 =: 'ad%' and (var2 ='ddd' or 'dd') then match=1;``````

which gives me NOTE: Invalid numeric data, but it does create the variable, with a 1 for only the first observation and zeros for all the rest.

and like this

``````if (var1= aa1 or aa2) and var2 = aaa then match=1;
else if var1 =: ab% and var2 =bbb then match=1;
else if var1 =: ad% and (var2 =ddd or dd) then match=1;``````

which gives me

ERROR 388-185: Expecting an arithmetic operator.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

Anyone have a clue as to my mistake?

I'm using SAS 9.2 on Windows 7.

``````if var1 in ("aa1" "aa2") and var2 = "aaa" then match=1;
else if var1 =: "ab" and var2 ="bbb" then match=1;
else if var1 =: "ad" and var2 in ("ddd" "dd") then match=1;``````

Accepted Solutions
Solution
‎02-11-2016 01:54 PM
Super User
Posts: 13,583

Re: need help creating a binary variable from two categorical variables with if-else and wildcards

First issue:

if (var1 = aa1 or aa2) is using aa1 and aa2 as VARIABLES, which do not exist.

Second, you want to reference the VALUES of concern. Text literals require quotes to tell SAS you are looking for specific strings.

If you want to see if var1 has a value of either aa1 or aa2 here are two ways:

if (var1 = 'aa1' or var1='aa2')

or

if var1 in ('aa1' 'aa2').

The value vs variable has to be addressed in all of your code.

When you use the ab: construct, it is looking for VARIABLES whose names start with ab, not values. You would use var1 =: 'ab' to look for strings starting with 'ab'.

In future posts about errors, please post the log, including the procedure or data step code. The log shows which line triggered the error and often points to the specific cause, which cannot be determined from the error message alone.
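
Putting those points together, here is a minimal sketch of the whole thing using the sample data from the question. The LENGTH statement and the output data set name flagged are my own choices for illustration, not something from the original post. As a side note, I believe the second attempt flagged only the first observation because AND binds tighter than OR, so 'aa2' on its own was treated as a numeric true/false value (hence the Invalid numeric data note) and the test effectively reduced to var1 = 'aa1'.

```sas
/* Sample data from the question; LENGTH is used here (an assumption,
   the original used FORMAT) so the longer values are not truncated. */
data sample;
  length var1 $11 var2 $4;
  input var1 $ var2 $;
  datalines;
aa1 aaa
aa2 aaa
aa bbb
abdfgdfh bbb
abdfttyrty bbb
abwww bbb
abc aaa
abd aaa
;
run;

/* Hypothetical output data set name: flagged */
data flagged;
  set sample;
  /* Quoted literals are values; unquoted names would be variables */
  if var1 in ('aa1' 'aa2') and var2 = 'aaa' then match = 1;
  /* =: compares only as many characters as the shorter operand has */
  else if var1 =: 'ab' and var2 = 'bbb' then match = 1;
  else if var1 =: 'ad' and var2 in ('ddd' 'dd') then match = 1;
  else match = 0;
run;
```

With this sample, aa1 and aa2 paired with aaa and the three ab values paired with bbb should get match=1. The remaining rows (aa with bbb, abc and abd with aaa) should get match=0.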

All Replies
Super User
Posts: 23,776

Re: need help creating a binary variable from two categorical variables with if-else and wildcards

``if var1 in ("aa1" "aa2") and (var2 = "aaa") then match=1;``

For starters here's one correction. Some similar ones need to be carried through as well.

If you're comparing to a text value, you need to include quotes, and if you're checking for multiple values, use IN ().

Another change: the colon goes with the = sign, not after the value.

``(var1 =: "ab") and (var2 = "bbb")``
New Contributor
Posts: 4

Re: need help creating a binary variable from two categorical variables with if-else and wildcards

Thank you, I did not realize that I did not need the % when using =:
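
For anyone who lands here with the same confusion about %, here is a small sketch of the difference, assuming the sample data set from the original post. As far as I know, % is a wildcard only for the LIKE operator, which is accepted in WHERE expressions but not in IF statements. With =: the % is just an ordinary character, so =: 'ab%' would only match values whose first three characters are literally ab%. The output data set names below are made up for illustration.

```sas
/* Prefix test with the colon modifier: usable in IF statements */
data ab_prefix_if;      /* hypothetical name */
  set sample;
  if var1 =: 'ab';      /* subsetting IF: keep rows where var1 starts with ab */
run;

/* Same idea with LIKE and the % wildcard: WHERE expressions only */
data ab_prefix_where;   /* hypothetical name */
  set sample;
  where var1 like 'ab%';
run;
```

Both steps should keep the same rows. The =: form is the one to use inside IF/ELSE logic like the code in this thread.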

New Contributor
Posts: 4

Re: need help creating a binary variable from two categorical variables with if-else and wildcards

Thank you, this was helpful.  I am no longer getting an error code, but it is not catching all of the matches. Do you see a problem in the following code?

``````if var1 in ("aa1" "aa2") and var2 = "aaa" then match=1;
if var1 =: "ab" and var2 ="bbb" then match=1;
if var1 =: "ad" and var2 in ("ddd" "dd") then match=1;``````
Super User
Posts: 6,785

Re: need help creating a binary variable from two categorical variables with if-else and wildcards

[ Edited ]

There's no problem with that code.  The problem is what you added after that code:

else match=0;

That ELSE applies only to the last statement before it.  If you want to link all the statements together, you have to add "else" a few more times:

``````if var1 in ("aa1" "aa2") and var2 = "aaa" then match=1;
else if var1 =: "ab" and var2 ="bbb" then match=1;
else if var1 =: "ad" and var2 in ("ddd" "dd") then match=1;
else match=0;``````
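
As a possible alternative to chaining ELSE statements, here is a sketch that relies on SAS evaluating a logical expression to 1 (true) or 0 (false), so the dummy variable can be assigned in a single statement. It assumes the same sample data set and conditions discussed in this thread. The output data set name flagged is only an illustration.

```sas
data flagged;   /* hypothetical name */
  set sample;
  /* The whole expression evaluates to 1 if any condition holds, else 0 */
  match = (var1 in ("aa1" "aa2") and var2 = "aaa")
       or (var1 =: "ab" and var2 = "bbb")
       or (var1 =: "ad" and var2 in ("ddd" "dd"));
run;
```

Because there is only one assignment, no later statement can overwrite an earlier match=1, which is what happened with the unchained IF statements followed by a single ELSE.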

When you remember those who helped you, please mark the other poster's answer as correct. He did most of the work and gave you the right tools to use.

New Contributor
Posts: 4