LL2023
Fluorite | Level 6

Hello,

 

I am trying to create a new categorical variable and a new binary variable based off a series of binary variables that I have previously constructed, but the observation numbers are not adding up and I cannot find a solution.

 

An example of the code I am using looks like this:

 

Data newdata;

Set oldata;

If BV1 = 1 then CV = 1;

If BV2 = 1 then CV = 2;

If BV3 = 1 then CV = 3;

Run;

 

BV1 = 1 in 1165 observations, BV2 = 1 in 69 observations, and BV3 = 1 in 17 observations. When I run the above code, the number of observations at each level of the categorical variable is lower than the corresponding count for the binary variable. I do not understand why these numbers would change. Each binary variable was constructed from its own source variable, so there should be no overlap among the conditions for the binary variables I created.

 

I noticed the same issue when I tried to create a new binary variable indicating positivity for any of the original binary variables I constructed.

 

I have tried code using else if and or statements as follows:

 

Data newdata;

set oldata;

If BV1 = 1 then NBV = 1;

Else if BV2 = 1 then NBV = 1;

Else if BV3 = 1 then NBV = 1;

Run;

 

Data newdata;

set oldata;

If BV1 = 1  or if BV2 = 1 or BV3 = 1 then NBV = 1;

Run;

 

 

1 ACCEPTED SOLUTION

Reeza
Super User
You likely have observations where more than one indicator is flagged. For example, what happens if you run the following?

proc freq data=newdata;
table (BV1 BV2 BV3)*CV;
run;

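Look for cells where an indicator is 1 but CV is not the matching level. I bet BV3 = 1 in those cases, because the BV3 statement runs last and overwrites whatever value CV was given earlier. You can pull out the overlapping rows to check: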

data errors;
set oldata;
where bv2=1 and bv3=1;
run;
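
To see the overwriting in isolation, here is a minimal sketch on made-up data. The five rows and the helper variable n_flags are assumptions for illustration only; the real oldata will look different. In this toy data BV2 = 1 on two rows, yet CV = 2 on only one of them, because the later BV3 statement replaces the value on the row where both flags are set.

```sas
/* Hypothetical toy data: the third row has BV2 and BV3 both set to 1 */
data oldata;
  input BV1 BV2 BV3;
  datalines;
1 0 0
0 1 0
0 1 1
0 0 1
0 0 0
;
run;

data newdata;
  set oldata;
  /* Independent IF statements: each later match replaces CV */
  if BV1 = 1 then CV = 1;
  if BV2 = 1 then CV = 2;
  if BV3 = 1 then CV = 3;   /* overwrites CV = 2 on the third row */
  /* n_flags is a hypothetical helper: how many indicators equal 1 */
  n_flags = sum(BV1 = 1, BV2 = 1, BV3 = 1);
run;

/* Rows with n_flags > 1 are the ones missing from the lower CV levels */
proc freq data=newdata;
  tables n_flags CV / missing;
run;
```

Any row with n_flags greater than 1 is counted under one CV level only, which is why the CV frequencies come out lower than the indicator counts.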


5 REPLIES
LL2023
Fluorite | Level 6

Thank you! 

 

I hadn't originally checked for overlap between the variables because I did not think it was possible due to their definitions, but it appears I was incorrect.
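
For anyone facing the same overlap, one possible way to keep the categorical variable consistent is sketched below. It is not necessarily what was done in this case: the toy data, the helper n_flags, and the extra levels CV = 4 (more than one flag) and CV = 0 (no flag) are all assumptions, and the right treatment of overlapping rows depends on the subject matter.

```sas
/* Hypothetical toy data: the third row has two flags set */
data oldata;
  input BV1 BV2 BV3;
  datalines;
1 0 0
0 1 0
0 1 1
0 0 0
;
run;

data newdata;
  set oldata;
  /* Count how many indicators equal 1 on this row */
  n_flags = sum(BV1 = 1, BV2 = 1, BV3 = 1);
  /* The ELSE chain guarantees exactly one assignment per row */
  if n_flags > 1 then CV = 4;      /* assumed level: overlapping flags */
  else if BV1 = 1 then CV = 1;
  else if BV2 = 1 then CV = 2;
  else if BV3 = 1 then CV = 3;
  else CV = 0;                     /* assumed level: no flag set */
run;

proc freq data=newdata;
  tables CV / missing;
run;
```

With this layout the overlapping rows are visible as their own level instead of being silently absorbed into whichever statement ran last.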

ballardw
Super User

Here is a similar diagnostic whose output may be a bit easier to follow:

proc freq data=newdata;
table BV1* BV2* BV3*CV / list missing;
run;

This shows one row per combination of the BV variables and CV, so it should be easy to see where the overlap occurs.

 

 

 

Data newdata;
set oldata;
If BV1 = 1  or if BV2 = 1 or BV3 = 1 then NBV = 1;
Run;

The ERRORS in the log should explain why the above doesn't work.

83   Data newdata;
84      set have;
85      If BV1 = 1  or if BV2 = 1 or BV3 = 1 then NBV = 1;
                          ---
                          22
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (,
              *, **, +, -, /, ;, <, <=, <>, =, >, ><, >=, AND, EQ, GE, GT,
              IN, LE, LT, MAX, MIN, NE, NG, NL, NOTIN, OR, [, ^=, {, |,
              ||, ~=.

86   Run;

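The second IF is not what you want. A statement takes a single IF keyword, and the comparisons after it are joined with OR, so drop the extra IF. Alternatively: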

If whichn(1, of BV:)>0 then NBV=1;

 

This tests whether any of the variables has the value 1. WHICHN returns the position in the list of the first value that matches the first argument, and 0 if nothing matches. So if any of the values is 1, the result is greater than 0. Note that BV: refers to every variable whose name starts with BV.
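
As a rough check that the two approaches agree, the sketch below builds the flag both ways on made-up data. The dataset have, its three rows, and the names NBV_or, NBV_which and pos are assumptions for illustration; the list BV1-BV3 is used in place of BV: so that only those three variables are searched.

```sas
/* Hypothetical toy data */
data have;
  input BV1 BV2 BV3;
  datalines;
1 0 0
0 1 1
0 0 0
;
run;

data newdata;
  set have;
  /* Corrected OR form: a single IF, comparisons joined by OR */
  if BV1 = 1 or BV2 = 1 or BV3 = 1 then NBV_or = 1;
  else NBV_or = 0;
  /* WHICHN form: position of the first variable equal to 1, else 0 */
  pos = whichn(1, of BV1-BV3);
  NBV_which = (pos > 0);
run;

proc print data=newdata;
run;
```

Unlike the original statement, both versions here set the flag to 0 when no indicator is 1 rather than leaving it missing; drop the ELSE branch if a missing value is preferred.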

 

alexluther19
Calcite | Level 5

Hello,

 

I'm hoping you would be able to share how you resolved this? I also have overlapping cases as I try to create a categorical variable from other binary variables that seem to be overwriting each other and causing errors in the frequencies I should be seeing.

 

Thank you.

Reeza
Super User
The solution is subject specific. Please start your own thread with this question if necessary.
