Q1: Inside a car when other people are present, do you think that smoking should: (n=132811)
(1) Always be allowed (n=4770)
(2) Be allowed under some conditions (n=25805)
(3) Never be allowed (n=102236)
If they answer (3), skip logic takes them past the next question. If they answer (1) or (2), they are given the follow-up question:
Q2: If children are present inside the car, do you think smoking should: (n=31023)
(1) Always be allowed (n=1189)
(2) Be allowed under some conditions (n=4592)
(3) Never be allowed (n=25242)
We decided that for the analysis of Q2, Q1's answer (3) "Never be allowed" should be folded into Q2's answer (3) "Never be allowed", since if smoking is never allowed in the first scenario, it is technically also never allowed in the second. However, when I do this, the new counts for Q2 look like this (the total and the "Never be allowed" count are the ones that change):
NEWQ2: If children are present inside the car, do you think smoking should: (n=133259)
(1) Always be allowed (n=1189)
(2) Be allowed under some conditions (n=4592)
(3) Never be allowed (n=127478)
I do not know where to start in figuring out where the additional counts come from once I add the new condition. The code I'm using should be pretty simple: PEK6h is Q1, PEK6h2 is Q2, and ATSMCARO and ATSMCARC are my recodes.
If PEK6h=1 Then ATSMCARO=1; /* Always allowed */
Else If PEK6h=2 Then ATSMCARO=2; /* Be allowed under some conditions */
Else If PEK6h=3 Then ATSMCARO=3; /* Never be allowed */
If PEK6h2=1 Then ATSMCARC=1; /* Always allowed */
Else If PEK6h2=2 Then ATSMCARC=2; /* Be allowed under some conditions */
Else If PEK6h2=3 or PEK6h=3 Then ATSMCARC=3; /* Never be allowed */
Suggest running:
proc freq data=have;
tables Q1*Q2*NewQ2 /missing list;
run;
That should give you a nice table of counts, allowing you to trace how each combination of Q1 and Q2 values maps to your derived NewQ2 value.
Before you included the Q1 "never be allowed" responses in the Q2 "never be allowed" responses, your Q2 response total was N=31,023.
But Q2 was asked only of respondents who answered Q1 with "always be allowed" (N=4,770) or "under some conditions" (N=25,805). Those are presumably the only respondents given the second question, yet they add up to only 30,575. So how is it that the Q2 sample exceeds that total?
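The arithmetic behind that discrepancy can be checked in a short DATA step. This is only a sketch that hard-codes the counts quoted in the question; the variable names are made up for illustration.

```sas
/* Reconcile the published counts; all numbers are copied from the question. */
data _null_;
  q1_eligible = 4770 + 25805;                 /* Q1 answers (1) and (2)        */
  q2_total    = 1189 + 4592 + 25242;          /* everyone with a Q2 answer     */
  extra       = q2_total - q1_eligible;       /* Q2 answers not explained by Q1 */
  newq2_total = 1189 + 4592 + (25242 + 102236); /* Q2 after adding Q1 answer (3) */
  put q1_eligible= q2_total= extra= newq2_total=;
run;
```

The log should show q1_eligible=30575, q2_total=31023, extra=448 and newq2_total=133259. The new Q2 total exceeds the Q1 total of 132,811 by the same 448, which suggests the surplus comes from respondents who have a Q2 answer without a Q1 answer of (1) or (2).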
Is it possible that some Q1 "never be allowed" respondents had already slipped into the Q2 sample? If so, any subsequent addition of all the Q1 "never be allowed" respondents to Q2 would generate double counting of some respondents.
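One way to look at those respondents directly is to restrict the frequency table to records that have a Q2 answer but no qualifying Q1 answer. This is a minimal sketch that assumes the variables are numeric, that the dataset is called have (a placeholder name), and that valid answers are coded 1 to 3; surveys often use other codes for refused or not-in-universe, so adjust the lists as needed.

```sas
/* List Q1/Q2 combinations where Q2 was answered although Q1 was not (1) or (2). */
proc freq data=have;
  where PEK6h2 in (1, 2, 3) and PEK6h not in (1, 2);  /* missing Q1 also passes this filter */
  tables PEK6h*PEK6h2 / missing list;
run;
```

If rows with PEK6h=3 appear, some "never be allowed" respondents were asked the follow-up anyway; if only missing or special Q1 codes appear, the extra Q2 answers come from people with no usable Q1 response.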
This worked, thank you! It turned out that, for the two options other than "never be allowed", more people answered the follow-up question about children than had given those answers to the original question, and that was what was throwing me off.