BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SAS93
Quartz | Level 8

I'm trying to use the 2017 BRFSS data to look at healthcare coverage (HLTHPLN1).

 

The CDC has the HLTHPLN1 coded by:

1=Yes

2= no

7=don't know

9=refused

 

I want to only look at YES and NO, so I'm trying to get rid of/delete those observations where HLTHPLN1=7 or 9. I've tried using the WHERE statement to just select those I'm interested in. I've tried the IF/Delete statement. I've tried with and without quotation marks.

 

Here is my code:

Libname BRFSS '/folders/myfolders/';
Run;
Data BRFSS.BRF17HP;
	Set BRFSS.BRF17;
	If=(HLTHPLN1=1 or HLTHPLN1=2);
	If=(SEX=1 or SEX=2);
Run;
Proc sort data=BRFSS.BRF17HP;
	By SEX;
Proc surveyfreq data=BRFSS.BRF17;
	table HLTHPLN1;
	Strata _STSTR;
	Cluster _PSU;
	Weight _LLCPWT;
Run;

Here is log:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         
 74         Libname BRFSS '/folders/myfolders/';
 NOTE: Libref BRFSS was successfully assigned as follows: 
       Engine:        V9 
       Physical Name: /folders/myfolders
 75         Run;
 76         Data BRFSS.BRF17HP;
 77         Set BRFSS.BRF17;
 78         If=(HLTHPLN1=1 or HLTHPLN1=2);
 79         If=(SEX=1 or SEX=2);
 80         Run;
 
 NOTE: There were 5843 observations read from the data set BRFSS.BRF17.
 NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
 NOTE: DATA statement used (Total process time):
       real time           1.79 seconds
       cpu time            1.79 seconds
       
 
 81         Proc sort data=BRFSS.BRF17HP;
 82         By SEX;
 
 NOTE: There were 5843 observations read from the data set BRFSS.BRF17HP.
 NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
 NOTE: PROCEDURE SORT used (Total process time):
       real time           2.47 seconds
       cpu time            2.78 seconds
       
 
 83         Proc surveyfreq data=BRFSS.BRF17;
 84         table HLTHPLN1;
 85         Strata _STSTR;
 86         Cluster _PSU;
 87         Weight _LLCPWT;
 88         Run;
 
 NOTE: PROCEDURE SURVEYFREQ used (Total process time):
       real time           0.21 seconds
       cpu time            0.00 seconds
       
 
 89         
 90         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 102        

 
User: sasdemo
DATA Step Statements

And here's the final output:

 

Untitled.jpg

Clearly, my approach isn't working, because the categories for 7 and 9 are still there.

 

What am I doing wrong, and how can I fix it?

1 ACCEPTED SOLUTION

Accepted Solutions
PaigeMiller
Diamond | Level 26

In addition to what @SASKiwi said, I think the problem is here:

 

Proc surveyfreq data=BRFSS.BRF17

 

I think you want

 

Proc surveyfreq data=BRFSS.BRF17HP;

 

--
Paige Miller

View solution in original post

5 REPLIES 5
SASKiwi
PROC Star

Please consider one of your statements:

If=(HLTHPLN1=1 or HLTHPLN1=2);

What this actually does is create a variable called IF that is set to 1 if HLTHPLN1=1 or HLTHPLN1=2, and 0 if not. Is doesn't filter your data at all. What I suspect you want is:

If (HLTHPLN1=1 or HLTHPLN1=2); * Subsetting IF;

where (HLTHPLN1=1 or HLTHPLN1=2); * Subsetting WHERE;

BTW, the brackets aren't necessary. You will get the same answer by removing them. 

PaigeMiller
Diamond | Level 26

In addition to what @SASKiwi said, I think the problem is here:

 

Proc surveyfreq data=BRFSS.BRF17

 

I think you want

 

Proc surveyfreq data=BRFSS.BRF17HP;

 

--
Paige Miller
SAS93
Quartz | Level 8

Gosh darn it, the devil is always in the details! Thank you! I just needed a new pair of eyes looking at it apparently.

 

New code (that produces a matching output to the official one that I'm trying to check myself against):

 

Libname BRFSS '/folders/myfolders/';
Run;
Data BRFSS.BRF17HP;
	Set BRFSS.BRF17;
	If HLTHPLN1=1 then PLN='yes';
	If HLTHPLN1=2 then PLN='no';
	If SEX=1 then GEN='male';
	If SEX=2 then GEN='female';
Run;
Proc sort data=BRFSS.BRF17HP;
	By GEN;
Proc surveyfreq data=BRFSS.BRF17HP;
	table PLN;
	Strata _STSTR;
	Cluster _PSU;
	Weight _LLCPWT;
Run;


To also mention, the IF statement vs WHERE statement is a really good tip--I either forgot that concept or was never taught the difference. Thank you!

ballardw
Super User

@SAS93 wrote:

I'm trying to use the 2017 BRFSS data to look at healthcare coverage (HLTHPLN1).

 

The CDC has the HLTHPLN1 coded by:

1=Yes

2= no

7=don't know

9=refused

 

I want to only look at YES and NO, so I'm trying to get rid of/delete those observations where HLTHPLN1=7 or 9. I've tried using the WHERE statement to just select those I'm interested in. I've tried the IF/Delete statement. I've tried with and without quotation marks.

 

Here is my code:

Libname BRFSS '/folders/myfolders/';
Run;
Data BRFSS.BRF17HP;
	Set BRFSS.BRF17;
	If=(HLTHPLN1=1 or HLTHPLN1=2);
	If=(SEX=1 or SEX=2);
Run;
Proc sort data=BRFSS.BRF17HP;
	By SEX;
Proc surveyfreq data=BRFSS.BRF17;
	table HLTHPLN1;
	Strata _STSTR;
	Cluster _PSU;
	Weight _LLCPWT;
Run;

Here is log:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         
 74         Libname BRFSS '/folders/myfolders/';
 NOTE: Libref BRFSS was successfully assigned as follows: 
       Engine:        V9 
       Physical Name: /folders/myfolders
 75         Run;
 76         Data BRFSS.BRF17HP;
 77         Set BRFSS.BRF17;
 78         If=(HLTHPLN1=1 or HLTHPLN1=2);
 79         If=(SEX=1 or SEX=2);
 80         Run;
 
 NOTE: There were 5843 observations read from the data set BRFSS.BRF17.
 NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
 NOTE: DATA statement used (Total process time):
       real time           1.79 seconds
       cpu time            1.79 seconds
       
 
 81         Proc sort data=BRFSS.BRF17HP;
 82         By SEX;
 
 NOTE: There were 5843 observations read from the data set BRFSS.BRF17HP.
 NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
 NOTE: PROCEDURE SORT used (Total process time):
       real time           2.47 seconds
       cpu time            2.78 seconds
       
 
 83         Proc surveyfreq data=BRFSS.BRF17;
 84         table HLTHPLN1;
 85         Strata _STSTR;
 86         Cluster _PSU;
 87         Weight _LLCPWT;
 88         Run;
 
 NOTE: PROCEDURE SURVEYFREQ used (Total process time):
       real time           0.21 seconds
       cpu time            0.00 seconds
       
 
 89         
 90         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 102        

 
User: sasdemo
DATA Step Statements

And here's the final output:

 

Untitled.jpg

Clearly, my approach isn't working, because the categories for 7 and 9 are still there.

 

What am I doing wrong, and how can I fix it?


Something else that you want to consider are the CDC added variables they use for analysis with the BRFSS data. These will have names that start with an _ . Depending on the analysis you want to perform you may find some things, such as age groups and some of the values that use 2 or more variables to determine such as BMI categories, already coded in the data. Labels are your friend.

Cynthia_sas
SAS Super FREQ
Hi:
The syntax of your IF statement is incorrect.

Well, is it an IF statement? Or are you trying to create a new variable called IF? Or are you trying to subset the original data? And with a subsetting IF, it's not clear what you want to do.

Do you want to ONLY get something like this:
If HLTHPLN1 in (1,2) and
SEX in (1,2);
or you could code it as
If (HLTHPLN1 =1 or HLTHPLN1=2) and
(SEX =1 or SEX=2);

But your current syntax is probably creating a new variable called IF in the output data (which is why there's no error) and you can confirm that by running a PROC CONTENTS.

Hope this helps,
Cynthia

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1896 views
  • 2 likes
  • 5 in conversation