I'm trying to use the 2017 BRFSS data to look at healthcare coverage (HLTHPLN1).
The CDC has the HLTHPLN1 coded by:
1=Yes
2= no
7=don't know
9=refused
I want to only look at YES and NO, so I'm trying to get rid of/delete those observations where HLTHPLN1=7 or 9. I've tried using the WHERE statement to just select those I'm interested in. I've tried the IF/Delete statement. I've tried with and without quotation marks.
Here is my code:
Libname BRFSS '/folders/myfolders/'; Run; Data BRFSS.BRF17HP; Set BRFSS.BRF17; If=(HLTHPLN1=1 or HLTHPLN1=2); If=(SEX=1 or SEX=2); Run; Proc sort data=BRFSS.BRF17HP; By SEX; Proc surveyfreq data=BRFSS.BRF17; table HLTHPLN1; Strata _STSTR; Cluster _PSU; Weight _LLCPWT; Run;
Here is log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72
73
74 Libname BRFSS '/folders/myfolders/';
NOTE: Libref BRFSS was successfully assigned as follows:
Engine: V9
Physical Name: /folders/myfolders
75 Run;
76 Data BRFSS.BRF17HP;
77 Set BRFSS.BRF17;
78 If=(HLTHPLN1=1 or HLTHPLN1=2);
79 If=(SEX=1 or SEX=2);
80 Run;
NOTE: There were 5843 observations read from the data set BRFSS.BRF17.
NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
NOTE: DATA statement used (Total process time):
real time 1.79 seconds
cpu time 1.79 seconds
81 Proc sort data=BRFSS.BRF17HP;
82 By SEX;
NOTE: There were 5843 observations read from the data set BRFSS.BRF17HP.
NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 2.47 seconds
cpu time 2.78 seconds
83 Proc surveyfreq data=BRFSS.BRF17;
84 table HLTHPLN1;
85 Strata _STSTR;
86 Cluster _PSU;
87 Weight _LLCPWT;
88 Run;
NOTE: PROCEDURE SURVEYFREQ used (Total process time):
real time 0.21 seconds
cpu time 0.00 seconds
89
90 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
102
User: sasdemo
DATA Step StatementsAnd here's the final output:
Clearly, my approach isn't working, because the categories for 7 and 9 are still there.
What am I doing wrong, and how can I fix it?
In addition to what @SASKiwi said, I think the problem is here:
Proc surveyfreq data=BRFSS.BRF17
I think you want
Proc surveyfreq data=BRFSS.BRF17HP;
Please consider one of your statements:
If=(HLTHPLN1=1 or HLTHPLN1=2);
What this actually does is create a variable called IF that is set to 1 if HLTHPLN1=1 or HLTHPLN1=2, and 0 if not. Is doesn't filter your data at all. What I suspect you want is:
If (HLTHPLN1=1 or HLTHPLN1=2); * Subsetting IF;
where (HLTHPLN1=1 or HLTHPLN1=2); * Subsetting WHERE;
BTW, the brackets aren't necessary. You will get the same answer by removing them.
In addition to what @SASKiwi said, I think the problem is here:
Proc surveyfreq data=BRFSS.BRF17
I think you want
Proc surveyfreq data=BRFSS.BRF17HP;
Gosh darn it, the devil is always in the details! Thank you! I just needed a new pair of eyes looking at it apparently.
New code (that produces a matching output to the official one that I'm trying to check myself against):
Libname BRFSS '/folders/myfolders/'; Run; Data BRFSS.BRF17HP; Set BRFSS.BRF17; If HLTHPLN1=1 then PLN='yes'; If HLTHPLN1=2 then PLN='no'; If SEX=1 then GEN='male'; If SEX=2 then GEN='female'; Run; Proc sort data=BRFSS.BRF17HP; By GEN; Proc surveyfreq data=BRFSS.BRF17HP; table PLN; Strata _STSTR; Cluster _PSU; Weight _LLCPWT; Run;
To also mention, the IF statement vs WHERE statement is a really good tip--I either forgot that concept or was never taught the difference. Thank you!
@SAS93 wrote:
I'm trying to use the 2017 BRFSS data to look at healthcare coverage (HLTHPLN1).
The CDC has the HLTHPLN1 coded by:
1=Yes
2= no
7=don't know
9=refused
I want to only look at YES and NO, so I'm trying to get rid of/delete those observations where HLTHPLN1=7 or 9. I've tried using the WHERE statement to just select those I'm interested in. I've tried the IF/Delete statement. I've tried with and without quotation marks.
Here is my code:
Libname BRFSS '/folders/myfolders/'; Run; Data BRFSS.BRF17HP; Set BRFSS.BRF17; If=(HLTHPLN1=1 or HLTHPLN1=2); If=(SEX=1 or SEX=2); Run; Proc sort data=BRFSS.BRF17HP; By SEX; Proc surveyfreq data=BRFSS.BRF17; table HLTHPLN1; Strata _STSTR; Cluster _PSU; Weight _LLCPWT; Run;Here is log:
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 72 73 74 Libname BRFSS '/folders/myfolders/'; NOTE: Libref BRFSS was successfully assigned as follows: Engine: V9 Physical Name: /folders/myfolders 75 Run; 76 Data BRFSS.BRF17HP; 77 Set BRFSS.BRF17; 78 If=(HLTHPLN1=1 or HLTHPLN1=2); 79 If=(SEX=1 or SEX=2); 80 Run; NOTE: There were 5843 observations read from the data set BRFSS.BRF17. NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables. NOTE: DATA statement used (Total process time): real time 1.79 seconds cpu time 1.79 seconds 81 Proc sort data=BRFSS.BRF17HP; 82 By SEX; NOTE: There were 5843 observations read from the data set BRFSS.BRF17HP. NOTE: The data set BRFSS.BRF17HP has 5843 observations and 359 variables. NOTE: PROCEDURE SORT used (Total process time): real time 2.47 seconds cpu time 2.78 seconds 83 Proc surveyfreq data=BRFSS.BRF17; 84 table HLTHPLN1; 85 Strata _STSTR; 86 Cluster _PSU; 87 Weight _LLCPWT; 88 Run; NOTE: PROCEDURE SURVEYFREQ used (Total process time): real time 0.21 seconds cpu time 0.00 seconds 89 90 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK; 102 User: sasdemo DATA Step StatementsAnd here's the final output:
Clearly, my approach isn't working, because the categories for 7 and 9 are still there.
What am I doing wrong, and how can I fix it?
Something else that you want to consider are the CDC added variables they use for analysis with the BRFSS data. These will have names that start with an _ . Depending on the analysis you want to perform you may find some things, such as age groups and some of the values that use 2 or more variables to determine such as BMI categories, already coded in the data. Labels are your friend.
Dive into keynotes, announcements and breakthroughs on demand.
Explore Now →SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.