I want to make a temporary dataset with smokers who are overweight and who were in the alendronate treatment group. However, I keep receiving an error message. What am I doing wrong?
libname hw3 '\\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Week 3';
data fit1temp;
set hw3.fit2;
if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
else if upcase(compress(smoke)) = "NONSMOKER" or
upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
if blvfx = "yes" or blnspfx = "yes" then any_base_frac = 1;
else if blvfx = "no" and blnspfx = "no" then any_base_frac = 0;
if newvfx = "yes" and newnspfx = "yes" then any_new_frac = 1;
else if newvfx = "no" and newnspfx = "no" then any_new_frac = 0;
if bmi > 25 then bmi2 = "overweight";
else if 0 < bmi <= 25 then bmi2 = "normal";
if 0 < bmi <= 18 then bmi3 = "underweight";
else if 18 < bmi <= 25 then bmi3 = "normal";
else if bmi > 25 then bmi3 = "overweight";
bmi1_ind = (bmi > 25);
bmi2_ind = (18 < bmi <= 25);
bmi3_ind = (0 < bmi <= 18);
i_tx = (tx = "Alendronate");
diff_bmd = cobmd-blbmd;
age = (dov - dob)/365.25;
run;
data fit1temp_tx;
set hw3.fit2;
if tx = "Alendronate";
run;
data fit1temp_placebo;
set hw3.Fit2;
if tx = "Placebo";
run;
proc sgplot data=fit1temp;
vbox cobmd/group=tx;
run;
proc means data=fit1temp mean median std p25 p75;
class tx;
var cobmd;
run;
proc ttest data=fit1temp;
class tx;
var cobmd;
run;
data fit1temp_smoke;
set hw3.fit2;
if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
else if upcase(compress(smoke)) = "NONSMOKER" or
upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
if currentsmoker = 1;
run;
proc sgplot data = fit1temp_smoke;
vbox cobmd/group=tx;
run;
proc means data=fit1temp_smoke mean median std p25 p75;
class tx;
var cobmd;
run;
proc npar1way data=fit1temp_smoke wilcoxon;
class tx;
var cobmd;
run;
data fit1temp_smoke_ovwt;
set hw3.fit2;
if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
else if upcase(compress(smoke)) = "NONSMOKER" or
upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
if bmi > 25 then bmi2 = "overweight";
else if 0 < bmi <= 25 then bmi2 = "normal";
if 0 < bmi <= 18 then bmi3 = "underweight";
else if 18 < bmi <= 25 then bmi3 = "normal";
else if bmi > 25 then bmi3 = "overweight";
if currentsmoker = 1;
if bmi3 = overweight;
diff_bmd = cobmd-blbmd;
proc ttest data fit1temp_smoke_ovwt;
var diff_bmd;
run;
NOTE: Libref HW3 was successfully assigned as follows:
Engine: V9
Physical Name: \\apporto.com\dfs\GWU\Users\kennedyhinnant_gwu\Downloads\Week 3
186 data fit1temp;
187 set hw3.fit2;
188 if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
189 else if upcase(compress(smoke)) = "NONSMOKER" or
190 upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
191 if blvfx = "yes" or blnspfx = "yes" then any_base_frac = 1;
192 else if blvfx = "no" and blnspfx = "no" then any_base_frac = 0;
193 if newvfx = "yes" and newnspfx = "yes" then any_new_frac = 1;
194 else if newvfx = "no" and newnspfx = "no" then any_new_frac = 0;
195 if bmi > 25 then bmi2 = "overweight";
196 else if 0 < bmi <= 25 then bmi2 = "normal";
197 if 0 < bmi <= 18 then bmi3 = "underweight";
198 else if 18 < bmi <= 25 then bmi3 = "normal";
199 else if bmi > 25 then bmi3 = "overweight";
200 bmi1_ind = (bmi > 25);
201 bmi2_ind = (18 < bmi <= 25);
202 bmi3_ind = (0 < bmi <= 18);
203 i_tx = (tx = "Alendronate");
204 diff_bmd = cobmd-blbmd;
205 age = (dov - dob)/365.25;
206 run;
NOTE: Missing values were generated as a result of performing an operation on missing values.
Each place is given by: (Number of times) at (Line):(Column).
189 at 204:17
NOTE: There were 5580 observations read from the data set HW3.FIT2.
NOTE: The data set WORK.FIT1TEMP has 5580 observations and 22 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds
207 data fit1temp_tx;
208 set hw3.fit2;
209 if tx = "Alendronate";
210 run;
NOTE: There were 5580 observations read from the data set HW3.FIT2.
NOTE: The data set WORK.FIT1TEMP_TX has 2678 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds
211 data fit1temp_placebo;
212 set hw3.Fit2;
213 if tx = "Placebo";
214 run;
NOTE: There were 5580 observations read from the data set HW3.FIT2.
NOTE: The data set WORK.FIT1TEMP_PLACEBO has 2902 observations and 11 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
215 proc sgplot data=fit1temp;
216 vbox cobmd/group=tx;
217 run;
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.20 seconds
cpu time 0.07 seconds
NOTE: There were 5580 observations read from the data set WORK.FIT1TEMP.
218 proc means data=fit1temp mean median std p25 p75;
219 class tx;
220 var cobmd;
221 run;
NOTE: There were 5580 observations read from the data set WORK.FIT1TEMP.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
222 proc ttest data=fit1temp;
223 class tx;
224 var cobmd;
225 run;
NOTE: PROCEDURE TTEST used (Total process time):
real time 1.34 seconds
cpu time 0.68 seconds
226 data fit1temp_smoke;
227 set hw3.fit2;
228 if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
229 else if upcase(compress(smoke)) = "NONSMOKER" or
230 upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
231 if currentsmoker = 1;
232 run;
NOTE: There were 5580 observations read from the data set HW3.FIT2.
NOTE: The data set WORK.FIT1TEMP_SMOKE has 599 observations and 12 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.00 seconds
233 proc sgplot data = fit1temp_smoke;
234 vbox cobmd/group=tx;
235 run;
NOTE: PROCEDURE SGPLOT used (Total process time):
real time 0.12 seconds
cpu time 0.04 seconds
NOTE: There were 599 observations read from the data set WORK.FIT1TEMP_SMOKE.
236 proc means data=fit1temp_smoke mean median std p25 p75;
237 class tx;
238 var cobmd;
239 run;
NOTE: There were 599 observations read from the data set WORK.FIT1TEMP_SMOKE.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
240 proc npar1way data=fit1temp_smoke wilcoxon;
241 class tx;
242 var cobmd;
243 run;
NOTE: PROCEDURE NPAR1WAY used (Total process time):
real time 0.29 seconds
cpu time 0.17 seconds
244 data fit1temp_smoke_ovwt;
245 set hw3.fit2;
246 if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
247 else if upcase(compress(smoke)) = "NONSMOKER" or
248 upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
249 if bmi > 25 then bmi2 = "overweight";
250 else if 0 < bmi <= 25 then bmi2 = "normal";
251 if 0 < bmi <= 18 then bmi3 = "underweight";
252 else if 18 < bmi <= 25 then bmi3 = "normal";
253 else if bmi > 25 then bmi3 = "overweight";
254 if currentsmoker = 1;
255 if bmi3 = overweight;
256 diff_bmd = cobmd-blbmd;
NOTE: Variable overweight is uninitialized.
NOTE: There were 5580 observations read from the data set HW3.FIT2.
NOTE: The data set WORK.FIT1TEMP_SMOKE_OVWT has 0 observations and 16 variables.
NOTE: DATA statement used (Total process time):
real time 0.09 seconds
cpu time 0.01 seconds
257 proc ttest data fit1temp_smoke_ovwt;
                    -------------------
                    73
ERROR 73-322: Expecting an =.
258 var diff_bmd;
259 run;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE TTEST used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
Looks like you need at least two corrections - see comments:
data fit1temp_smoke_ovwt;
set hw3.fit2;
if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
else if upcase(compress(smoke)) = "NONSMOKER" or
upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
if bmi > 25 then bmi2 = "overweight";
else if 0 < bmi <= 25 then bmi2 = "normal";
if 0 < bmi <= 18 then bmi3 = "underweight";
else if 18 < bmi <= 25 then bmi3 = "normal";
else if bmi > 25 then bmi3 = "overweight";
if currentsmoker = 1;
if bmi3 = "overweight"; * Quotes required;
diff_bmd = cobmd-blbmd;
run;
proc ttest data = fit1temp_smoke_ovwt; * Equals sign required;
var diff_bmd;
run;
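To see why the quotes matter, here is a minimal sketch on made-up data (the `demo` and `demo_ovwt` data sets and the three BMI values are hypothetical, not taken from hw3.fit2). With the quotes, SAS compares bmi3 to the text value; without them, it would look for a variable named overweight, which is what the "uninitialized" note in the log is reporting.

```sas
/* Hypothetical toy data, not the poster's hw3.fit2 */
data demo;
   length bmi3 $11;
   input bmi;
   if 0 < bmi <= 18 then bmi3 = "underweight";
   else if 18 < bmi <= 25 then bmi3 = "normal";
   else if bmi > 25 then bmi3 = "overweight";
   datalines;
17
22
31
;
run;

data demo_ovwt;
   set demo;
   if bmi3 = "overweight";   /* quoted: compared as a character literal */
run;
```

Only the row with bmi = 31 should be kept in demo_ovwt.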
@MisterJenn wrote:
I want to make a temporary dataset with smokers who are overweight and who were in the alendronate treatment group. However, I keep receiving an error message. What am I doing wrong?
The only error is here:
257 proc ttest data fit1temp_smoke_ovwt;
                    -------------------
                    73
ERROR 73-322: Expecting an =.
258 var diff_bmd;
259 run;
Is that not self-explanatory?
It's helpful if you continue from your previous post.
You shouldn't recreate the same variables in multiple steps. It wastes resources and programming effort, and if you make a mistake you have to fix it in several places, which is error-prone.
I would recommend changing your code as follows:
data fit1temp_smoke_ovwt;
set fit1temp;
where currentsmoker=1 and bmi2='overweight';
run;
I would also recommend changing your previous step:
data fit1temp_smoke;
set hw3.fit2;
if upcase(compress(smoke)) = "CURRENTSMOKER" then currentsmoker = 1;
else if upcase(compress(smoke)) = "NONSMOKER" or
upcase(compress(smoke)) = "FORMERSMOKER" then currentsmoker = 0;
if currentsmoker = 1;
run;
to:
data fit1temp_smoke;
set fit1temp;
where currentsmoker = 1;
run;
Or note that you don't even need to create temporary data sets: you can apply the filter in the procs directly with a WHERE statement.
title 'Smokers Analysis';
proc sgplot data = fit1temp;
where currentsmoker=1;
vbox cobmd/group=tx;
run;
proc means data=fit1temp mean median std p25 p75;
where currentsmoker=1;
class tx;
var cobmd;
run;
proc npar1way data=fit1temp wilcoxon;
where currentsmoker=1;
class tx;
var cobmd;
run;
title 'Smokers & Overweight Analysis';
proc sgplot data = fit1temp;
where currentsmoker=1 and bmi2='overweight';
vbox cobmd/group=tx;
run;
proc means data=fit1temp mean median std p25 p75;
where currentsmoker=1 and bmi2='overweight';
class tx;
var cobmd;
run;
proc npar1way data=fit1temp wilcoxon;
where currentsmoker=1 and bmi2='overweight';
class tx;
var cobmd;
run;
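The original question also asked for the alendronate group only, and the last step in the posted code was a t-test on diff_bmd. A minimal sketch of that, assuming work.fit1temp from the first DATA step already exists with currentsmoker, bmi2, tx and diff_bmd as defined there (the title text is just a placeholder):

```sas
/* Assumes work.fit1temp from the first DATA step already exists. */
title 'Overweight smokers on alendronate: change in BMD';
proc ttest data=fit1temp;
   where currentsmoker = 1
     and bmi2 = 'overweight'
     and tx = 'Alendronate';
   var diff_bmd;   /* one-sample test that the mean change is 0 */
run;
title;
```

With only a VAR statement and no CLASS statement, PROC TTEST runs a one-sample test on diff_bmd within the filtered rows.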
The main thing you are doing wrong is posting the log here, instead of reading it and trying to let it guide you.
Before you get to the error, the log reports two problems. The data set FIT1TEMP_SMOKE_OVWT ends up with zero observations. The CURRENTSMOKER filter is unlikely to be the cause, because the earlier step with the same filter kept 599 observations, so look at the subsetting IF on BMI3 instead.
The log also notes that the variable OVERWEIGHT is uninitialized: you use a variable named OVERWEIGHT, but your data doesn't contain one. If you meant the text value rather than a variable, it needs to be in quotes, as it is where you assign BMI3.
The only error you get is in PROC TTEST, where you omit the equal sign. The log pretty clearly spells that out and tells you how to correct it.
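One way to investigate what the log is telling you is to count the derived values before filtering on them. This is only a sketch, and it assumes work.fit1temp from the first DATA step exists with currentsmoker and bmi3 as defined there:

```sas
/* Assumes work.fit1temp from the first DATA step already exists. */
proc freq data=fit1temp;
   /* MISSING keeps rows where either variable is missing in the table */
   tables currentsmoker*bmi3 / missing;
run;
```

The cell for currentsmoker = 1 and bmi3 = overweight shows how many rows a correctly written filter should keep.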