"if and THEN" conditional programming.
Dear Community,
I hope this email finds you well. Thank you for all your help. I have learned a lot from the other responses.
I need a little help with some programming,
I am analyzing data and would like to meet some conditions.
I have a group of metabolic syndrome participants and non-metabolic one. I would like to classify them into metabolically healthy normal weight (MHNW), metabolically Unhealthy normal weight (MUNW) and metabolically healthy overweight (MHO) and metabolically unhealthy overweight (MUO).
data Alisama;set Alisama; log_waist=log(Waistcm);/**add log forthe other variables/ /*do it with non normally distributed outcomes*/ log_CRP=log(CRP_High_S); label T2DMMetS_ALL ="T2DM"; met_s=0; if MetS_Total>=3 then met_s=1; array score score0-score5; do i=0 to 5; score[i]=0; end; if Gender_Code_2005=1 & Waist_2015>=94 then score1=1;*codes for men; else if Gender_Code_2005=2 & Waist_2015>=80 then score1=1;*codes for women; if TRIG2015>=1.7 then score2=1; if Gender_Code_2005=1 & HDLC32015<=1 then score3=1;*codes for men; else if Gender_Code_2005=2 & HDLC32015<=1.3 then score3=1;*codes for women; if SBPOmr2015>=130 or DBP2015>=85 or HT_MED2015=1 then score4=1;*codes for men; if Glucose2015>=5.6 or DIAB_MED2015=1 then score5=1;*codes for men; MetS_Total=sum(of score0-score5); met_s2=0;if MetS_Total>=3 then met_s2=1; run;
SAS Output
188 | 69.63 | 188 | 69.63 |
82 | 30.37 | 270 | 100.00 |
SAS Output
24 | 8.89 | 24 | 8.89 |
85 | 31.48 | 109 | 40.37 |
79 | 29.26 | 188 | 69.63 |
52 | 19.26 | 240 | 88.89 |
27 | 10.00 | 267 | 98.89 |
3 | 1.11 | 270 | 100.00 |
I have already classified the metabolical syndrome and the above are the numbers.
However, whatever I do, I have failed to get the qualifications mentioned above right,
The problem is here
data ALLMH05TO15; set ALLMH05TO15; if BMI_2015 <=24.99999 & met_s2=0 then MHNW=0; if BMI_2015 >=25.0 & met_s2=0 then MHO=1; run; if BMI_2015 <=24.99999 & met_s2=1 then MUNW=0; if BMI_2015 >=25.0 & met_s2=1 then MUO=1; run;
SAS Output
171 | 63.33 | 171 | 63.33 |
99 | 36.67 | 270 | 100.00 |
52 | 100.00 | 52 | 100.00 |
20 | 24.39 | 20 | 24.39 |
62 | 75.61 | 82 | 100.00 |
62 | 100.00 | 62 | 100.00 |
So I tried another method..
data ALLMH05TO15;set ALLMH05TO15; if BMI_2015<=24.99999 & MetS_Total_2015=0 then MHNW=0; if BMI_2015<=24.99999 & MetS_Total_2015=1 then MHNW=0; if BMI_2015<=24.99999 & MetS_Total_2015=2 then MHNW=0; if BMI_2015<=24.99999 & MetS_Total_2015>=3 then MUNW=1; if BMI_2015 >=25.0 & MetS_Total_2015=0 then MHNW=1; if BMI_2015 >=25.0 & MetS_Total_2015=1 then MHNW=1; if BMI_2015 >=25.0 & MetS_Total_2015=2 then MHNW=1; if BMI_2015 <=24.99999 & MetS_Total_2015>=3 then MUNW=0; if BMI_2015 >=25.0 & MetS_Total_2015>=3 then MUNW=1; run;
171 | 63.33 | 171 | 63.33 |
99 | 36.67 | 270 | 100.00 |
20 | 24.39 | 20 | 24.39 |
62 | 75.61 | 82 | 100.00 |
I cannot tell from what you post, as always, first step is to provide test data in the form of a datastep.
I suspect your if statement is not capturing all eventualities for instance:
if BMI_2015<=24.99999 & MetS_Total_2015>=3 then MUNW=1; if BMI_2015 >=25.0 & MetS_Total_2015=0
What if bmi_2015 = 24.999991? Neither of those if statements would trigger. One way to capture all would be:
if bmi_2015 <= 24.99999 then do; ... end; else do; ... end;
And nestle your statements. Also, your numbers may contain a very small fraction of a number (due to storage on the machine), so rounding the value may also help. Also, avoid coding in mixed code and use indents so its easy to read the code.
Dear Superuser, sample data attached. Apologies for not attaching earlier. Thank you.
You may need to run this step first. Thank you very much for your help.
data Sample_Ac;set Sample_Ac;
met_s=0;
if MetS_Total>=3 then met_s=1;
array score score1-score5;
do i=1 to 5;
score[i]=0;
end;
if Gender_Code_2005=1 & Waist_2015>=94 then score1=1;*codes for men;
else if Gender_Code_2005=2 & Waist_2015>=80 then score1=1;*codes for women;
if TRIG2015>=1.7 then score2=1;
if Gender_Code_2005=1 & HDLC32015<=1 then score3=1;*codes for men;
else if Gender_Code_2005=2 & HDLC32015<=1.3 then score3=1;*codes for women;
if SBPOmr2015>=130 or DBP2015>=85 or HT_MED2015=1 then score4=1;
if Glucose2015>=5.6 or T2DM2015=1 then score5=1;
score_MS=sum(of score1-score5);
met_s2=0;if score_MS>=3 then met_s2=1;
run;
Proc freq data = Sample_Data; tables met_s2 score_MS;
run;
data Sample_Ac; set Sample_Ac;
rename score_MS= MetS_Total_2015;*rename old=new*;
run;
With appropriate formats for continuous variables this will create a grid that shows which combinations of values were assigned which score and may help find missing or overcounts
proc tabulate data= Alisama; class Gender_Code_2005 Waist_2015 TRIG2015 HDLC32015 SBPOmr2015 DBP2015 HT_MED2015 Glucose2015 DIAB_MED2015 /missing; /* your continuous variables should have a format to indicate the >, < or the range of interest and have a corresponding format assignment statment here */ var score0 - score5; table Gender_Code_2005 * Waist_2015 * TRIG2015 HDLC32015* SBPOmr2015* DBP2015* HT_MED2015 Glucose2015 *DIAB_MED2015 , (score0 score1 score2 score3 score4 score5)*sum='' / misstext=' ' ; run;
And I would be very cautious of habitual use of
data Alisama;set Alisama;
as that has a potential for creating all sorts of chaos as testing this sort of recode if any of the independent variables (waist for instance) has adjustment code.
Sorry, you misunderstand, I don't want to open Excel files, nor do I want to write import programs to try to deal with those or guess your data. I want you to provide, in a datastep, some test data which shows what you have, what the structure of the dataset is, and what you want out at the end. Follow this post if need:
There is nothing I can do, I cannot see your data to tell you where you are going wrong. All I can repeat is that this:
if BMI_2015<=24.99999 & MetS_Total_2015>=3 then MUNW=1; if BMI_2015 >=25.0 & MetS_Total_2015=0 then MHNW=1;
Does not cover all evenetualities - 24.999991 for instance would get flagged. I have suggested other methods to cover all eventualities, have you tried these?
if bmi_2015 <= 24.99999 ... do; ... end; else do; ... end;
Without information on what you are using I cannot do anything.
Hello Superuser,
I don't know what I am doing wrong, I about to die, I have been unable to solve this problem since last week.
I have tried what you advised. So I went ahead and categorized the two BMI categories.
data ALLMH05TO15; set ALLMH05TO15;
if BMI_2015 <= 24.794988736 then BMICAT=0;
if BMI_2015 => 25.004109139 then BMICAT=1;
run;
I then went ahead to classify my metabolic syndrome.
Metabolic healthy with Normal weight (MHNW)
Metabolically healthy overweight/obese (MHO)
Metabolically Unhelathy Normla weight (MUNW)
Metabolically unhealthy overweight/obese (MUO)
in other words, I need to combine the MetS_total_2015 and he BMI category which I have included up there to meet the conditions for the MHNW, MUNW, MHO, MUO.
SAS Output
24 | 8.89 | 24 | 8.89 |
85 | 31.48 | 109 | 40.37 |
79 | 29.26 | 188 | 69.63 |
52 | 19.26 | 240 | 88.89 |
27 | 10.00 | 267 | 98.89 |
3 | 1.11 | 270 | 100.00 |
I have attempted to classify
data ALLMH05TO15; set ALLMH05TO15; if MetS_Total_2015 <= 2 & BMICAT=0 then MHNW=0; if MetS_Total_2015 =>3 & BMICAT=0 then MUNW=0; if MetS_Total_2015 <= 2 & BMICAT=1 then MHNW=1; if MetS_Total_2015 =>3 & BMICAT=1 then MUNW=1; run;
These are the results I am getting
SAS Output
155 | 57.41 | 155 | 57.41 |
115 | 42.59 | 270 | 100.00 |
20 | 24.39 | 20 | 24.39 |
62 | 75.61 | 82 | 100.00 |
The second set of results are correct (MUNW). however, the first set (MHNW) is wrong, it is including all the data to give me a total of 270, yet I should have a total of 188 and not 270. Please, I hope someone can understand my problem.
I have a feeling the problem lies in the metabolic syndrome category... but I keep getting an error,
1024 data ALLMH05TO15; set ALLMH05TO15; 1025 met_s=0; 1026 if MetS_Total>=3 then met_s=1; 1027 array score score0-score5; 1028 do i=0 to 5; 1029 score[i]=0; 1030 end; 1031 if Gender_Code_2005=1 & Waist_2015>=94 then 1031! score1=1;*codes for men; 1032 else if Gender_Code_2005=2 & Waist_2015>=80 then 1032! score1=1;*codes for women; 1033 1034 if TRIG2015>=1.7 then score2=1; 1035 1036 if Gender_Code_2005=1 & HDLC32015<=1 then score3=1 1036! ;*codes for men; 1037 else if Gender_Code_2005=2 & HDLC32015<=1.3 then score3=1 1037! ;*codes for women; 1038 1039 if SBPOmr2015>=130 or DBP2015>=85 or HT_MED2015=1 then score4=1;*codes for men; 1040 if Glucose2015>=5.6 or DIAB_MED2015=1 then score5=1;*codes for 1040! men; 1041 MetS_Total=sum(of score0-score5); 1042 met_s2=0;if MetS_Total>=3 then met_s2=1; 1043 1044 run; ERROR: Array subscript out of range at line 1029 column 1.
I have no help, I am depending on all of you;).
Thank you
Achieng
Hi everyone,
I managed to resolve the issues. I had to go manually to the data set to find out what was wrong,
I found that some of the variables had Metabolic syndrome classification but were missing BMI values, and others had the BMI status but were missing the metabolic system classification. Once I deleted these, I was able to resolve the problem.
Thank you
very kind regards
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.