u49589061
Fluorite | Level 6

Hi everyone, I was hoping someone could help me figure out how to remove the circled values from my table so that the "normal weight, overweight, obese, and morbidly obese" columns are the only ones present. Screenshots of my code and output are below. Please let me know if you have any questions, and thank you in advance for helping me out. I really appreciate it.

 

 

[Three screenshots attached: code and PROC TABULATE output with the unwanted BMI columns circled]

ACCEPTED SOLUTION
ballardw
Super User

Your screenshots are barely legible, and I can't quite read the circled output values.

BUT I am going to go out on a limb and guess that you created your BMI values of 1 to 4 with code like

 

if BMI le 24.9 then BMI=1;

else if 25 le BMI le 29.9 then BMI=2;

and so on. That leaves gaps: a value like 24.92 is in neither the interval for BMI class 1 nor the one for class 2. So you get odd values outside of the explicit ranges that you coded for.
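To make the gap concrete, here is a minimal sketch. It assumes the recode looks like my guess above (no final ELSE), and the data set name, variable names, and test values are made up for illustration:

```sas
/* Hypothetical test values; not taken from the poster's data */
data bmi_gap_demo;
   input bmi_raw;
   bmi_class = bmi_raw;                 /* mimic an in-place recode: start from the raw value */
   if bmi_raw le 24.9 then bmi_class = 1;
   else if 25 le bmi_raw le 29.9 then bmi_class = 2;
   /* no final ELSE: anything between 24.9 and 25 keeps its raw value */
   datalines;
22.0
24.92
27.5
;
run;

proc print data=bmi_gap_demo;
run;
```

The 24.92 row matches neither condition, so bmi_class stays 24.92 and would show up as its own column in a CLASS variable.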

Having worked with BMI more than once: depending on your source, you may be given values with up to 4 decimal places, or, if BMI is calculated directly from height and weight, many more decimals. So either round your values to fit your code or use a proper format that covers the decimal values (my preference, as the format then handles any "rounding" needed regardless of the precision of the BMI values provided).
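If you go the rounding route, a sketch could look like the following. The data set and variable names are hypothetical, and the cutoffs simply repeat the one-decimal ranges discussed in this thread:

```sas
/* Hypothetical values for illustration only */
data bmi_round_demo;
   input bmi_raw;
   bmi_rounded = round(bmi_raw, 0.1);   /* round to the nearest tenth first */
   if 18.5 le bmi_rounded le 24.9 then bmi_class = 1;
   else if 25 le bmi_rounded le 29.9 then bmi_class = 2;
   else if 30 le bmi_rounded le 34.9 then bmi_class = 3;
   else if bmi_rounded ge 35 then bmi_class = 4;
   else bmi_class = .;                  /* underweight or missing BMI */
   datalines;
24.92
29.96
17.3
;
run;

proc print data=bmi_round_demo;
run;
```

After rounding to one decimal, every value should land in one of the coded ranges or in the final ELSE, so nothing falls through.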

 

Instead of creating a category variable like that, and possibly having to chase down ever smaller differences in decimal values, apply a format with proper ranges to the actual BMI values.

Proc format;
/* <- excludes the lower endpoint of each range */
Value BMI 
18.5 <- 24.9= 'Normal Weight, BMI 18.5-24.9'
24.9 <- 29.9= 'Overweight, BMI 25-29.9'
29.9 <- 34.9= 'Obese, BMI 30-34.9'
34.9 <- high= 'Morbidly Obese, BMI ge 35'
0    <- 18.5= 'Underweight BMI<18.5';
Run; 

Yes, BMI has categories for underweight and you should consider them.

The < on either side of the - in a value list makes that end of the interval open (exclusive). So 18.5<- means "any value greater than, but not exactly equal to, 18.5" falls in this interval.
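A quick way to check how boundary values are classified is to push a few through the format with PUT. This is only a sketch: the format name bmicat and the shortened labels are hypothetical, chosen so the example does not overwrite a format named BMI.

```sas
proc format;
   value bmicat                         /* hypothetical name and labels */
      0    <- 18.5 = 'Underweight'
      18.5 <- 24.9 = 'Normal Weight'
      24.9 <- 29.9 = 'Overweight'
      29.9 <- 34.9 = 'Obese'
      34.9 <- high = 'Morbidly Obese';
run;

data _null_;
   /* test values on and just past the boundaries */
   do bmi = 18.5, 18.51, 24.9, 24.92, 35;
      category = put(bmi, bmicat.);
      put bmi= category=;               /* writes each pair to the log */
   end;
run;
```

With these ranges, 18.5 is formatted as Underweight and 18.51 as Normal Weight, while 24.92 falls under Overweight instead of dropping into a gap.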


5 REPLIES
andreas_lds
Jade | Level 19

I don't read screenshots, so please provide code and log as text using the "insert sas code" and "insert code" buttons. To be able to help, we will most likely need an excerpt of the data you are using, posted as a data step using datalines.

 

And you will want to change the title of the topic to actually provide information about the problem you have.

u49589061
Fluorite | Level 6
DM 'LOG;CLEAR;OUTPUT;CLEAR;'; 
LIBNAME Final '/home/u49589061/MPBH 423/Data'; 

Proc format;
Value eversmk100cigs 1= '<100 total';
Value BMI 
	1= 'Normal Weight, BMI 18.5-24.9'
	2= 'Overweight, BMI 25-29.9'
	3= 'Obese, BMI 30-34.9'
	4= 'Morbidly Obese, BM ge 35';
Run; 

Proc sort data=final.projectdata2021;
By seqn;
Run;


Data final.p1 (Keep= bmi mortstat ageatinterview gender maritalstatus education smokenow
eversmk100cigs hichol_hx diabetes_hx liver_hx thyroid_hx bpxsar);
Set final.projectdata2021;
By seqn;
Run;

 
Data final.p2; 
set final.p1;
Rename mortstat=TenYearMortality;
Rename ageatinterview=Age;
Rename gender=male;
Rename maritalstatus=Married;
Rename education=Education_Status;
Rename smokenow=CurrentSmoker;
Rename eversmk100cigs=Smoking_History;
Rename hichol_hx=Hypercholesterolemia;
Rename diabetes_hx=Diabetes_mellitus;
Rename liver_hx=Liver_disease;
Rename thyroid_hx=Thyroid_Problem;
Rename Bpxsar=HighSystolic_BloodPressure;

If maritalstatus=1 then maritalstatus=1;
Else if maritalstatus in (2,3,4,5,6) then maritalstatus=.;

If education in (4,5) then education=1;
Else if education in (1,2,3,7,9) then education=.;

If smokenow in (1,2) then smokenow=1;
Else if smokenow=3 then smokenow=.;

If bpxsar>=140;

If BMI=<18.5 then BMI=.;
Else if BMI=>18.5 and BMI=<24.9 then BMI=1;
Else if BMI=>25 and BMI=<29.9 then BMI=2;
Else if BMI=>30 and BMI=<34.9 then BMI=3;
Else if BMI=>35 then BMI=4;
Else BMI=.;

Label 
	mortstat="Ten Year Morality"
	Ageatinterview="Age"
	Gender="Male"
	Maritalstatus="Married"
	Education="Greater than High School Education"
	Smokenow="Current Smoker"
	eversmk100cigs="Smoked >100 cigarettes"
	hichol_hx="Hypercholesterolemia"
	diabetes_hx="Diabetes mellitus"
	liver_hx="Liver disease"
	thyroid_hx="Thyroid Problem"
	Bpxsar="High Systolic Blood Pressure (>140)";

Format eversmk100cigs eversmk100cigs. bmi bmi.;
Run;

Proc contents data=final.p2;
Run;

Proc print data=final.p2 (firstobs=1 obs=50);
Run;



Proc tabulate data=final.p2;
	Class bmi;
	Var age male married education_status CurrentSmoker smoking_history hypercholesterolemia diabetes_mellitus liver_disease
	Thyroid_problem HighSystolic_BloodPressure;
	Table (age male married education_status CurrentSmoker smoking_history hypercholesterolemia diabetes_mellitus liver_disease
	Thyroid_problem HighSystolic_BloodPressure) * (N Mean std colpctn rowpctn), bmi;
	Format bmi bmi.;
	Run;
	
OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         DM 'LOG;CLEAR;OUTPUT;CLEAR;';
 74         LIBNAME Final '/home/u49589061/MPBH 423/Data';
 NOTE: Libref FINAL refers to the same physical library as _TEMP0.
 NOTE: Libref FINAL was successfully assigned as follows: 
       Engine:        V9 
       Physical Name: /home/u49589061/MPBH 423/Data
 75         
 76         Proc format;
 77         Value eversmk100cigs 1= '<100 total';
 NOTE: Format EVERSMK100CIGS is already on the library WORK.FORMATS.
 NOTE: Format EVERSMK100CIGS has been output.
 78         Value BMI
 79         1= 'Normal Weight, BMI 18.5-24.9'
 80         2= 'Overweight, BMI 25-29.9'
 81         3= 'Obese, BMI 30-34.9'
 82         4= 'Morbidly Obese, BM ge 35';
 NOTE: Format BMI is already on the library WORK.FORMATS.
 NOTE: Format BMI has been output.
 83         Run;
 
 NOTE: PROCEDURE FORMAT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              243.65k
       OS Memory           36516.00k
       Timestamp           04/15/2021 05:22:22 AM
       Step Count                        185  Switch Count  0
       Page Faults                       0
       Page Reclaims                     26
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           32
       
 
 84         
 85         Proc sort data=final.projectdata2021;
 NOTE: Data file FINAL.PROJECTDATA2021.DATA is in a format that is native to another host, or the file encoding does not match the 
       session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce 
       performance.
 86         By seqn;
 87         Run;
 
 NOTE: Input data set is already sorted, no sorting done.
 NOTE: PROCEDURE SORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              1300.75k
       OS Memory           37544.00k
       Timestamp           04/15/2021 05:22:22 AM
       Step Count                        186  Switch Count  0
       Page Faults                       0
       Page Reclaims                     254
       Page Swaps                        0
       Voluntary Context Switches        6
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 
 88         
 89         
 90         Data final.p1 (Keep= bmi mortstat ageatinterview gender maritalstatus education smokenow
 91         eversmk100cigs hichol_hx diabetes_hx liver_hx thyroid_hx bpxsar);
 92         Set final.projectdata2021;
 NOTE: Data file FINAL.PROJECTDATA2021.DATA is in a format that is native to another host, or the file encoding does not match the 
       session encoding. Cross Environment Data Access will be used, which might require additional CPU resources and might reduce 
       performance.
 93         By seqn;
 94         Run;
 
 NOTE: There were 55499 observations read from the data set FINAL.PROJECTDATA2021.
 NOTE: The data set FINAL.P1 has 55499 observations and 13 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.18 seconds
       user cpu time       0.15 seconds
       system cpu time     0.01 seconds
       memory              3137.56k
       OS Memory           38828.00k
       Timestamp           04/15/2021 05:22:23 AM
       Step Count                        187  Switch Count  6
       Page Faults                       0
       Page Reclaims                     375
       Page Swaps                        0
       Voluntary Context Switches        150
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           11536
       
 
 95         
 96         
 97         Data final.p2;
 98         set final.p1;
 99         Rename mortstat=TenYearMortality;
 100        Rename ageatinterview=Age;
 101        Rename gender=male;
 102        Rename maritalstatus=Married;
 103        Rename education=Education_Status;
 104        Rename smokenow=CurrentSmoker;
 105        Rename eversmk100cigs=Smoking_History;
 106        Rename hichol_hx=Hypercholesterolemia;
 107        Rename diabetes_hx=Diabetes_mellitus;
 108        Rename liver_hx=Liver_disease;
 109        Rename thyroid_hx=Thyroid_Problem;
 110        Rename Bpxsar=HighSystolic_BloodPressure;
 111        
 112        If maritalstatus=1 then maritalstatus=1;
 113        Else if maritalstatus in (2,3,4,5,6) then maritalstatus=.;
 114        
 115        If education in (4,5) then education=1;
 116        Else if education in (1,2,3,7,9) then education=.;
 117        
 118        If smokenow in (1,2) then smokenow=1;
 119        Else if smokenow=3 then smokenow=.;
 120        
 121        If bpxsar>=140;
 122        
 123        If BMI=<18.5 then BMI=.;
 124        Else if BMI=>18.5 and BMI=<24.9 then BMI=1;
 125        Else if BMI=>25 and BMI=<29.9 then BMI=2;
 126        Else if BMI=>30 and BMI=<34.9 then BMI=3;
 127        Else if BMI=>35 then BMI=4;
 128        Else BMI=.;
 129        
 130        Label
 131        mortstat="Ten Year Morality"
 132        Ageatinterview="Age"
 133        Gender="Male"
 134        Maritalstatus="Married"
 135        Education="Greater than High School Education"
 136        Smokenow="Current Smoker"
 137        eversmk100cigs="Smoked >100 cigarettes"
 138        hichol_hx="Hypercholesterolemia"
 139        diabetes_hx="Diabetes mellitus"
 140        liver_hx="Liver disease"
 141        thyroid_hx="Thyroid Problem"
 142        Bpxsar="High Systolic Blood Pressure (>140)";
 143        
 144        Format eversmk100cigs eversmk100cigs. bmi bmi.;
 145        Run;
 
 NOTE: There were 55499 observations read from the data set FINAL.P1.
 NOTE: The data set FINAL.P2 has 9854 observations and 13 variables.
 NOTE: DATA statement used (Total process time):
       real time           0.04 seconds
       user cpu time       0.01 seconds
       system cpu time     0.01 seconds
       memory              3311.71k
       OS Memory           39852.00k
       Timestamp           04/15/2021 05:22:23 AM
       Step Count                        188  Switch Count  3
       Page Faults                       0
       Page Reclaims                     468
       Page Swaps                        0
       Voluntary Context Switches        100
       Involuntary Context Switches      0
       Block Input Operations            11552
       Block Output Operations           2056
       
 
 146        
 147        Proc contents data=final.p2;
 148        Run;
 
 NOTE: PROCEDURE CONTENTS used (Total process time):
       real time           0.08 seconds
       user cpu time       0.08 seconds
       system cpu time     0.01 seconds
       memory              4090.12k
       OS Memory           38828.00k
       Timestamp           04/15/2021 05:22:23 AM
       Step Count                        189  Switch Count  0
       Page Faults                       0
       Page Reclaims                     356
       Page Swaps                        0
       Voluntary Context Switches        9
       Involuntary Context Switches      0
       Block Input Operations            288
       Block Output Operations           24
       
 
 149        
 150        Proc print data=final.p2 (firstobs=1 obs=50);
 151        Run;
 
 NOTE: There were 50 observations read from the data set FINAL.P2.
 NOTE: PROCEDURE PRINT used (Total process time):
       real time           0.19 seconds
       user cpu time       0.19 seconds
       system cpu time     0.00 seconds
       memory              3249.59k
       OS Memory           38312.00k
       Timestamp           04/15/2021 05:22:23 AM
       Step Count                        190  Switch Count  0
       Page Faults                       0
       Page Reclaims                     280
       Page Swaps                        0
       Voluntary Context Switches        4
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           72
       
 
 152        
 153        
 154        
 155        Proc tabulate data=final.p2;
 156        Class bmi;
 157        Var age male married education_status CurrentSmoker smoking_history hypercholesterolemia diabetes_mellitus liver_disease
 158        Thyroid_problem HighSystolic_BloodPressure;
 159        Table (age male married education_status CurrentSmoker smoking_history hypercholesterolemia diabetes_mellitus
 159      ! liver_disease
 160        Thyroid_problem HighSystolic_BloodPressure) * (N Mean std colpctn rowpctn), bmi;
 161        Format bmi bmi.;
 162        Run;
 
 NOTE: There were 9854 observations read from the data set FINAL.P2.
 NOTE: PROCEDURE TABULATE used (Total process time):
       real time           0.09 seconds
       user cpu time       0.07 seconds
       system cpu time     0.01 seconds
       memory              9368.90k
       OS Memory           45780.00k
       Timestamp           04/15/2021 05:22:23 AM
       Step Count                        191  Switch Count  13
       Page Faults                       0
       Page Reclaims                     2128
       Page Swaps                        0
       Voluntary Context Switches        105
       Involuntary Context Switches      1
       Block Input Operations            1792
       Block Output Operations           1680
       
 
 163        
 164        
 165        
 166        OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 178        

So I was able to run my code without any problems, but I need help grouping my variables into general categories and putting a descriptive title on these categories using proc tabulate. Specifically, grouping the variables "Age, Male, Married, and Greater than high school education" together so that they are all under the Demographics group. Then I want to make a Risk factors and medical hx group that contains "current smoker, smoke >100 cigarettes, hypercholesterolemia, diabetes mellitus, liver disease, and thyroid problem." Then lastly, I want to make a Clinical presentation group that only contains "High systolic blood pressure (>140)" in it. 

ChrisNZ
Tourmaline | Level 20

You have a data quality problem, with rogue values for variable BMI.

 

You really should clean that. A poorer alternative is to add a where statement such as

where BMI < 5;

in your proc tabulate.
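As a rough sketch of where that statement goes (the demo data set and its values are made up, and only one analysis variable is shown):

```sas
/* Hypothetical data: BMI already recoded to 1-4, plus one rogue value */
data tab_demo;
   input bmi age;
   datalines;
1 54
2 61
24.92 47
4 70
;
run;

proc tabulate data=tab_demo;
   where bmi < 5;                       /* drops the rogue 24.92 row before tabulating */
   class bmi;
   var age;
   table age*(n mean), bmi;
run;
```

This only hides the stray values from the table; they are still in the data set.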

 

