Hello programmers,
I want to run PROC UNIVARIATE by tc tn so that I can get N, Mean and Std for BMI and the dichotomized heart variables (heart1-heart6).
I want to present continuous variables like BMI as N, Mean, Std.
For the heart variables, I want something like a/N (%), where a is the number of yes's, N is the total number of non-missing values, and % = a/N * 100 is the percentage of yes's.
In the end I want something like this:
Study Population
a/N (%)
data me;
input id sex race agecat health bmi bmigroup heart1 heart2 heart3 heart4 heart5 heart6 death tc Tn $ value;
datalines;
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 1 Angina 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 2 Heartburn 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 3 Sleepiness 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 4 Exercise 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 5 Palpitation 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 6 Any 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 1 Angina 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 2 Heartburn 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 3 Sleepiness 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 4 Exercise 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 5 Palpitation 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 6 Any 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 1 Angina 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 2 Heartburn 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 3 Sleepiness 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 4 Exercise 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 5 Palpitation 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 6 Any 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 1 Angina 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 2 Heartburn 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 3 Sleepiness 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 4 Exercise 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 5 Palpitation 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 6 Any 1
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 1 Angina 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 2 Heartburn 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 3 Sleepiness 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 4 Exercise 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 5 Palpitation 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 6 Any 0
; run;
proc print; run;
/*format and stringing out*/
Proc format;
value testcodefm
1= 'Angina'
2= 'Heartburn'
3= 'Sleepiness'
4= 'Exercise'
5= 'Palpitation'
6= 'Any'
7= 'BMI'
;
run;
data You;
set me;
array him heart1-heart6 BMI;
do over him;
tc = _I_;
tn = put(tc,testcodefm.);
value = him;
output;
end;
keep;
run;
proc print data=you;
run;
proc univariate data= you; by tc tn;
var heart1 heart2 heart3 heart4 heart5 heart6 BMI
output out= three n=n mean=mean median=median std=std;
data four; set three;
newvar= (mean/N)* 100;
run;
proc print data= four; var newvar n;
run;
Hi, it's great that you posted your data in a data step; that is the best way to provide data, and we appreciate your effort.
A simple data step can produce the printed report you want.
data _null_;
file print;
set four;
pct=newvar/n;
put tc testcodefm. +3 newvar '/' n '(' pct percentn7.1 ')';
run;
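Because the heart variables are coded 0/1, the number of yes's is simply the sum of the values, and the mean is the proportion of yes's. A minimal sketch of that idea, assuming the long data set you and the format testcodefm from the question; the names counts, count and pct are made up for illustration:

```sas
/* Sketch only: summarise the stacked 0/1 values by test code.        */
/* CLASS avoids the need to sort first; NWAY keeps only the by-tc rows. */
proc means data=you noprint nway;
    class tc;
    var value;
    output out=counts n=n sum=count mean=pct;   /* count = number of 1s */
run;

/* Print a/N (%) for the dichotomous variables only (tc 1 to 6). */
data _null_;
    file print;
    set counts;
    where tc <= 6;
    put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
run;
```

Note that the data set me already holds six rows per id, so each subject's values are repeated six times in you. N will therefore be 30 rather than 5 per test code, although the percentages are unaffected. Keeping one row per id before the array step would give the true N.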
data me;
input id sex race agecat health bmi bmigroup heart1 heart2 heart3 heart4 heart5 heart6 death tc Tn $ value;
datalines;
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 1 Angina 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 2 Heartburn 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 3 Sleepiness 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 4 Exercise 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 5 Palpitation 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 6 Any 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 1 Angina 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 2 Heartburn 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 3 Sleepiness 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 4 Exercise 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 5 Palpitation 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 6 Any 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 1 Angina 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 2 Heartburn 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 3 Sleepiness 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 4 Exercise 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 5 Palpitation 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 6 Any 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 1 Angina 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 2 Heartburn 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 3 Sleepiness 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 4 Exercise 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 5 Palpitation 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 6 Any 1
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 1 Angina 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 2 Heartburn 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 3 Sleepiness 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 4 Exercise 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 5 Palpitation 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 6 Any 0
; run;
proc print; run;
/*format and stringing out*/
Proc format;
value testcodefm
1= 'Angina'
2= 'Heartburn'
3= 'Sleepiness'
4= 'Exercise'
5= 'Palpitation'
6= 'Any'
7= 'BMI'
;
run;
data You;
set me;
array him heart1-heart6 BMI;
do over him;
tc = _I_;
tn = put(tc,testcodefm.);
value = him;
output;
end;
run;
proc sort data=you; by tc tn;
run;
proc means data= you stackods n mean median std; by tc tn;
var heart1 heart2 heart3 heart4 heart5 heart6 BMI;
ods output summary = three;
run;
data four; set three;
newvar= (mean/N)* 100;
run;
proc print data=four;
format _numeric_ 8.3;
run;
Combine this with the answer from @PaigeMiller to get your formatted results. This works for me.
It worked for all the dichotomous heart variables, but not for BMI.
I was hoping that BMI would appear as (mean + Std), but it appeared as a percentage instead.
Is there a way I could rewrite the BMI row to be in the (mean + std) format?
/*format and stringing out*/
Proc format;
value testcodefm
1= 'Angina'
2= 'Heartburn'
3= 'Sleepiness'
4= 'Exercise'
5= 'Palpitation'
6= 'Any'
7= 'BMI'
;
run;
data You;
set me;
array him heart1-heart6 BMI;
do over him;
tc = _I_;
tn = put(tc,testcodefm.);
value = him;
output;
end;
run;
proc print data=you;
run;
Proc sort; by tc tn;
Proc univariate ; by tc tn;
Var value;
Output out =three n=n mean=p std =std; run;
Proc print; var tc tn n p std;
Run;
data four; set three;
Count= N*P;
run;
proc print; run;
data five; set four;
percent= (Count/N)*100;
run;
proc print; run;
data _null_;
file print;
set five;
pct=count/n;
put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
run;
@ChuksManuel wrote:
It worked for all the dichotomous heart variables, but not for BMI.
I was hoping that BMI would appear as (mean + Std), but it appeared as a percentage instead.
Is there a way I could rewrite the BMI row to be in the (mean + std) format?
Sure. I don't have a lot of time at the moment, but it would look something like this (where you can fill in the proper variable names and formats):
data _null_;
file print;
set five;
pct=count/n;
if tc<=6 then put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
else if tc=7 then put tc testcodefm. +3 meanvariable '+' stdvariable ;
run;
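With the placeholders filled in from the code posted earlier in the thread, it might look like the sketch below. It assumes the data set five built above, where PROC UNIVARIATE stored the mean in p and the standard deviation in std; the 8.2 format and the '+/-' separator are arbitrary choices, not part of the original suggestion.

```sas
/* Sketch only: a/N (%) for tc 1 to 6, mean +/- std for BMI (tc = 7). */
data _null_;
    file print;
    set five;
    pct = count / n;
    if tc <= 6 then put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
    else if tc = 7 then put tc testcodefm. +3 p 8.2 ' +/- ' std 8.2;
run;
```

For the BMI row, count and pct are still computed but are not meaningful, so only p and std are printed.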
Thanks Paige!
Hello PaigeMiller,
I posted a similar problem in the programming section.
Could you please take a look at it? I am trying to write the code based on this example.
Sincerely,
CHucks