ChuksManuel
Pyrite | Level 9

Hello programmers

 

I want to run PROC UNIVARIATE by tc and tn so that I can get N, Mean and Std for BMI and the dichotomized heart variables (heart1-heart6).

I want to present continuous variables like BMI as N, Mean and Std.

For the heart variables, I want to show a/N (%), where a is the number of yeses, N is the total number of non-missing values, and % = a/N * 100 is the percentage of yeses.

In the end I want something like this:

 

                    Study Population
                        a/N (%)

Angina              xxx/xxx (xx.x%)
Heartburn           xxx/xxx (xx.x%)
Sleepiness          xxx/xxx (xx.x%)
Exercise            xxx/xxx (xx.x%)
Palpitation         xxx/xxx (xx.x%)
Any                 xxx/xxx (xx.x%)
BMI                 xx.x+xx.x
 
I've attached the initial dataset and code so you can see what I want to implement. Any help, advice or direction is appreciated.
 

 

 

data me;
input id sex race agecat health bmi bmigroup heart1 heart2 heart3 heart4 heart5 heart6 death tc Tn $ value;
datalines;
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 1 Angina 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 2 Heartburn 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 3 Sleepiness 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 4 Exercise  0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 5 Palpitation 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 6 Any  1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 1 Angina 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 2 Heartburn 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 3 Sleepiness 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 4 Exercise 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 5 Palpitation 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 6 Any  1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 1 Angina 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 2 Heartburn 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 3 Sleepiness 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 4 Exercise  0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 5 Palpitation 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 6 Any 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 1 Angina 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 2 Heartburn 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 3 Sleepiness 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 4 Exercise  1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 5 Palpitation 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 6 Any  1
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 1 Angina 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 2 Heartburn 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 3 Sleepiness 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 4 Exercise  0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 5 Palpitation 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 6 Any  0
; run;
proc print; run;

/*format and stringing out*/
Proc format;
value testcodefm
  1= 'Angina'
  2= 'Heartburn'
  3= 'Sleepiness'
  4= 'Exercise'
  5= 'Palpitation'
  6= 'Any'
  7= 'BMI'
;
run;

data You;
set me;
array him heart1-heart6 BMI;
do over him;
  tc = _I_;
  tn = put(tc,testcodefm.);
  value = him;
  output;
end;
keep; 
run;

proc print data=you;
run;

proc univariate data= you; by tc tn;
var heart1 heart2 heart3 heart4 heart5 heart6 BMI
output out= three n=n mean=mean median=median std=std;
 data four; set three;
 newvar= (mean/N)* 100;
 run;
 proc print data= four; var newvar n;
 run;
ACCEPTED SOLUTION
PaigeMiller
Diamond | Level 26

@ChuksManuel wrote:

It worked for all the dichotomous heart variables except BMI.

I was hoping that BMI would appear as (mean + Std), but it appeared as a percentage instead.

Is there a way I could rewrite the BMI line to be in the (mean + std) format?

 

Sure. I don't have a lot of time at the moment, but it would look something like this (where you can fill in the proper variable names and formats):

 

data _null_;
     file print;
    set five;
    pct=count/n;
    if tc<=6 then put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
    else if tc=7 then put tc testcodefm. +3 meanvariable '+' stdvariable ;
run; 

  

--
Paige Miller


8 REPLIES
PaigeMiller
Diamond | Level 26

Hi, it's great that you posted your data in a data step, that is the best way to provide data, and we appreciate your effort.

 

A simple data step can produce the printed report you want. 

 


data _null_;
     file print;
    set four;
    pct=newvar/n;
     put tc testcodefm. +3 newvar '/' n '(' pct percentn7.1 ')';
run;

 

--
Paige Miller
Reeza
Super User
You're missing a PROC SORT before the BY-group step, and a semicolon at the end of the VAR statement in PROC UNIVARIATE. Fix the errors in the order they appear.
Reeza
Super User
data me;
input id sex race agecat health bmi bmigroup heart1 heart2 heart3 heart4 heart5 heart6 death tc Tn $ value;
datalines;
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 1 Angina 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 2 Heartburn 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 3 Sleepiness 0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 4 Exercise  0
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 5 Palpitation 1
1 0 0 5 4 22.494 2 0 1 0 0 1 1 1 6 Any  1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 1 Angina 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 2 Heartburn 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 3 Sleepiness 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 4 Exercise 1
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 5 Palpitation 0
2 1 0 5 2 20.1801 1 0 0 0 1 0 1 0 6 Any  1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 1 Angina 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 2 Heartburn 1
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 3 Sleepiness 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 4 Exercise  0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 5 Palpitation 0
3 1 0 4 2 26.3606 4 0 1 0 0 0 1 0 6 Any 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 1 Angina 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 2 Heartburn 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 3 Sleepiness 1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 4 Exercise  1
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 5 Palpitation 0
4 0 0 4 3 25.373 3 0 0 1 1 0 1 0 6 Any  1
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 1 Angina 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 2 Heartburn 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 3 Sleepiness 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 4 Exercise  0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 5 Palpitation 0
6 0 0 4 1 24.8608 3 0 0 0 0 0 0 0 6 Any  0
; run;
proc print; run;

/*format and stringing out*/
Proc format;
value testcodefm
  1= 'Angina'
  2= 'Heartburn'
  3= 'Sleepiness'
  4= 'Exercise'
  5= 'Palpitation'
  6= 'Any'
  7= 'BMI'
;
run;

data You;
set me;
array him heart1-heart6 BMI;
do over him;
  tc = _I_;
  tn = put(tc,testcodefm.);
  value = him;
  output;
end;
run;

proc sort data=you; by tc tn;
run;

proc means  data= you stackods n mean median std; by tc tn;
var heart1 heart2 heart3 heart4 heart5 heart6 BMI;
ods output summary = three;
run;

 data four; set three;
 newvar= (mean/N)* 100;
 run;
 
proc print data=four;
format _numeric_ 8.3;
run;

Combine this with the answer from @PaigeMiller  to get your formatted results. This works for me.

ChuksManuel
Pyrite | Level 9

It worked for all the dichotomous heart variables except BMI.

I was hoping that BMI would appear as (mean + Std), but it appeared as a percentage instead.

Is there a way I could rewrite the BMI line to be in the (mean + std) format?

 

/*format and stringing out*/
Proc format;
value testcodefm
  1= 'Angina'
  2= 'Heartburn'
  3= 'Sleepiness'
  4= 'Exercise'
  5= 'Palpitation'
  6= 'Any'
  7= 'BMI'
;
run;

data You;
set me;
array him heart1-heart6 BMI;
do over him;
  tc = _I_;
  tn = put(tc,testcodefm.);
  value = him;
  output;
end;
run;

proc print data=you;
run;

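proc sort data=you;
  by tc tn;
run;

/* For the 0/1 heart variables, the mean p is the proportion of yes values */
proc univariate data=you;
  by tc tn;
  var value;
  output out=three n=n mean=p std=std;
run;

proc print data=three;
  var tc tn n p std;
run;

/* Recover the number of yes values from the proportion */
data four;
  set three;
  count = n * p;
run;

proc print data=four;
run;

data five;
  set four;
  percent = (count / n) * 100;
run;

proc print data=five;
run;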
data _null_;
     file print;
    set five;
    pct=count/n;
     put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
run;
Reeza
Super User
Add an IF condition to handle that variable differently with a different put statement.
PaigeMiller
Diamond | Level 26

@ChuksManuel wrote:

It worked for all the dichotomous heart variables except BMI.

I was hoping that BMI would appear as (mean + Std), but it appeared as a percentage instead.

Is there a way I could rewrite the BMI line to be in the (mean + std) format?

 

Sure. I don't have a lot of time at the moment, but it would look something like this (where you can fill in the proper variable names and formats):

 

data _null_;
     file print;
    set five;
    pct=count/n;
    if tc<=6 then put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
    else if tc=7 then put tc testcodefm. +3 meanvariable '+' stdvariable ;
run; 
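
For reference, here is a sketch of the same step with the placeholders filled in. It assumes the dataset five built in the earlier post, where PROC UNIVARIATE stored the mean as p and the standard deviation as std, and it assumes the testcodefm. format is already defined. The 5.1 widths are only a guess at a suitable display format; adjust them to your data.

```sas
/* Assumes dataset FIVE from the earlier post: one row per tc with
   n (non-missing count), p (mean), std, and count = n*p.           */
data _null_;
    file print;
    set five;
    if tc <= 6 then do;
        pct = count / n;   /* proportion of yes values */
        put tc testcodefm. +3 count '/' n '(' pct percentn7.1 ')';
    end;
    else if tc = 7 then
        /* BMI row: mean + std; 5.1 is an assumed display width */
        put tc testcodefm. +3 p 5.1 ' + ' std 5.1;
run;
```

Because count is computed as n*p, it may be worth wrapping it in ROUND() before printing if it ever shows up with stray decimals.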

  

--
Paige Miller
ChuksManuel
Pyrite | Level 9

Thanks Paige!

ChuksManuel
Pyrite | Level 9

Hello PaigeMiller,

 

I have a similar problem that I posted in the programming section.

Could you please take a look at it? I am trying to write the code based on this example.

 

Sincerely,

CHucks
