Hello, I want to calculate the percentage of ok/ko of each variable and consolidate all the results into a single table such as the table "WANT"
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
data want;
input varname $ prct_ok prct_ko;
CARDS;
var1 75 25
var2 100 0
var3 50 50
;
run;
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
*recode to 0/1;
data have_view/ view=have_view;
set have;
array _orig(*) $ var1-var3;
array _new(*) new_var1-new_var3;
do i=1 to dim(_orig);
if _orig(i)='ok' then
_new(i)=1;
else
_new(i)=0;
end;
run;
*summarize;
ods select none;
proc means data=have_view N MEAN STACKODS;
var new_var1-new_var3;
ods output summary=summarized;
run;
ods select all;
*clean up for display/formatting;
data want;
set summarized;
PCT_KO=1-mean;
PCT_OK=mean;
format PCT_KO PCT_OK percent11.1;
keep Variable PCT_KO PCT_OK;
run;
proc print data=want;
run;
1. Convert OK/KO to 0/1
2. Summarize using PROC MEANS
3. Format for display
If you had another variable that uniquely identified each row, a PROC TRANSPOSE could be used which would make this slightly simpler but this works as well.
Do you want a data set like this? Or is this for reporting purposes?
Hi @mazouz I am not sure either whether you want a "Report" or a "Dataset" as your WANT.
For what it's worth, It's fun stuff-
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
data _null_ ;
if _n_=1 then do;
dcl hash H (ordered: "A") ;
h.definekey ("varname","value") ;
h.definedata ("varname","value","count","pct") ;
h.definedone () ;
end;
set have nobs=nobs end=z;
array t var1-var3;
do over t;
varname=upcase(vname(t));
value=t;
if h.find()=0 then count=sum(count,1);
else count=1;
pct=divide(count,nobs);
h.replace();
end;
if z;
h.output(dataset:'temp(drop=count)');
run;
proc freq data=temp;
tables varname*value/sparse out=temp2(drop=percent) noprint;
weight pct;
run;
proc transpose data=temp2 out=final_want(drop=_:) prefix=Pct_;
by varname;
id value;
var count;
run;
I want data set not report
Okay Thanks @mazouz Please try the above and see if that works
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
*recode to 0/1;
data have_view/ view=have_view;
set have;
array _orig(*) $ var1-var3;
array _new(*) new_var1-new_var3;
do i=1 to dim(_orig);
if _orig(i)='ok' then
_new(i)=1;
else
_new(i)=0;
end;
run;
*summarize;
ods select none;
proc means data=have_view N MEAN STACKODS;
var new_var1-new_var3;
ods output summary=summarized;
run;
ods select all;
*clean up for display/formatting;
data want;
set summarized;
PCT_KO=1-mean;
PCT_OK=mean;
format PCT_KO PCT_OK percent11.1;
keep Variable PCT_KO PCT_OK;
run;
proc print data=want;
run;
1. Convert OK/KO to 0/1
2. Summarize using PROC MEANS
3. Format for display
If you had another variable that uniquely identified each row, a PROC TRANSPOSE could be used which would make this slightly simpler but this works as well.
I want to add NOPRINT, I try to add it in proc mean but don't work
I want only data set no report
You can TRANSPOSE, compute and TRANSPOSE
Example:
data have; input var1 $ var2 $ VAR3 $; CARDS; ok ok ko ko ok ko ok ok ok ok ok ok ok ok ko ko ok ok ok ok ok ok ok ko zz ok aa ; run; data haver/view=haver; rownum+1; set have; run; proc transpose data=haver out=stage1; by rownum; var var1-var3; run; proc sql; create table percents as select _name_ label=' ', col1, N/SUM(N)*100 as percent format=5.2 from ( select _name_, COL1, count(*) as N from stage1 group by _name_, COL1 ) as ss group by _name_ ; quit; proc transpose data=percents out=want suffix=_pct; by _name_; id COL1; var percent; run;
data want(keep=varname prct_ok prct_ko);
if _N_ = 1 then do;
dcl hash h();
h.definekey("_I_");
h.definedata("ok", "ko");
h.definedone();
end;
set have end=z;
array v $ var1-var3;
do over v;
rc = h.find();
ok = ifn(v = 'ok', sum(ok, 1), ok);
ko = ifn(v = 'ko', sum(ko, 1), ko);
h.replace();
end;
if z then do over v;
rc = h.find();
varname = vname(v);
prct_ok = max(0, divide(ok, sum(ok, ko)));
prct_ko = max(0, divide(ko, sum(ok, ko)));
output;
end;
run;
Thank you! all solution works but the fast one is the solution using proc means
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.