Hello, I want to calculate the percentage of ok/ko of each variable and consolidate all the results into a single table such as the table "WANT"
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
data want;
input varname $ prct_ok prct_ko;
CARDS;
var1 75 25
var2 100 0
var3 50 50
;
run;
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
*recode to 0/1;
data have_view/ view=have_view;
set have;
array _orig(*) $ var1-var3;
array _new(*) new_var1-new_var3;
do i=1 to dim(_orig);
if _orig(i)='ok' then
_new(i)=1;
else
_new(i)=0;
end;
run;
*summarize;
ods select none;
proc means data=have_view N MEAN STACKODS;
var new_var1-new_var3;
ods output summary=summarized;
run;
ods select all;
*clean up for display/formatting;
data want;
set summarized;
PCT_KO=1-mean;
PCT_OK=mean;
format PCT_KO PCT_OK percent11.1;
keep Variable PCT_KO PCT_OK;
run;
proc print data=want;
run;
1. Convert OK/KO to 0/1
2. Summarize using PROC MEANS
3. Format for display
If you had another variable that uniquely identified each row, a PROC TRANSPOSE could be used which would make this slightly simpler but this works as well.
Do you want a data set like this? Or is this for reporting purposes?
Hi @mazouz I am not sure either whether you want a "Report" or a "Dataset" as your WANT.
For what it's worth, It's fun stuff-
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
data _null_ ;
if _n_=1 then do;
dcl hash H (ordered: "A") ;
h.definekey ("varname","value") ;
h.definedata ("varname","value","count","pct") ;
h.definedone () ;
end;
set have nobs=nobs end=z;
array t var1-var3;
do over t;
varname=upcase(vname(t));
value=t;
if h.find()=0 then count=sum(count,1);
else count=1;
pct=divide(count,nobs);
h.replace();
end;
if z;
h.output(dataset:'temp(drop=count)');
run;
proc freq data=temp;
tables varname*value/sparse out=temp2(drop=percent) noprint;
weight pct;
run;
proc transpose data=temp2 out=final_want(drop=_:) prefix=Pct_;
by varname;
id value;
var count;
run;
I want data set not report
Okay Thanks @mazouz Please try the above and see if that works
data have;
input var1 $ var2 $ VAR3 $;
CARDS;
ok ok ko
ko ok ko
ok ok ok
ok ok ok
ok ok ko
ko ok ok
ok ok ok
ok ok ko
;
run;
*recode to 0/1;
data have_view/ view=have_view;
set have;
array _orig(*) $ var1-var3;
array _new(*) new_var1-new_var3;
do i=1 to dim(_orig);
if _orig(i)='ok' then
_new(i)=1;
else
_new(i)=0;
end;
run;
*summarize;
ods select none;
proc means data=have_view N MEAN STACKODS;
var new_var1-new_var3;
ods output summary=summarized;
run;
ods select all;
*clean up for display/formatting;
data want;
set summarized;
PCT_KO=1-mean;
PCT_OK=mean;
format PCT_KO PCT_OK percent11.1;
keep Variable PCT_KO PCT_OK;
run;
proc print data=want;
run;
1. Convert OK/KO to 0/1
2. Summarize using PROC MEANS
3. Format for display
If you had another variable that uniquely identified each row, a PROC TRANSPOSE could be used which would make this slightly simpler but this works as well.
I want to add NOPRINT, I try to add it in proc mean but don't work
I want only data set no report
You can TRANSPOSE, compute and TRANSPOSE
Example:
data have; input var1 $ var2 $ VAR3 $; CARDS; ok ok ko ko ok ko ok ok ok ok ok ok ok ok ko ko ok ok ok ok ok ok ok ko zz ok aa ; run; data haver/view=haver; rownum+1; set have; run; proc transpose data=haver out=stage1; by rownum; var var1-var3; run; proc sql; create table percents as select _name_ label=' ', col1, N/SUM(N)*100 as percent format=5.2 from ( select _name_, COL1, count(*) as N from stage1 group by _name_, COL1 ) as ss group by _name_ ; quit; proc transpose data=percents out=want suffix=_pct; by _name_; id COL1; var percent; run;
data want(keep=varname prct_ok prct_ko);
if _N_ = 1 then do;
dcl hash h();
h.definekey("_I_");
h.definedata("ok", "ko");
h.definedone();
end;
set have end=z;
array v $ var1-var3;
do over v;
rc = h.find();
ok = ifn(v = 'ok', sum(ok, 1), ok);
ko = ifn(v = 'ko', sum(ko, 1), ko);
h.replace();
end;
if z then do over v;
rc = h.find();
varname = vname(v);
prct_ok = max(0, divide(ok, sum(ok, ko)));
prct_ko = max(0, divide(ko, sum(ok, ko)));
output;
end;
run;
Thank you! all solution works but the fast one is the solution using proc means
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.