I start out with a dataset containg the following variables
HPV1 HPV2 HPV3 HPV4
1 2 1 2
1 2 1 2
2 1 1 1
2 1 1 2
What I want to create is a dataset that looks as follows
HPV Percent 1 Percent 2 total N
HPV1 50 50 4
HPV2 50 50 4
HPV3 100 0 4
HPV4 25 75 4
I am able to get the percentages I want using proc freq, but I cannot figure out how to create a dataset that contains the data the way I need it organized. I typically output all the individual proc freq for HPV1, HPV2, HPV3 and HPV4 then manually cut and paste and format the data the way I need, but I figured there was probably a way to do this all in SAS. Thank you for any help you can provide.
HyunJee
data have;
input hPV1-HPV4;
cards;
1 2 1 2
1 2 1 2
2 1 1 1
2 1 1 2
;
run;
data _null_ ;
if _n_=1 then set have nobs=nobs;
call symputx('n',nobs);
stop;
run;
%put &n;
%macro test;
%do i=1 %to 4;
proc freq data = have noprint;
tables hpv&i/ out = out;
run;
proc transpose data=out out=out&i (drop=_label_ _name_);
var percent;
id hpv&i;
run;
data out&i;
set out&i;
hpv="hpv&i";
%end;
data want(rename=(_1=percent1 _2=percent2));
set %do i=1 %to 4;
out&i %end;;
if _1=. then _1=0;
if _2=. then _2=0;
total=&n;
%mend;
%test
options nocenter;
proc print data=want ;run;
Obs percent1 percent2 hpv total
1 50 50 hpv1 4
2 50 50 hpv2 4
3 100 0 hpv3 4
4 25 75 hpv4 4
Linlin
data have;
input hPV1-HPV4;
cards;
1 2 1 2
1 2 1 2
2 1 1 1
2 1 1 2
;
run;
data _null_ ;
if _n_=1 then set have nobs=nobs;
call symputx('n',nobs);
stop;
run;
%put &n;
%macro test;
%do i=1 %to 4;
proc freq data = have noprint;
tables hpv&i/ out = out;
run;
proc transpose data=out out=out&i (drop=_label_ _name_);
var percent;
id hpv&i;
run;
data out&i;
set out&i;
hpv="hpv&i";
%end;
data want(rename=(_1=percent1 _2=percent2));
set %do i=1 %to 4;
out&i %end;;
if _1=. then _1=0;
if _2=. then _2=0;
total=&n;
%mend;
%test
options nocenter;
proc print data=want ;run;
Obs percent1 percent2 hpv total
1 50 50 hpv1 4
2 50 50 hpv2 4
3 100 0 hpv3 4
4 25 75 hpv4 4
Linlin
Absolutely nothing wrong with learn and using macros but, in this case, I think the problem is solved a lot easier and quicker obtaining and modifying the ods output. E.g.:
data have;
input hPV1-HPV4;
cards;
1 2 1 2
1 2 1 2
2 1 1 1
2 1 1 2
;
proc freq data=have;
tables _all_;
ods output OneWayFreqs=want;
run;
data want (keep=hpv per: tot:);
set want (rename=(table=hpv percent=_percent));
by hpv notsorted;
array percent(2);
retain percent:;
variable_name = scan(hpv,-1,' ');
if first.hpv then do i=1 to 2;
percent(i)=0;
end;
percent(vvalueX(variable_name))=_percent;
if last.hpv then do;
Total_N=CumFrequency;
hpv=substr(hpv,7);
output;
end;
run;
data have; input HPV1-HPV4; cards; 1 2 1 2 1 2 1 2 2 1 1 1 2 1 1 2 ; run; ods listing close; ods output OneWayFreqs=temp; proc freq data=have; tables _all_/nocum ; run; ods listing; data x(keep=table value frequency percent); set temp; table=scan(table,2); value=coalesce(of hpv:); run; proc transpose data=x out=want1(drop=_name_) prefix=percent; by table; id value; var percent; run; proc means data=x noprint nway; class table; var frequency; output out=want2(drop=_:) sum=total; run; data want(rename=(table=hpv)); merge want1 want2; by table; run;
Ksharp
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.