Solved: percentages for a list of variables

gzr2mz39 · Posted 08-17-2011 02:48 PM

I have variables for 58 questions (q1,q2,...,q58).
Each variable has values 1,2,3.
I would like to create a dataset with 4 variables (and 58 rows).
The first variable, q, should contain a list of q1,...,q58.
The second variable (x1) is the percentage of 1s for each question.
The third variable (x2) is the percentage of 2s for each question.
The fourth variable (x3) is the percentage of 3s fore each question.

The following code (after transposing the results) gets the answer for q1, but how would I do this for q1-q58 and then transpose the results so that I have 4 columns?

ods output onewayfreqs=percents(keep=q1 percent);
proc freq data=test;
tables q1;
run;
ods output close;

Thank you.

art297 · Posted 08-17-2011 04:21 PM

I wouldn't bother with trying to transpose or use any proc to do this. If I correctly understand what you want, a datastep could do everything and faster and easier than most methods. E.g.:

%let qmax=5;

data have;

input q1-q&qmax.;

cards;

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

1 1 1 1 1

2 2 2 2 2

;

data want (keep=q x:);

array _q(*) q1-q&qmax.;

array _x1(&qmax.);

array _x2(&qmax.);

array _x3(&qmax.);

set have end=last;

retain _x1: _x2: _x3:;

do i=1 to &qmax.;

_x1(i)+(_q(i) eq 1);

_x2(i)+(_q(i) eq 2);

_x3(i)+(_q(i) eq 3);

if i eq 1 then _xtotal+1;

if last then do;

q=vname(_q(i));

x1=_x1(i)/_xtotal;

x2=_x2(i)/_xtotal;

x3=_x3(i)/_xtotal;

output;

end;

run;

View solution in original post

DBailey · Posted 08-17-2011 03:45 PM

This code:

data test;

input q $ ans;

cards;

q1 1

q1 2

q1 3

q1 4

q1 1

q2 1

q2 2

q3 1

;

run;

proc sql;

create table summary as

select

q,

sum(case when ans=1 then responses else 0 end)/count(*) as Pct_1,

sum(case when ans=2 then responses else 0 end)/count(*) as Pct_2,

sum(case when ans=3 then responses else 0 end)/count(*) as Pct_3,

sum(case when ans=4 then responses else 0 end)/count(*) as Pct_4

from

(select q, ans, count(*) as Responses from work.test group by q, ans)

group by q;

select * from summary;

quit;

produces this output:

q	Pct_1	Pct_2	Pct_3	Pct_4
q1	0.5	0.25	0.25	0.25
q2	0.5	0.5	0	0
q3	1	0	0	0

gzr2mz39 · Posted 08-17-2011 04:02 PM

The dataset I'm using has one variable for each of the 58 questions (plus some other variables). For various reasons I'm not able to transpose it so that it has two variables like the dataset you created for your example.

Do you know how to find the percentages (x1, x2, x3) for each of the 58 questions and then create four columns (q, x1, x2, x3)?

DBailey · Posted 08-17-2011 04:03 PM

I don't understand what your data look like. Can you give a quick example?

gzr2mz39 · Posted 08-17-2011 04:20 PM

I have some other variables that I plan to subset the data set with (departments and divisions), but these are the variables I wish to create percentages for (x1, x2, x3).

q1 q2 q3 .... q58

1 1 2 ..... 3

2 3 1 ...... 2

1 1 3 ...... 2

The output you created is exactly what I would like my output to look like.

art297 · Posted 08-17-2011 04:21 PM

I wouldn't bother with trying to transpose or use any proc to do this. If I correctly understand what you want, a datastep could do everything and faster and easier than most methods. E.g.:

%let qmax=5;

data have;

input q1-q&qmax.;

cards;

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

1 1 1 1 1

2 2 2 2 2

;

data want (keep=q x:);

array _q(*) q1-q&qmax.;

array _x1(&qmax.);

array _x2(&qmax.);

array _x3(&qmax.);

set have end=last;

retain _x1: _x2: _x3:;

do i=1 to &qmax.;

_x1(i)+(_q(i) eq 1);

_x2(i)+(_q(i) eq 2);

_x3(i)+(_q(i) eq 3);

if i eq 1 then _xtotal+1;

if last then do;

q=vname(_q(i));

x1=_x1(i)/_xtotal;

x2=_x2(i)/_xtotal;

x3=_x3(i)/_xtotal;

output;

end;

run;

DBailey · Posted 08-17-2011 04:35 PM

Thanks pretty neat..expanding on that for this specifics:

This code:

data test;

input

q1 q2 q3 q4 q5 q6 q7 q8 q9 q10

q11 q12 q13 q14 q15 q16 q17 q18 q19 q20

q21 q22 q23 q24 q25 q26 q27 q28 q29 q30

q31 q32 q33 q34 q35 q36 q37 q38 q39 q40

q41 q42 q43 q44 q45 q46 q47 q48;

cards;

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

;

run;

data want (keep=q x:);

array _q(*) q1-q48;

array _x1(48);

array _x2(48);

array _x3(48);

array _x4(48);

set test end=last;

retain _x1: _x2: _x3: _x4: ;

do i=1 to 48;

_x1(i)+(_q(i) eq 1);

_x2(i)+(_q(i) eq 2);

_x3(i)+(_q(i) eq 3);

_x4(i)+(_q(i) eq 4);

if i eq 1 then _xtotal+1;

if last then do;

q=vname(_q(i));

x1=_x1(i)/_xtotal;

x2=_x2(i)/_xtotal;

x3=_x3(i)/_xtotal;

x4=_x4(i)/_xtotal;

output;

end;

run;

produces this output:

q1 0.4 0.2 0.2 0.2

q2 0.2 0.4 0.2 0.2

q3 0.2 0.2 0.4 0.2

q4 0.2 0.2 0.2 0.4

q5 0.4 0.2 0.2 0.2

q6 0.2 0.4 0.2 0.2

q7 0.2 0.2 0.4 0.2

q8 0.2 0.2 0.2 0.4

q9 0.4 0.2 0.2 0.2

q10 0.2 0.4 0.2 0.2

q11 0.2 0.2 0.4 0.2

q12 0.2 0.2 0.2 0.4

q13 0.4 0.2 0.2 0.2

q14 0.2 0.4 0.2 0.2

q15 0.2 0.2 0.4 0.2

q16 0.2 0.2 0.2 0.4

q17 0.4 0.2 0.2 0.2

q18 0.2 0.4 0.2 0.2

q19 0.2 0.2 0.4 0.2

q20 0.2 0.2 0.2 0.4

q21 0.4 0.2 0.2 0.2

q22 0.2 0.4 0.2 0.2

q23 0.2 0.2 0.4 0.2

q24 0.2 0.2 0.2 0.4

q25 0.4 0.2 0.2 0.2

q26 0.2 0.4 0.2 0.2

q27 0.2 0.2 0.4 0.2

q28 0.2 0.2 0.2 0.4

q29 0.4 0.2 0.2 0.2

q30 0.2 0.4 0.2 0.2

q31 0.2 0.2 0.4 0.2

q32 0.2 0.2 0.2 0.4

q33 0.4 0.2 0.2 0.2

q34 0.2 0.4 0.2 0.2

q35 0.2 0.2 0.4 0.2

q36 0.2 0.2 0.2 0.4

q37 0.4 0.2 0.2 0.2

q38 0.2 0.4 0.2 0.2

q39 0.2 0.2 0.4 0.2

q40 0.2 0.2 0.2 0.4

q41 0.4 0.2 0.2 0.2

q42 0.2 0.4 0.2 0.2

q43 0.2 0.2 0.4 0.2

q44 0.2 0.2 0.2 0.4

q45 0.4 0.2 0.2 0.2

q46 0.2 0.4 0.2 0.2

q47 0.2 0.2 0.4 0.2

q48 0.2 0.2 0.2 0.4

data_null__ · Posted 08-17-2011 08:09 PM

I like the simplicity and flexibility of PROC/DATA/ PROC.

data test;

input (q1-q48)(f1.);

format q: f1.;

cards;

123412341234123412341234123412341234123412341234

222222222222222222222222222222222222222222222222

333333333333333333333333333333333333333333333333

444444444444444444444444444444444444444444444444

111111111111111111111111111111111111111111111111

;;;;

run;

ods output onewayfreqs=freq;

ods listing off;

proc freq;

tables q:;

run;

ods listing;

data freq;

set freq;

length vname $32;

vname = scan(table,-1);

resp = coalesce(of q:);

format resp f1.;

keep vname resp Percent;

run;

proc transpose prefix=R out=_data_(drop=_:);

by vname notsorted;

var percent;

id resp;

idlabel resp;

run;

proc print label;

run;

FriedEgg · Posted 08-17-2011 04:53 PM

data temp;

input id q1 q2 q3;

cards;

1 1 1 1

2 2 2 2

3 3 3 3

4 4 4 4

5 1 2 3

6 2 3 4

;

run;

data _null_;

do i=1 to 3;

call execute(

'proc freq data=temp;' ||

' tables q' || strip(i) || '/out=percent' || strip(i) || '(keep=percent) noprint;' ||

'run;'

);

call execute(

'proc transpose data=percent' || strip(i) || '(rename=(percent=q' || strip(i) || '))' ||

' out=ptran' || strip(i) || '(drop=_label_ rename=(_name_=question)) prefix=pct;' ||

'run;'

);

end;

run;

data percent;

set ptran:;

run;

could also be done in a macro, but meh...

EDIT: Just had an additional thought to the above. You could significantly reduce the I/O requirements of the frequency step by replace the multiple calls of the frequency procedure by instead creating multiple tables statements instead... Gains would become noticeable with increasing quantities/variables.

Ksharp · Posted 08-18-2011 02:42 AM

Almost like _null_'s code:

data test;
input  q1-q48;
cards;
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
;
run;
ods output onewayfreqs=freq;
proc freq data=test ;
  tables q: /nofreq nocum nocol ;
run;
data temp(keep=tname level percent);
 set freq(drop=f_: );
 tname=scan(table,2);
 level=coalesce(of q:);
run;
proc sort data=temp;by tname;run;
proc transpose data=temp out=want(drop=_name_) prefix=x ;
 by tname;
 id level;
 var percent;
run;
data want;
 set want;
 flag=input(scan(tname,1,,'kd'),best8.);
run;
proc sort data=want;by flag;run;

Ksharp

percentages for a list of variables

percentages for a list of variables

Re: percentages for a list of variables

Re: percentages for a list of variables

percentages for a list of variables

Re: percentages for a list of variables

percentages for a list of variables

percentages for a list of variables

Re: percentages for a list of variables

Re: percentages for a list of variables

Re: percentages for a list of variables

SAS Innovate 2025: Call for Content

Classroom Training Available!