DATA Step, Macro, Functions and more

percentages for a list of variables

Accepted Solution Solved
Reply
Regular Contributor
Posts: 196
Accepted Solution

percentages for a list of variables

I have variables for 58 questions (q1,q2,...,q58).
Each variable has values 1,2,3.
I would like to create a dataset with 4 variables (and 58 rows).
The first variable, q, should contain a list of q1,...,q58.
The second variable (x1) is the percentage of 1s for each question.
The third variable (x2) is the percentage of 2s for each question.
The fourth variable (x3) is the percentage of 3s fore each question.

The following code (after transposing the results) gets the answer for q1, but how would I do this for q1-q58 and then transpose the results so that I have 4 columns?

ods output onewayfreqs=percents(keep=q1 percent);
proc freq data=test;
tables q1;
run;
ods output close;

Thank you.


Accepted Solutions
Solution
‎08-17-2011 04:21 PM
PROC Star
Posts: 7,364

percentages for a list of variables

I wouldn't bother with trying to transpose or use any proc to do this.  If I correctly understand what you want, a datastep could do everything and faster and easier than most methods.  E.g.:

%let qmax=5;

data have;

  input q1-q&qmax.;

  cards;

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

1 1 1 1 1

2 2 2 2 2

2 2 2 2 2

;

data want (keep=q xSmiley Happy;

  array _q(*) q1-q&qmax.;

  array _x1(&qmax.);

  array _x2(&qmax.);

  array _x3(&qmax.);

  set have end=last;

  retain _x1: _x2: _x3:;

  do i=1 to &qmax.;

    _x1(i)+(_q(i) eq 1);

    _x2(i)+(_q(i) eq 2);

    _x3(i)+(_q(i) eq 3);

    if i eq 1 then _xtotal+1;

    if last then do;

      q=vname(_q(i));

      x1=_x1(i)/_xtotal;

      x2=_x2(i)/_xtotal;

      x3=_x3(i)/_xtotal;

      output;

    end;

  end;

run;

View solution in original post


All Replies
Super Contributor
Posts: 578

Re: percentages for a list of variables

This code:

data test;

input q $ ans;

cards;

q1 1

q1 2

q1 3

q1 4

q1 1

q2 1

q2 2

q3 1

;

run;

proc sql;

create table summary as

select

      q,

      sum(case when ans=1 then responses else 0 end)/count(*) as Pct_1,

      sum(case when ans=2 then responses else 0 end)/count(*) as Pct_2,

      sum(case when ans=3 then responses else 0 end)/count(*) as Pct_3,

      sum(case when ans=4 then responses else 0 end)/count(*) as Pct_4

from

      (select q, ans, count(*) as Responses from work.test group by q, ans)

group by q;

select * from summary;

quit;

produces this output:

qPct_1Pct_2Pct_3Pct_4
q10.50.250.250.25
q20.50.500
q31000
Regular Contributor
Posts: 196

Re: percentages for a list of variables

The dataset I'm using has one variable for each of the 58 questions (plus some other variables). For various reasons I'm not able to transpose it so that it has two variables like the dataset you created for your example.

Do you know how to find the percentages (x1, x2, x3) for each of the 58 questions and then create four columns (q, x1, x2, x3)?

Super Contributor
Posts: 578

percentages for a list of variables

I don't understand what your data look like.  Can you give a quick example?

Regular Contributor
Posts: 196

Re: percentages for a list of variables

I have some other variables that I plan to subset the data set with (departments and divisions), but these are the variables I wish to create percentages for (x1, x2, x3).

q1 q2 q3 .... q58

1  1    2  .....  3

2  3    1  ...... 2

1  1    3  ...... 2

The output you created is exactly what I would like my output to look like.

Solution
‎08-17-2011 04:21 PM
PROC Star
Posts: 7,364

percentages for a list of variables

I wouldn't bother with trying to transpose or use any proc to do this.  If I correctly understand what you want, a datastep could do everything and faster and easier than most methods.  E.g.:

%let qmax=5;

data have;

  input q1-q&qmax.;

  cards;

1 1 1 1 1

2 2 2 2 2

3 3 3 3 3

1 1 1 1 1

2 2 2 2 2

2 2 2 2 2

;

data want (keep=q xSmiley Happy;

  array _q(*) q1-q&qmax.;

  array _x1(&qmax.);

  array _x2(&qmax.);

  array _x3(&qmax.);

  set have end=last;

  retain _x1: _x2: _x3:;

  do i=1 to &qmax.;

    _x1(i)+(_q(i) eq 1);

    _x2(i)+(_q(i) eq 2);

    _x3(i)+(_q(i) eq 3);

    if i eq 1 then _xtotal+1;

    if last then do;

      q=vname(_q(i));

      x1=_x1(i)/_xtotal;

      x2=_x2(i)/_xtotal;

      x3=_x3(i)/_xtotal;

      output;

    end;

  end;

run;

Super Contributor
Posts: 578

percentages for a list of variables

Thanks pretty neat..expanding on that for this specifics:

This code:

data test;

input

      q1 q2 q3 q4 q5 q6 q7 q8 q9 q10

      q11 q12 q13 q14 q15 q16 q17 q18 q19 q20

      q21 q22 q23 q24 q25 q26 q27 q28 q29 q30

      q31 q32 q33 q34 q35 q36 q37 q38 q39 q40

      q41 q42 q43 q44 q45 q46 q47 q48;

cards;

1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3

4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

;

run;

data want (keep=q xSmiley Happy;

  array _q(*) q1-q48;

  array _x1(48);

  array _x2(48);

  array _x3(48);

  array _x4(48);

  set test end=last;

  retain _x1: _x2: _x3: _x4: ;

  do i=1 to 48;

    _x1(i)+(_q(i) eq 1);

    _x2(i)+(_q(i) eq 2);

    _x3(i)+(_q(i) eq 3);

      _x4(i)+(_q(i) eq 4);

    if i eq 1 then _xtotal+1;

    if last then do;

      q=vname(_q(i));

      x1=_x1(i)/_xtotal;

      x2=_x2(i)/_xtotal;

      x3=_x3(i)/_xtotal;

        x4=_x4(i)/_xtotal;

      output;

    end;

  end;

run;

produces this output:

q1 0.4 0.2 0.2 0.2

q2 0.2 0.4 0.2 0.2

q3 0.2 0.2 0.4 0.2

q4 0.2 0.2 0.2 0.4

q5 0.4 0.2 0.2 0.2

q6 0.2 0.4 0.2 0.2

q7 0.2 0.2 0.4 0.2

q8 0.2 0.2 0.2 0.4

q9 0.4 0.2 0.2 0.2

q10 0.2 0.4 0.2 0.2

q11 0.2 0.2 0.4 0.2

q12 0.2 0.2 0.2 0.4

q13 0.4 0.2 0.2 0.2

q14 0.2 0.4 0.2 0.2

q15 0.2 0.2 0.4 0.2

q16 0.2 0.2 0.2 0.4

q17 0.4 0.2 0.2 0.2

q18 0.2 0.4 0.2 0.2

q19 0.2 0.2 0.4 0.2

q20 0.2 0.2 0.2 0.4

q21 0.4 0.2 0.2 0.2

q22 0.2 0.4 0.2 0.2

q23 0.2 0.2 0.4 0.2

q24 0.2 0.2 0.2 0.4

q25 0.4 0.2 0.2 0.2

q26 0.2 0.4 0.2 0.2

q27 0.2 0.2 0.4 0.2

q28 0.2 0.2 0.2 0.4

q29 0.4 0.2 0.2 0.2

q30 0.2 0.4 0.2 0.2

q31 0.2 0.2 0.4 0.2

q32 0.2 0.2 0.2 0.4

q33 0.4 0.2 0.2 0.2

q34 0.2 0.4 0.2 0.2

q35 0.2 0.2 0.4 0.2

q36 0.2 0.2 0.2 0.4

q37 0.4 0.2 0.2 0.2

q38 0.2 0.4 0.2 0.2

q39 0.2 0.2 0.4 0.2

q40 0.2 0.2 0.2 0.4

q41 0.4 0.2 0.2 0.2

q42 0.2 0.4 0.2 0.2

q43 0.2 0.2 0.4 0.2

q44 0.2 0.2 0.2 0.4

q45 0.4 0.2 0.2 0.2

q46 0.2 0.4 0.2 0.2

q47 0.2 0.2 0.4 0.2

q48 0.2 0.2 0.2 0.4

Respected Advisor
Posts: 3,777

Re: percentages for a list of variables

I like the simplicity and flexibility of PROC/DATA/ PROC.

data test;

   input (q1-q48)(f1.);

   format q: f1.;

   cards;

123412341234123412341234123412341234123412341234

222222222222222222222222222222222222222222222222

333333333333333333333333333333333333333333333333

444444444444444444444444444444444444444444444444

111111111111111111111111111111111111111111111111

;;;;

   run;

ods output onewayfreqs=freq;

ods listing off;

proc freq;

   tables q:;

   run;

ods listing;

data freq;

   set freq;

   length vname $32;

   vname = scan(table,-1);

   resp  = coalesce(of qSmiley Happy;

   format resp f1.;

   keep vname resp Percent;

   run;

proc transpose prefix=R out=_data_(drop=_Smiley Happy;

   by vname notsorted;

   var percent;

   id resp;

   idlabel resp;

   run;

proc print label;

   run;

Trusted Advisor
Posts: 1,300

Re: percentages for a list of variables

data temp;

input id q1 q2 q3;

cards;

1 1 1 1

2 2 2 2

3 3 3 3

4 4 4 4

5 1 2 3

6 2 3 4

;

run;

data _null_;

do i=1 to 3;

  call execute(

   'proc freq data=temp;' ||

   ' tables q' || strip(i) || '/out=percent' || strip(i) || '(keep=percent) noprint;' ||

   'run;'

  );

  call execute(

   'proc transpose data=percent' || strip(i) || '(rename=(percent=q' || strip(i) || '))' ||

   ' out=ptran' || strip(i) || '(drop=_label_ rename=(_name_=question)) prefix=pct;' ||

   'run;'

  );

end;

run;

data percent;

set ptran:;

run;

could also be done in a macro, but meh...

EDIT: Just had an additional thought to the above.  You could significantly reduce the I/O requirements of the frequency step by replace the multiple calls of the frequency procedure by instead creating multiple tables statements instead...  Gains would become noticeable with increasing quantities/variables.

Super User
Posts: 9,691

Re: percentages for a list of variables

Almost like _null_'s code:

data test;
input  q1-q48;
cards;
1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4 1 2 3 4
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
;
run;
ods output onewayfreqs=freq;
proc freq data=test ;
  tables q: /nofreq nocum nocol ;
run;
data temp(keep=tname level percent);
 set freq(drop=f_: );
 tname=scan(table,2);
 level=coalesce(of q:);
run;
proc sort data=temp;by tname;run;
proc transpose data=temp out=want(drop=_name_) prefix=x ;
 by tname;
 id level;
 var percent;
run;
data want;
 set want;
 flag=input(scan(tname,1,,'kd'),best8.);
run;
proc sort data=want;by flag;run;

Ksharp

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 229 views
  • 3 likes
  • 6 in conversation