DATA Step, Macro, Functions and more

Output Top 5 Reasons for Survey Answer

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 18
Accepted Solution

Output Top 5 Reasons for Survey Answer

Hi guys,

 

I need help with displaying survey results data.  Part of the survey asked the employees the reason why they wanted to work for this particular organization.  

 

For instance, the Survey question would look like this:

Rate the following reasons why you want to work for this particular organization (best score from 1 to 5):

Closeness to Home12345
Diverse Culture12345
Supportive Environment12345
Etc. etc. 12345

 

Due to data privacy issues, I can only show an example of the dataset using dummy values as follows:

 

EMPIDDepartmentClose_to_HomeDiversitySupportive_Environment
001Knitting352
002Knitting512
003Sewing.15
004Sewing115
005Sewing111

 

I need the show the top reasons for working by department.  This is done by taking the average of scores for each item and ranking the scores.

 

For instance, 

 

For Knitting Department,

Avg Score for Close_to_home = (3+5)/2 = 4;

Avg Diversity = (5+1)/2 = 3;

Avg Support = 2

So the top 2 (or however many) reasons for wanting to work at the organization for this dept is closeness to home and diversity. 

 

So the report would need to show something like:

Top 2 for Knitting Department
1 closeness to home
2 diversity

 

I used Proc Means to calculate the mean score by department, which successfully gives me the calculations.  

 

proc means data = data (keep= id cohort_year &keeplist);
var _numeric_;
output out=calc sum= mean= /autoname;
where cohort_year = 2016;
run;

 

However, I can't figure out how to display the result in the above format.  I could use proc freq, but that only applies to sorting within the same var.  Here we have multiple vars and we are comparing the mean between them.  Should I use Proc Transpose?  

 

I should also add that each var has its own labeling (i.e., close_to_home is already labeled as 'closeness to home').

 

Thank you so much for any advice you can give!


Accepted Solutions
Solution
‎10-11-2016 05:18 PM
Super User
Posts: 11,343

Re: Output Top 5 Reasons for Survey Answer

[ Edited ]

Still not quite sure what you are looking for but one hint might be to include a CLASS variable for your Department. That willl give you the requested summaries for each department (and possibly year if that's of interest and included as class).

 

When you get to stuff (technical term) like "top 3" then you often may need to send the output of one procedure into another procedure or a data step to filter, especially if the rules get complicated.

 

With your example I would likely transpose the data to look like:

EMPID Department Topic                   Value    
001 Knitting Close_to_Home               3  
001 Knitting Diversity                   5 
001 Knitting Supportive_Environment      2 

And then run Proc Means/summary with Department and Topic as CLass variables then sort by the mean of value descending. Then possibly use a data step or Proc Rank to add ranking variables before printing the results with Proc Print.

 

View solution in original post


All Replies
Solution
‎10-11-2016 05:18 PM
Super User
Posts: 11,343

Re: Output Top 5 Reasons for Survey Answer

[ Edited ]

Still not quite sure what you are looking for but one hint might be to include a CLASS variable for your Department. That willl give you the requested summaries for each department (and possibly year if that's of interest and included as class).

 

When you get to stuff (technical term) like "top 3" then you often may need to send the output of one procedure into another procedure or a data step to filter, especially if the rules get complicated.

 

With your example I would likely transpose the data to look like:

EMPID Department Topic                   Value    
001 Knitting Close_to_Home               3  
001 Knitting Diversity                   5 
001 Knitting Supportive_Environment      2 

And then run Proc Means/summary with Department and Topic as CLass variables then sort by the mean of value descending. Then possibly use a data step or Proc Rank to add ranking variables before printing the results with Proc Print.

 

Occasional Contributor
Posts: 18

Re: Output Top 5 Reasons for Survey Answer

[ Edited ]

Thank you so much for your input.

 

I am looking for a final output as follows, one for every department.  We have about 20 depts.

 

Top 2 for Knitting Department
1 closeness to home
2 diversity

 

I think what you suggested should work.  However, reason for wanting to work at the organization are based not only on the three topics I showed in my example but a total of 23 topics.  proc transpose only does one variable at a time, am I correct?  How do I make this more efficient?

 

Thank you for your time!

Super User
Posts: 19,877

Re: Output Top 5 Reasons for Survey Answer

No, you're transposing from wide to long so it will handle multiple variables at once.

 

Try it Smiley Happy

Occasional Contributor
Posts: 18

Re: Output Top 5 Reasons for Survey Answer

Oh!  It worked like a charm! Cat Very Happy  

 

My code is as follows!  Thanks so much to both of you!

 

proc sort data=out;
by dept empid;
run;

proc transpose data=out out=out2;
by dept empid;
id cohort_year;
run;

proc means data=out2 stackodsoutput mean;
class dept _label_;
var _2016;
ods output summary=top5_reason (drop = _control_);
run;

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 355 views
  • 2 likes
  • 3 in conversation