Missmichelle
Fluorite | Level 6

Hello! 

 

I am analyzing a small dataset (N>300) with survey data. One section asked the participants to rate the importance of 20 different health services as "high", "low" or "none". Each service response is stored as its own variable, i.e. service1 = "High", service2 = "low", service3 = "high". I have already formatted each response to correspond to 1, 2 or 3 instead of the text. My question is: what is the most efficient way to display the data? Do I need to transpose it, since there are so many variables in my analysis? If so, how do I go about it? My end goal is to show a distribution of the responses and to assign a service response score to each individual in the dataset.

 

Thank You!

 

Michelle 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Since we are talking about single digits an alternate approach for counting:

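data example;
   input x1 - x10;
   /* cats() joins the single-digit values into one string, e.g. '1132132111'; */
   /* countc() then counts how often each digit appears in that string         */
   ones = countc(cats(of x:),'1');
   twos = countc(cats(of x:),'2');
   thrs = countc(cats(of x:),'3');
datalines;
1 1 3 2 1 3 2 1 1 1
;
run;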


6 REPLIES 6
ballardw
Super User

What to do next depends on the analysis question(s) you are trying to answer.

 

First, is this a complex survey design with strata and/or clusters and different sample weights between them? If so, we would likely need to use the various survey procs to make proper use of the sampling information.

 

Second, are there any outcomes associated with the scored variables? Are any of the services considered more important than others? You might need to weight the individual variables when building your composite "service response score".

 

Several approaches come to mind. Summing the numeric values and then creating histograms of that summed variable would condense things.

Advantage: Easy to code:   Score = sum(of service1-service20); and proc sgplot.

Disadvantage: same total could mask notable differences in sub elements.
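A minimal sketch of the summed-score idea, assuming the responses are already numeric. The dataset name survey, the id variable, and the use of 5 services instead of 20 are made up to keep the example short:

```sas
/* Hypothetical sample data: 5 services instead of 20 */
data survey;
   input id service1-service5;
datalines;
1 1 2 3 1 1
2 1 1 1 2 1
3 3 3 2 3 3
4 2 2 2 1 3
;
run;

/* One summed score per respondent */
data scored;
   set survey;
   score = sum(of service1-service5);
run;

/* Distribution of the summed score */
proc sgplot data=scored;
   histogram score;
run;
```

With the real data the range would be service1-service20. Note that sum() ignores missing values, so a respondent who skipped items gets a lower total rather than a missing one.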

 

Do these services fall into related groups? For example, variables about patient interaction with staff might be grouped separately from actual care services. Group scores could then be created with sums and again displayed as histograms or other graphs.

Advantage: still easy

Disadvantage: more work on your part identifying the groups
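As a rough sketch of the grouped version: the split below (service1-service2 as "staff" items, service3-service5 as "care" items) is purely hypothetical, as are the dataset and variable names. You would substitute whatever grouping makes sense for your services.

```sas
/* Hypothetical sample data: 5 services instead of 20 */
data survey;
   input id service1-service5;
datalines;
1 1 2 3 1 1
2 1 1 1 2 1
3 3 3 2 3 3
4 2 2 2 1 3
;
run;

data grouped;
   set survey;
   /* Assumed grouping, for illustration only */
   staff_score = sum(of service1-service2);
   care_score  = sum(of service3-service5);
run;

/* One histogram per group score */
proc sgplot data=grouped;
   histogram staff_score;
run;

proc sgplot data=grouped;
   histogram care_score;
run;
```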

 

 

And then there is a group of CLUSTERING procedures that let the data show you groups of similar responses.
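One possible starting point is PROC FASTCLUS (k-means style clustering). This is only a sketch: the choice of procedure, the number of clusters, and the sample data are assumptions, and treating the 1/2/3 codes as numeric distances is itself a simplification.

```sas
/* Hypothetical sample data: 5 services instead of 20 */
data survey;
   input id service1-service5;
datalines;
1 1 2 1 1 1
2 1 1 1 2 1
3 3 3 2 3 3
4 3 2 3 3 3
5 1 1 2 1 1
6 3 3 3 2 3
;
run;

/* Ask for 2 clusters; OUT= adds a CLUSTER variable to each respondent */
proc fastclus data=survey maxclusters=2 out=clustered;
   var service1-service5;
run;

/* How many respondents fall in each cluster */
proc freq data=clustered;
   tables cluster;
run;
```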

 

Missmichelle
Fluorite | Level 6

I guess my confusion is in the calculation part. An individual has a response ranging from 1 to 3 for each of the services. Instead of the sum, I would like to count how many 1's, how many 2's, and how many 3's this individual has, rather than the total sum across the services.

Reeza
Super User
Look at PROC SURVEYFREQ.
DWilson
Pyrite | Level 9


I would create a single variable for each health service. Each respondent would have a single "response" for each health service variable. At the end of this, each respondent would have 20 health service variables capturing their responses.

 

Once you have that, look at each variable separately with proc freq (or surveyfreq if you have weights and a complex sample design.)
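A small sketch of both options. The dataset name, the 5-service range, and the weight variable sampwt are hypothetical; a real complex design would probably also need STRATA and/or CLUSTER statements in the SURVEYFREQ step.

```sas
/* Hypothetical sample data: 5 services plus a made-up sampling weight */
data survey;
   input id service1-service5 sampwt;
datalines;
1 1 2 3 1 1 1.5
2 1 1 1 2 1 0.8
3 3 3 2 3 3 1.2
4 2 2 2 1 3 1.0
;
run;

/* Unweighted one-way frequency table for each service */
proc freq data=survey;
   tables service1-service5;
run;

/* Weighted version for a survey with sampling weights */
proc surveyfreq data=survey;
   weight sampwt;
   tables service1-service5;
run;
```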

I would then look at the cross-tabulation of all 20 variables and examine the patterns of response. I would look for obvious grouping patterns and report on them.

 

You could also create an aggregate measure, assuming each of your 20 service items is scored the same way (High, Low, None) and, for each respondent, calculate: # of Highs, # of Lows, and # of Nones. I'd then look at the distribution of # of Highs (surveyfreq with weights) to see if there is an obvious split in the distribution of # of Highs. I'd do the same thing for # of Lows and # of Nones. You could also calculate, for each respondent, the proportion of "Highs" and use that to classify respondents. You could also use something like: proportion of Highs minus proportion of Lows, or proportion of Highs minus proportion of Nones. I'm not sure if None means not applicable or that they are not concerned at all. If it's the former, then proportion of Highs minus proportion of Nones doesn't really make sense.
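A sketch of those proportion measures, assuming the coding is 1 = High, 2 = Low, 3 = None (the original post does not spell out the mapping) and using made-up names and 5 services instead of 20:

```sas
/* Hypothetical sample data: 5 services instead of 20 */
data survey;
   input id service1-service5;
datalines;
1 1 2 3 1 1
2 1 1 1 2 1
3 3 3 2 3 3
4 2 2 2 1 3
;
run;

data props;
   set survey;
   array svc{*} service1-service5;
   /* Start each respondent's counts at zero */
   n_high = 0; n_low = 0; n_none = 0;
   do i = 1 to dim(svc);
      if svc{i} = 1 then n_high = n_high + 1;        /* assumed: 1 = High */
      else if svc{i} = 2 then n_low = n_low + 1;     /* assumed: 2 = Low  */
      else if svc{i} = 3 then n_none = n_none + 1;   /* assumed: 3 = None */
   end;
   p_high = n_high / dim(svc);
   p_low  = n_low  / dim(svc);
   high_minus_low = p_high - p_low;
   drop i;
run;

/* Distribution of the number of Highs per respondent */
proc freq data=props;
   tables n_high;
run;
```

Dividing by dim(svc) treats missing items as "not High"; dividing by the number of non-missing responses instead is an equally defensible choice.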

 

More generally, look into the notion of Likert scores to see about how you might combine responses to 20 items to come up with some aggregate score for an individual. (The ones I gave are simplistic but might work for you.)

 

Oh, once you have the 20 variables for each person, with each variable containing a value of 1, 2, or 3, you can calculate the number of 1s, 2s, and 3s for each person in a variety of ways.

 

Here's one:

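data mydata;
   set mydata;
   array values{20} yourvariablename1-yourvariablename20;
   /* Reset the counters for every respondent. Without this they start  */
   /* out missing, and missing + 1 stays missing.                       */
   numones = 0;
   numtwos = 0;
   numthrees = 0;
   do i = 1 to 20;
      if values{i} = 1 then numones = numones + 1;
      else if values{i} = 2 then numtwos = numtwos + 1;
      else if values{i} = 3 then numthrees = numthrees + 1;
   end;
   drop i;
run;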

 

In the array statement just list out the names of your 20 health services variables.

Missmichelle
Fluorite | Level 6

Thank You!! 
