
I have a dataset with hundreds of occupations (variable OCCSOC). As you can see in the listing below of the first three occupations, each occupation has multiple rows, one per ElementID. For each occupation, I am creating an index that sums the scores of particular element IDs ('1.B.2.a', '1.B.2.b', '1.B.2.c', '1.B.2.d'; essentially the first four rows of each occupation). These four elements are dimensions of a job characteristic. I want to measure the reliability of the four items using Cronbach's alpha. When I used the following SAS program, I got a Pearson correlation for each separate occupation. I need the alpha value across the occupations, essentially one alpha value for the index. There must be a way to tell SAS how to do that. I tried a CLASS statement but it doesn't work. Thanks for your help!!

 

Proc corr data=onetlib.workvalues ALPHA;
var score;
where elementid in
('1.B.2.a'
'1.B.2.b'
'1.B.2.c'
'1.B.2.d');
by OCCSOC;
run;

 

 

OCCSOC ElementID ScaleID Score
11-1011.00 1.B.2.a EX 6.33
11-1011.00 1.B.2.b EX 6.33
11-1011.00 1.B.2.c EX 7
11-1011.00 1.B.2.d EX 5
11-1011.00 1.B.2.e EX 5.33
11-1011.00 1.B.2.f EX 7
11-1011.00 1.B.2.g VH 3
11-1011.00 1.B.2.h VH 6
11-1011.00 1.B.2.i VH 1
11-1011.03 1.B.2.a EX 6.67
11-1011.03 1.B.2.b EX 6.33
11-1011.03 1.B.2.c EX 6
11-1011.03 1.B.2.d EX 5
11-1011.03 1.B.2.e EX 3.33
11-1011.03 1.B.2.f EX 6.67
11-1011.03 1.B.2.g VH 1
11-1011.03 1.B.2.h VH 6
11-1011.03 1.B.2.i VH 2
11-1021.00 1.B.2.a EX 5.33
11-1021.00 1.B.2.b EX 6
11-1021.00 1.B.2.c EX 5.67
11-1021.00 1.B.2.d EX 6.33
11-1021.00 1.B.2.e EX 4.67
11-1021.00 1.B.2.f EX 6
11-1021.00 1.B.2.g VH 4
11-1021.00 1.B.2.h VH 6
11-1021.00 1.B.2.i VH 2
Accepted Solutions
ballardw
Super User

Your example data does not have anything to "correlate" with. For a correlation to mean anything, there need to be at least two variables.

If you do not provide a WITH variable list, the procedure calculates correlations between all pairs of variables on the VAR statement:

proc corr data=sashelp.class;
   var height weight age;
run;

produces the correlations between height and weight, height and age, and weight and age.

If I want to correlate specific lists with each other, such as height and weight each with only age, I add a WITH statement:

proc corr data=sashelp.class;
   var height weight;
   with age;
run;

 

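Note the output here: with the ALPHA option, PROC CORR also reports Cronbach's coefficient alpha for the variables on the VAR statement.

proc corr data=sashelp.class alpha;
   var height weight;
run;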

So you do want one variable for each "type" of score (each ElementID value), with one observation per occupation. PROC TRANSPOSE does that.
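For context, the usual textbook definition of the raw coefficient alpha for k items is shown below. This is the standard formula rather than anything taken from the SAS output; the symbols k, Y_i, and X are introduced here for illustration, with k = 4 in this case and X the sum of the four item scores.

```latex
\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} \sigma_{Y_i}^{2}}{\sigma_{X}^{2}}\right),
\qquad X = \sum_{i=1}^{k} Y_i
```

Each item variance and the variance of the sum are estimated across observations, which is why the data need one row per occupation and one column per item.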

 

 

data have;
input OCCSOC :$11. ElementID $  ScaleID $ Score ;
datalines;
11-1011.00 1.B.2.a EX 6.33 
11-1011.00 1.B.2.b EX 6.33 
11-1011.00 1.B.2.c EX 7 
11-1011.00 1.B.2.d EX 5 
11-1011.00 1.B.2.e EX 5.33 
11-1011.00 1.B.2.f EX 7 
11-1011.00 1.B.2.g VH 3 
11-1011.00 1.B.2.h VH 6 
11-1011.00 1.B.2.i VH 1 
11-1011.03 1.B.2.a EX 6.67 
11-1011.03 1.B.2.b EX 6.33 
11-1011.03 1.B.2.c EX 6 
11-1011.03 1.B.2.d EX 5 
11-1011.03 1.B.2.e EX 3.33 
11-1011.03 1.B.2.f EX 6.67 
11-1011.03 1.B.2.g VH 1 
11-1011.03 1.B.2.h VH 6 
11-1011.03 1.B.2.i VH 2 
11-1021.00 1.B.2.a EX 5.33 
11-1021.00 1.B.2.b EX 6 
11-1021.00 1.B.2.c EX 5.67 
11-1021.00 1.B.2.d EX 6.33 
11-1021.00 1.B.2.e EX 4.67 
11-1021.00 1.B.2.f EX 6 
11-1021.00 1.B.2.g VH 4 
11-1021.00 1.B.2.h VH 6 
11-1021.00 1.B.2.i VH 2 
;
run;

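/* Keep only the four elements of interest and turn each ElementID into its own variable. */
/* BY processing expects the data to be sorted by OCCSOC. */
proc transpose data=have (where=(elementid in ('1.B.2.a' '1.B.2.b' '1.B.2.c' '1.B.2.d')))
   out=havetrans (drop=_name_);
   by OCCSOC;
   var score;
   id elementid;
run;

/* One observation per occupation, so ALPHA is computed across occupations. */
proc corr data=havetrans alpha;
   var _: ;
run;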

Because your actual values of elementid are not valid SAS variable names, the invalid characters have been replaced with _ in the PROC TRANSPOSE output. Variable names also may not start with a numeral, so an underscore is prefixed as well. For example, 1.B.2.a becomes _1_B_2_a.
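The renaming depends on the VALIDVARNAME system option, so it can be worth confirming which names PROC TRANSPOSE actually produced before relying on the _ prefix. The following is only a sketch: it assumes the have data set from this post, and havetrans_v7 is a hypothetical output name. Some interfaces may run with VALIDVARNAME=ANY, in which case the ID values may be kept as name literals such as '1.B.2.a'n instead of being converted.

```sas
/* Assumption: request V7 naming so invalid characters become underscores */
options validvarname=v7;

proc transpose data=have (where=(elementid in ('1.B.2.a' '1.B.2.b' '1.B.2.c' '1.B.2.d')))
   out=havetrans_v7 (drop=_name_);   /* havetrans_v7 is a hypothetical name */
   by OCCSOC;
   var score;
   id elementid;
run;

/* List the variables that were actually created, in creation order */
proc contents data=havetrans_v7 varnum;
run;
```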

 

The variable list in proc corr

Var _:  ;

says to use all variables whose names start with _. If any of those are not numeric, or are ones you don't want to include in the analysis, then you need to list the variables you do want explicitly.
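An explicit list might look like the sketch below. The four variable names are an assumption: they are what PROC TRANSPOSE would be expected to create from these ElementID values under V7 naming rules, so check them against your own output first.

```sas
/* Same analysis as with the prefix list, but naming each item explicitly */
proc corr data=havetrans alpha;
   var _1_B_2_a _1_B_2_b _1_B_2_c _1_B_2_d;
run;
```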

 

You said:

 I am creating an index which sums up the scores of particular element ID's

but I don't see anything related to sums of scores involved.
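If a summed index is what you are after, one possible way to build it from the transposed data is sketched here. The data set name index and the variable name index_score are hypothetical, and the item names assume the underscore renaming described above.

```sas
data index;                /* hypothetical output data set name */
   set havetrans;
   /* Hypothetical index variable: sum of the four item scores per occupation. */
   /* The SUM function ignores missing values among its arguments.             */
   index_score = sum(of _1_B_2_a _1_B_2_b _1_B_2_c _1_B_2_d);
run;
```

Because SUM skips missing items, an occupation that lacks one of the four scores would still get an index based on the remaining ones; whether that is acceptable depends on how you want to treat incomplete occupations.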

4 REPLIES
ballardw
Super User

If you did not want a separate analysis for each occupation, why did you include the BY statement? That forces a separate analysis for each combination of values of the variables on the BY statement.

Diana_AdventuresinSAS
Obsidian | Level 7

Thanks for your reply. Below is what I get without the BY statement. N should be the number of occupations. In a simple CORR procedure for alpha, you list the variables you want correlated. In this dataset, however, the four element IDs are not separate variables; they are all values of the single ElementID variable.

I didn't want to create a new dataset that separates them into individual variables (because I don't know how to), so I hoped a program could calculate the alpha directly. Do you think I have to create a new dataset that outputs these as four distinct variables, each assigned its score value? If so, do you have any suggestions on how to do that? Thanks very much!!

 

SAS Output

The CORR Procedure

1 Variables: Score

Simple Statistics
Variable  N     Mean     Std Dev  Sum    Minimum  Maximum  Label
Score     3896  4.08217  1.27012  15904  1.00000  7.00000  Score

Pearson Correlation Coefficients, N = 3896
Prob > |r| under H0: Rho=0
         Score
Score    1.00000


Diana_AdventuresinSAS
Obsidian | Level 7

Thank you! I did the proc transpose and was able to use the four variables to run a proc corr alpha.
