Hello,
I am looking for assistance in creating code for a new table in SAS. I want to do a spearman correlation for scores of two different variables that are listed in the same column.
Group | Variable | Score |
1 | Teamwork | 3 |
2 | Teamwork | 4 |
3 | Teamwork | 5 |
1 | Communication | 4 |
2 | Communication | 3 |
3 | Communication | 5 |
So I was thinking I need to make a new table like this:
Group | Teamwork | Communication |
1 | 3 | 3 |
2 | 4 | 4 |
3 | 5 | 5 |
1 | 4 | 4 |
2 | 3 | 3 |
3 | 5 | 5 |
Is that correct? How else can I separate these when they are listed in the same column…?
TIA
Well, as far as I can tell, you only have 3 scores for Teamwork and 3 scores for Communication. Therefore you should only have three rows in your final output, yes? Like the below:
Here's some sample code, below, that will work generically with no matter how many values of Variable you have. Right now, you have "Communication" and "Teamwork". If, say, you added a couple more, for example "Innovation" and "Accuracy", this could should still work.
DATA Have;
LENGTH Group $1
Variable $13
Score $1
;
INFILE DATALINES DSD DLM='09'X MISSOVER;
INPUT Group $
Variable $
Score $
;
DATALINES;
1 Teamwork 3
2 Teamwork 4
3 Teamwork 5
1 Communication 4
2 Communication 3
3 Communication 5
;
RUN;
PROC SQL ;
SELECT COUNT (DISTINCT Variable)
INTO :Nbr_of_Vars
FROM Have
;
SELECT DISTINCT Variable
INTO :Var1 -
FROM Have
;
QUIT;
PROC SORT DATA=Have;
BY Group;
RUN;
DATA Want;
DROP Variable Score;
SET Have;
BY Group;
%MACRO Generate_Var_Code;
%LOCAL i;
%DO i = 1 %TO &Nbr_of_Vars;
RETAIN &&Var&i;
%END;
%DO i = 1 %TO &Nbr_of_Vars;
IF Variable = "&&Var&i" THEN
&&Var&i = Score;
ELSE
%END;
PUTLOG "ERROR: Invalid Var name " Variable=;
%MEND Generate_Var_Code;
%Generate_Var_Code;
IF LAST.Group THEN
OUTPUT;
RUN;
Jim
Hi Jim
Thank you for your help. It is unfortunately more complicated. The variable column has 96 different options, two of which are teamwork and communication. The scores are from 1-5 and there are many different teams that are evaluated. So in total there are over 43,000 rows. A sample of these data and the excel file is below:
Date | Room | Group | Year | obemerg | med_team | variable | group_id $ | Score | bad_action | Day | exclude_med | exclude_team | exclude_perf |
11/13/2014 | 1 | 1 | 2014 | 1 | 1 | 70 | 1A41956 | 1 | 0 | 0 | 0 | 1 | 0 |
11/13/2014 | 2 | 1 | 2014 | 1 | 1 | 70 | 2A41956 | 3 | 0 | 0 | 0 | 0 | 0 |
11/13/2014 | 3 | 1 | 2014 | 1 | 1 | 70 | 3A41956 | 2 | 0 | 0 | 0 | 0 | 0 |
11/13/2014 | 4 | 1 | 2014 | 1 | 1 | 70 | 4A41956 | 2 | 1 | 0 | 0 | 0 | 0 |
11/13/2014 | 5 | 1 | 2014 | 1 | 1 | 70 | 5A41956 | 3 | 0 | 0 | 0 | 0 | 0 |
So I want to create columns of the group ID with scores (1-5) for each component from variable (96 total). Is that possible?
Thanks!
Sarah
Hi Jim,
I was hoping to compare the scores from two different variables to evaluate whether there is a correlation. I figured I would need to create columns for each variable right?
If obemerg=1;
Group_ID | Teamwork score | Communication score | Variable 3……97 |
1A41956 | 3 | 1 | 1 |
1A41957 | 4 | 2 | 2 |
1A41958 | 5 | 5 | 5 |
1A41959 | 2 | 4 | 4 |
1A41960 | 5 | 3 | 3 |
1A41961 | 1 | 5 | 5 |
As of right now, the scores for teamwork and communication and every variable are all in the same column...
Thank you,
Sarah
@sdetlefs wrote:
I was hoping to compare the scores from two different variables to evaluate whether there is a correlation. I figured I would need to create columns for each variable right?
Hi, Sarah,
Yes, you would need to create a column for each value of Variable. This code should work for that if all you need is a) a column for each value of Variable and b) the score. If you need to include additional columns, those would have to be added.
However, there is one little problem. The column "Variable" in your sample data has a numeric value, "70". SAS generally doesn't want variables to start with a number. You could try name literals, but I think it might be better to prefix each numeric value with a "G" (or some letter) so that valid SAS V7 column names can be created. The "G" can be stripped off before creating a report if need be.
And do you want this by Group or Group_ID? Your last post makes me think that you want this by Group_ID not Group.
Jim
Hi everyone,
I ended up transposing all of the data in SAS (hours of work) and my database is now working. I am sure that there is a way in SAS but this ended up being easier in the short run. Thank you Jim for telling me that the variables shouldn't be numeric, I would have struggled to figure this out for a long time otherwise.
Thank you for your assistance,
Sarah
PROC TRANSPOSE should work fine for your case. If your naming variable has values that don't start with letters just use the PREFIX= option.
proc transpose data=have out=want prefix=var ;
by <grouping variables>;
id <naming variable(s)>;
var <value variable>;
run;
It should not take very long to run.
Do you mean you want this?
Group Teamwork Communication 1 3 4 2 4 3 3 5 5
If so, here is code that should work (untested). If that's not what you want, much more explanation is needed
data want;
merge have(where=(variable='Teamwork') rename=(score=teamwork))
have(where=(variable='Communication') rename=(score=communication));
by group;
run;
Hi Paige,
Unfortunately this did not work. I think that is what I am trying to do, though. At the end it stated that despite the code variables teamwork and total were not found.
I posted a sample of the database to explain the complexity and what I am trying to do in previous reply.
Thank you,
Sarah
Just sort and transpose:
data have;
infile datalines dlm=" " dsd truncover;
input group $ variable :$20. score;
datalines;
1 Teamwork 3
2 Teamwork 4
3 Teamwork 5
1 Communication 4
2 Communication 3
3 Communication 5
;
proc sort data=have;
by group;
run;
proc transpose
data=have
out=want (drop=_name_)
;
by group;
id variable;
var score;
run;
If your issue is more complex than this, post your more complicated example data in similar form (data step with datalines), and show us the expected result.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.