Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->
data have;
input ID Time Gender Race Age Test1 Test2 Test3;
datalines;
1 1 0 1 22 50 70 50
1 2 0 1 22 60 75 45
2 1 1 3 31 35 60 30
2 2 1 3 31 80 65 60
3 1 1 5 18 25 85 50
3 2 1 5 18 45 90 40
4 1 1 1 42 10 20 80
4 2 1 1 42 60 70 50
5 1 0 2 21 50 60 60
5 2 0 2 21 55 40 80
6 1 1 2 36 70 40 15
6 2 1 2 36 80 45 30
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.
@Amanda_Lemon wrote:
Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.
@Amanda_Lemon wrote:
Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
data have;
input ID Time TestScore Gender Race;
cards;
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
;
run;
proc transpose data=have out=want(drop=_:) prefix=testscore;
by id gender race;
var TestScore;
run;
So, this code works only if I have 1 test. But now I have many tests... So, I thought maybe I can do this code for all of my test variables separately and then just merge the data sets.
Here is what I came up with so far. The problem is that I don't know how to assign names to the "want" data set so that each newly created want has its unique name, e.g., want_Test1, want_Test2, etc. (So that I can merge them at the end by the ID and demographic variables). The same problem with COL1 and COL2 that are in each want data set -- I need to rename COL1 into Test1_1 and COL2 into Test1_2 in the first data set, COL1 into Test2_1 and COL2 into Test2_2 in the second data set, etc. Could someone please help me with that?
data varset_test;
length test_name $40;
do test_name = 'Test1', 'Test2', 'Test3';
output;
end;
run;
%macro trans(test=);
proc transpose data=have out=want(drop=_:);
by ID Race Gender Age;
var &test;
run;
%mend trans;
data new;
set varset_test;
call execute('%trans(test='||strip(test_name)||');');
run;
Thank you in advance.
P.S. Maybe there is an easier way to do transposing for multiple test variables than what I am trying to do?
Can you please provide us a full comprehensive sample of your dataset and an output sample to that input sample that is great representative plz? This would help in focusing on getting to develop a solution in one go and in full.
Sure! So, the first table below is what I have: People were tested twice, and for each testing time there is a separate row for each person. Test scores are different between rows within a person but demographic variables are the same.
ID | Time | Gender | Race | Age | Test1 | Test2 | Test3 |
1 | 1 | 0 | 1 | 22 | 50 | 70 | 50 |
1 | 2 | 0 | 1 | 22 | 60 | 75 | 45 |
2 | 1 | 1 | 3 | 31 | 35 | 60 | 30 |
2 | 2 | 1 | 3 | 31 | 80 | 65 | 60 |
3 | 1 | 1 | 5 | 18 | 25 | 85 | 50 |
3 | 2 | 1 | 5 | 18 | 45 | 90 | 40 |
4 | 1 | 1 | 1 | 42 | 10 | 20 | 80 |
4 | 2 | 1 | 1 | 42 | 60 | 70 | 50 |
5 | 1 | 0 | 2 | 21 | 50 | 60 | 60 |
5 | 2 | 0 | 2 | 21 | 55 | 40 | 80 |
6 | 1 | 1 | 2 | 36 | 70 | 40 | 15 |
6 | 2 | 1 | 2 | 36 | 80 | 45 | 30 |
The second table is what I want: to have one row per person and to have two variables for each test -- one variable for the test at Time 1 and another variable for the test at Time 2. And like that for each test.
ID | Gender | Race | Age | Test1_1 | Test1_2 | Test2_1 | Test2_2 | Test3_1 | Test3_2 |
1 | 0 | 1 | 22 | 50 | 60 | 70 | 75 | 50 | 45 |
2 | 1 | 3 | 31 | 35 | 80 | 60 | 65 | 30 | 60 |
3 | 1 | 5 | 18 | 25 | 45 | 85 | 90 | 50 | 40 |
4 | 1 | 1 | 42 | 10 | 60 | 20 | 70 | 80 | 50 |
5 | 0 | 2 | 21 | 50 | 55 | 60 | 40 | 60 | 80 |
6 | 1 | 2 | 36 | 70 | 80 | 40 | 45 | 15 | 30 |
@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->
data have;
input ID Time Gender Race Age Test1 Test2 Test3;
datalines;
1 1 0 1 22 50 70 50
1 2 0 1 22 60 75 45
2 1 1 3 31 35 60 30
2 2 1 3 31 80 65 60
3 1 1 5 18 25 85 50
3 2 1 5 18 45 90 40
4 1 1 1 42 10 20 80
4 2 1 1 42 60 70 50
5 1 0 2 21 50 60 60
5 2 0 2 21 55 40 80
6 1 1 2 36 70 40 15
6 2 1 2 36 80 45 30
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
@Amanda_Lemon There you go, the more clearer the INPUT sample and the WANTED output sample, it makes it easy for all of us. If you are satisfied with answer, may i request to mark the answer that you accept and close the thread plz. Thank you!
Ran into a problem... It appears that if at least one of my test variables is a character variable, then the output of proc transpose for ALL test variables would be character. Is there any way to prevent that from happening? I need my numeric test variables to continue being numeric in the output. Below is the kind of input data set that I am talking about:
data have;
input ID Time Gender Race Age Test1 Test2 Test3 Test4 $;
datalines;
1 1 0 1 22 50 70 50 A
1 2 0 1 22 60 75 45 B
2 1 1 3 31 35 60 30 A
2 2 1 3 31 80 65 60 C
3 1 1 5 18 25 85 50 C
3 2 1 5 18 45 90 40 A
4 1 1 1 42 10 20 80 C
4 2 1 1 42 60 70 50 D
5 1 0 2 21 50 60 60 F
5 2 0 2 21 55 40 80 B
6 1 1 2 36 70 40 15 F
6 2 1 2 36 80 45 30 C
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
Hi @Amanda_Lemon Can i respond to you soon as get to my college lab tomorrow morning chicago time. I'm afraid i dont have SAS software at home and it is 10:25pm here. Meanwhile, Please try and make sure you provide us the best sample you can.
Good night for now and will look into it first thing in the morning.
@Amanda_Lemon Good morning, Please find the below fix
data have;
input ID Time Gender Race Age Test1 Test2 Test3 Test4 $;
datalines;
1 1 0 1 22 50 70 50 A
1 2 0 1 22 60 75 45 B
2 1 1 3 31 35 60 30 A
2 2 1 3 31 80 65 60 C
3 1 1 5 18 25 85 50 C
3 2 1 5 18 45 90 40 A
4 1 1 1 42 10 20 80 C
4 2 1 1 42 60 70 50 D
5 1 0 2 21 50 60 60 F
5 2 0 2 21 55 40 80 B
6 1 1 2 36 70 40 15 F
6 2 1 2 36 80 45 30 C
;
/*separate the needed the vars by read and splitting into 2 sets chars and nums with by group vars*/
data chars(keep=id gender race age time _char_) nums(keep=id gender race age time _numeric_);
set have(keep=id gender race age time test:);
run;
/*parse the nums */
proc transpose data=nums out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want1(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
/*parse the chars */
proc transpose data=chars out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want2(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
/*combine the tranposed chars and nums into one*/
data final_want;
merge want1 want2;
run;
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
The macro on this page will do what you want, and it's pre-packaged. Worth learning how to use.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.