Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->
data have;
input ID Time Gender Race Age Test1 Test2 Test3;
datalines;
1 1 0 1 22 50 70 50
1 2 0 1 22 60 75 45
2 1 1 3 31 35 60 30
2 2 1 3 31 80 65 60
3 1 1 5 18 25 85 50
3 2 1 5 18 45 90 40
4 1 1 1 42 10 20 80
4 2 1 1 42 60 70 50
5 1 0 2 21 50 60 60
5 2 0 2 21 55 40 80
6 1 1 2 36 70 40 15
6 2 1 2 36 80 45 30
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.
@Amanda_Lemon wrote:
Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.
@Amanda_Lemon wrote:
Hello,
I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).
ID Time TestScore Gender Race
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
Now, I want my data set to look like that:
ID TestScore1 TestScore2 Gender Race
1 5 7 1 2
2 3 8 2 4
3 1 4 1 3
I ran out of ideas to try... Is there an easy way to create a data set in this form?
Thank you in advance.
data have;
input ID Time TestScore Gender Race;
cards;
1 1 5 1 2
1 2 7 1 2
2 1 3 2 4
2 2 8 2 4
3 1 1 1 3
3 2 4 1 3
;
run;
proc transpose data=have out=want(drop=_:) prefix=testscore;
by id gender race;
var TestScore;
run;
So, this code works only if I have 1 test. But now I have many tests... So, I thought maybe I can do this code for all of my test variables separately and then just merge the data sets.
Here is what I came up with so far. The problem is that I don't know how to assign names to the "want" data set so that each newly created want has its unique name, e.g., want_Test1, want_Test2, etc. (So that I can merge them at the end by the ID and demographic variables). The same problem with COL1 and COL2 that are in each want data set -- I need to rename COL1 into Test1_1 and COL2 into Test1_2 in the first data set, COL1 into Test2_1 and COL2 into Test2_2 in the second data set, etc. Could someone please help me with that?
data varset_test;
length test_name $40;
do test_name = 'Test1', 'Test2', 'Test3';
output;
end;
run;
%macro trans(test=);
proc transpose data=have out=want(drop=_:);
by ID Race Gender Age;
var &test;
run;
%mend trans;
data new;
set varset_test;
call execute('%trans(test='||strip(test_name)||');');
run;
Thank you in advance.
P.S. Maybe there is an easier way to do transposing for multiple test variables than what I am trying to do?
Can you please provide us a full comprehensive sample of your dataset and an output sample to that input sample that is great representative plz? This would help in focusing on getting to develop a solution in one go and in full.
Sure! So, the first table below is what I have: People were tested twice, and for each testing time there is a separate row for each person. Test scores are different between rows within a person but demographic variables are the same.
ID | Time | Gender | Race | Age | Test1 | Test2 | Test3 |
1 | 1 | 0 | 1 | 22 | 50 | 70 | 50 |
1 | 2 | 0 | 1 | 22 | 60 | 75 | 45 |
2 | 1 | 1 | 3 | 31 | 35 | 60 | 30 |
2 | 2 | 1 | 3 | 31 | 80 | 65 | 60 |
3 | 1 | 1 | 5 | 18 | 25 | 85 | 50 |
3 | 2 | 1 | 5 | 18 | 45 | 90 | 40 |
4 | 1 | 1 | 1 | 42 | 10 | 20 | 80 |
4 | 2 | 1 | 1 | 42 | 60 | 70 | 50 |
5 | 1 | 0 | 2 | 21 | 50 | 60 | 60 |
5 | 2 | 0 | 2 | 21 | 55 | 40 | 80 |
6 | 1 | 1 | 2 | 36 | 70 | 40 | 15 |
6 | 2 | 1 | 2 | 36 | 80 | 45 | 30 |
The second table is what I want: to have one row per person and to have two variables for each test -- one variable for the test at Time 1 and another variable for the test at Time 2. And like that for each test.
ID | Gender | Race | Age | Test1_1 | Test1_2 | Test2_1 | Test2_2 | Test3_1 | Test3_2 |
1 | 0 | 1 | 22 | 50 | 60 | 70 | 75 | 50 | 45 |
2 | 1 | 3 | 31 | 35 | 80 | 60 | 65 | 30 | 60 |
3 | 1 | 5 | 18 | 25 | 45 | 85 | 90 | 50 | 40 |
4 | 1 | 1 | 42 | 10 | 60 | 20 | 70 | 80 | 50 |
5 | 0 | 2 | 21 | 50 | 55 | 60 | 40 | 60 | 80 |
6 | 1 | 2 | 36 | 70 | 80 | 40 | 45 | 15 | 30 |
@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->
data have;
input ID Time Gender Race Age Test1 Test2 Test3;
datalines;
1 1 0 1 22 50 70 50
1 2 0 1 22 60 75 45
2 1 1 3 31 35 60 30
2 2 1 3 31 80 65 60
3 1 1 5 18 25 85 50
3 2 1 5 18 45 90 40
4 1 1 1 42 10 20 80
4 2 1 1 42 60 70 50
5 1 0 2 21 50 60 60
5 2 0 2 21 55 40 80
6 1 1 2 36 70 40 15
6 2 1 2 36 80 45 30
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
@Amanda_Lemon There you go, the more clearer the INPUT sample and the WANTED output sample, it makes it easy for all of us. If you are satisfied with answer, may i request to mark the answer that you accept and close the thread plz. Thank you!
Ran into a problem... It appears that if at least one of my test variables is a character variable, then the output of proc transpose for ALL test variables would be character. Is there any way to prevent that from happening? I need my numeric test variables to continue being numeric in the output. Below is the kind of input data set that I am talking about:
data have;
input ID Time Gender Race Age Test1 Test2 Test3 Test4 $;
datalines;
1 1 0 1 22 50 70 50 A
1 2 0 1 22 60 75 45 B
2 1 1 3 31 35 60 30 A
2 2 1 3 31 80 65 60 C
3 1 1 5 18 25 85 50 C
3 2 1 5 18 45 90 40 A
4 1 1 1 42 10 20 80 C
4 2 1 1 42 60 70 50 D
5 1 0 2 21 50 60 60 F
5 2 0 2 21 55 40 80 B
6 1 1 2 36 70 40 15 F
6 2 1 2 36 80 45 30 C
;
proc transpose data=have out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
Hi @Amanda_Lemon Can i respond to you soon as get to my college lab tomorrow morning chicago time. I'm afraid i dont have SAS software at home and it is 10:25pm here. Meanwhile, Please try and make sure you provide us the best sample you can.
Good night for now and will look into it first thing in the morning.
@Amanda_Lemon Good morning, Please find the below fix
data have;
input ID Time Gender Race Age Test1 Test2 Test3 Test4 $;
datalines;
1 1 0 1 22 50 70 50 A
1 2 0 1 22 60 75 45 B
2 1 1 3 31 35 60 30 A
2 2 1 3 31 80 65 60 C
3 1 1 5 18 25 85 50 C
3 2 1 5 18 45 90 40 A
4 1 1 1 42 10 20 80 C
4 2 1 1 42 60 70 50 D
5 1 0 2 21 50 60 60 F
5 2 0 2 21 55 40 80 B
6 1 1 2 36 70 40 15 F
6 2 1 2 36 80 45 30 C
;
/*separate the needed the vars by read and splitting into 2 sets chars and nums with by group vars*/
data chars(keep=id gender race age time _char_) nums(keep=id gender race age time _numeric_);
set have(keep=id gender race age time test:);
run;
/*parse the nums */
proc transpose data=nums out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want1(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
/*parse the chars */
proc transpose data=chars out=temp;
by id gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want2(drop=_name_) delim=_;
by id gender race age ;
var col1;
id _name_ time;
run;
/*combine the tranposed chars and nums into one*/
data final_want;
merge want1 want2;
run;
http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset
The macro on this page will do what you want, and it's pre-packaged. Worth learning how to use.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.