BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Amanda_Lemon
Quartz | Level 8

Hello, 

 

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

 

ID  Time  TestScore Gender Race

1       1            5              1        2

1       2            7              1        2

2       1            3              2        4

2       2            8              2        4

3       1            1              1        3

3       2            4              1        3

 

Now, I want my data set to look like that: 

 

ID TestScore1 TestScore2 Gender Race

1           5                  7             1          2

2           3                  8             2          4

3           1                  4             1          3

 

I ran out of ideas to try... Is there an easy way to create a data set in this form?

 

Thank you in advance. 

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

@Amanda_Lemon Sorry for the little delay as  i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->

 

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3;
datalines;
1	1	0	1	22	50	70	50
1	2	0	1	22	60	75	45
2	1	1	3	31	35	60	30
2	2	1	3	31	80	65	60
3	1	1	5	18	25	85	50
3	2	1	5	18	45	90	40
4	1	1	1	42	10	20	80
4	2	1	1	42	60	70	50
5	1	0	2	21	50	60	60
5	2	0	2	21	55	40	80
6	1	1	2	36	70	40	15
6	2	1	2	36	80	45	30
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

 

View solution in original post

20 REPLIES 20
Reeza
Super User

PROC TRANSPOSE? Use ID, Gender, Race as your BY variables. 

 


@Amanda_Lemon wrote:

Hello, 

 

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

 

ID  Time  TestScore Gender Race

1       1            5              1        2

1       2            7              1        2

2       1            3              2        4

2       2            8              2        4

3       1            1              1        3

3       2            4              1        3

 

Now, I want my data set to look like that: 

 

ID TestScore1 TestScore2 Gender Race

1           5                  7             1          2

2           3                  8             2          4

3           1                  4             1          3

 

I ran out of ideas to try... Is there an easy way to create a data set in this form?

 

Thank you in advance. 


 

Reeza
Super User

PROC TRANSPOSE? Use ID, Gender, Race as your BY variables. 

 


@Amanda_Lemon wrote:

Hello, 

 

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

 

ID  Time  TestScore Gender Race

1       1            5              1        2

1       2            7              1        2

2       1            3              2        4

2       2            8              2        4

3       1            1              1        3

3       2            4              1        3

 

Now, I want my data set to look like that: 

 

ID TestScore1 TestScore2 Gender Race

1           5                  7             1          2

2           3                  8             2          4

3           1                  4             1          3

 

I ran out of ideas to try... Is there an easy way to create a data set in this form?

 

Thank you in advance. 


 

novinosrin
Tourmaline | Level 20
data have;
input ID  Time  TestScore Gender Race;
cards;
1       1            5              1        2
1       2            7              1        2
2       1            3              2        4
2       2            8              2        4
3       1            1              1        3
3       2            4              1        3
;
run;

proc transpose data=have out=want(drop=_:) prefix=testscore;
by id gender race;
var TestScore;
run;
Amanda_Lemon
Quartz | Level 8

So, this code works only if I have 1 test. But now I have many tests... So, I thought maybe I can do this code for all of my test variables separately and then just merge the data sets. 

 

Here is what I came up with so far. The problem is that I don't know how to assign names to the "want" data set so that each newly created want has its unique name, e.g., want_Test1, want_Test2, etc. (So that I can merge them at the end by the ID and demographic variables). The same problem with COL1 and COL2 that are in each want data set -- I need to rename COL1 into Test1_1 and COL2 into Test1_2 in the first data set, COL1 into Test2_1 and COL2 into Test2_2 in the second data set, etc. Could someone please help me with that?

 

data varset_test;
length test_name $40;
do test_name = 'Test1', 'Test2', 'Test3';
     output;
end;
run;
%macro trans(test=);
proc transpose data=have out=want(drop=_:);
by ID Race Gender Age;
var &test;
run;
%mend trans;
data new;
set varset_test;
call execute('%trans(test='||strip(test_name)||');');
run; 

Thank you in advance. 

 

P.S. Maybe there is an easier way to do transposing for multiple test variables than what I am trying to do? 

novinosrin
Tourmaline | Level 20

Can you please provide us a full comprehensive sample of your dataset and an output sample to that input sample that is great representative plz? This would help in focusing on getting to develop a solution in one go and in full. 

Amanda_Lemon
Quartz | Level 8

Sure! So, the first table below is what I have: People were tested twice, and for each testing time there is a separate row for each person. Test scores are different between rows within a person but demographic variables are the same. 

 

IDTimeGenderRaceAgeTest1Test2Test3
110122507050
120122607545
211331356030
221331806560
311518258550
321518459040
411142102080
421142607050
510221506060
520221554080
611236704015
621236804530

 

The second table is what I want: to have one row per person and to have two variables for each test -- one variable for the test at Time 1 and another variable for the test at Time 2. And like that for each test.

 

IDGenderRaceAgeTest1_1Test1_2Test2_1Test2_2Test3_1Test3_2
10122506070755045
21331358060653060
31518254585905040
41142106020708050
50221505560406080
61236708040451530
novinosrin
Tourmaline | Level 20

@Amanda_Lemon Sorry for the little delay as  i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->

 

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3;
datalines;
1	1	0	1	22	50	70	50
1	2	0	1	22	60	75	45
2	1	1	3	31	35	60	30
2	2	1	3	31	80	65	60
3	1	1	5	18	25	85	50
3	2	1	5	18	45	90	40
4	1	1	1	42	10	20	80
4	2	1	1	42	60	70	50
5	1	0	2	21	50	60	60
5	2	0	2	21	55	40	80
6	1	1	2	36	70	40	15
6	2	1	2	36	80	45	30
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

 

Amanda_Lemon
Quartz | Level 8
Oh WOW! So easy! Thank you so much!!!
novinosrin
Tourmaline | Level 20

@Amanda_Lemon There you go, the more clearer the INPUT sample and the WANTED  output sample, it makes it easy for all of us. If you are satisfied with answer, may i request to mark the answer that you accept and close the thread plz. Thank you!

Amanda_Lemon
Quartz | Level 8

Ran into a problem... It appears that if at least one of my test variables is a character variable, then the output of proc transpose for ALL test variables would be character. Is there any way to prevent that from happening? I need my numeric test variables to continue being numeric in the output. Below is the kind of input data set that I am talking about: 

 

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3 Test4 $;
datalines;
1	1	0	1	22	50	70	50  A
1	2	0	1	22	60	75	45  B
2	1	1	3	31	35	60	30  A
2	2	1	3	31	80	65	60  C
3	1	1	5	18	25	85	50  C
3	2	1	5	18	45	90	40  A
4	1	1	1	42	10	20	80  C
4	2	1	1	42	60	70	50  D
5	1	0	2	21	50	60	60  F
5	2	0	2	21	55	40	80  B
6	1	1	2	36	70	40	15  F
6	2	1	2	36	80	45	30  C
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;  

 

novinosrin
Tourmaline | Level 20

Hi @Amanda_Lemon  Can i respond to you soon as get to my college lab tomorrow morning chicago time. I'm afraid i dont have SAS software at home and it is 10:25pm here. Meanwhile, Please try and make sure you provide us the best sample you can.

Good night for now and will look into it first thing in the morning. 

Amanda_Lemon
Quartz | Level 8
Absolutely! Thank you for your help!

The sample above is quite representative, I think. The problem is in adding character variables -- without them, numeric variables transpose fine, but with them, numeric variables become character...
novinosrin
Tourmaline | Level 20

@Amanda_Lemon  Good morning, Please find the below fix 

 

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3 Test4 $;
datalines;
1	1	0	1	22	50	70	50  A
1	2	0	1	22	60	75	45  B
2	1	1	3	31	35	60	30  A
2	2	1	3	31	80	65	60  C
3	1	1	5	18	25	85	50  C
3	2	1	5	18	45	90	40  A
4	1	1	1	42	10	20	80  C
4	2	1	1	42	60	70	50  D
5	1	0	2	21	50	60	60  F
5	2	0	2	21	55	40	80  B
6	1	1	2	36	70	40	15  F
6	2	1	2	36	80	45	30  C
;
/*separate the needed the vars by read and splitting into 2 sets chars and nums with by group vars*/
data chars(keep=id  gender race age time _char_) nums(keep=id  gender race age time _numeric_);
set have(keep=id  gender race age time test:);
run;
/*parse the nums */
proc transpose data=nums out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want1(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;
/*parse the chars */
proc transpose data=chars out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want2(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;
/*combine the tranposed chars and nums into one*/
data final_want;
merge want1 want2;
run;



 

 

Reeza
Super User

http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset

 

The macro on this page will do what you want, and it's pre-packaged. Worth learning how to use. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 20 replies
  • 3483 views
  • 3 likes
  • 3 in conversation