Solved: Re: Creating columns out of rows

Amanda_Lemon · Posted 06-20-2018 04:19 PM

Hello,

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

ID Time TestScore Gender Race

1 1 5 1 2

1 2 7 1 2

2 1 3 2 4

2 2 8 2 4

3 1 1 1 3

3 2 4 1 3

Now, I want my data set to look like that:

ID TestScore1 TestScore2 Gender Race

1 5 7 1 2

2 3 8 2 4

3 1 4 1 3

I ran out of ideas to try... Is there an easy way to create a data set in this form?

Thank you in advance.

novinosrin · Posted 07-03-2018 10:50 AM

@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3;
datalines;
1	1	0	1	22	50	70	50
1	2	0	1	22	60	75	45
2	1	1	3	31	35	60	30
2	2	1	3	31	80	65	60
3	1	1	5	18	25	85	50
3	2	1	5	18	45	90	40
4	1	1	1	42	10	20	80
4	2	1	1	42	60	70	50
5	1	0	2	21	50	60	60
5	2	0	2	21	55	40	80
6	1	1	2	36	70	40	15
6	2	1	2	36	80	45	30
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

View solution in original post

Reeza · Posted 06-20-2018 04:26 PM

PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.

@Amanda_Lemon wrote:

Hello,

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

ID Time TestScore Gender Race

1 1 5 1 2

1 2 7 1 2

2 1 3 2 4

2 2 8 2 4

3 1 1 1 3

3 2 4 1 3

Now, I want my data set to look like that:

ID TestScore1 TestScore2 Gender Race

1 5 7 1 2

2 3 8 2 4

3 1 4 1 3

I ran out of ideas to try... Is there an easy way to create a data set in this form?

Thank you in advance.

Reeza · Posted 06-20-2018 04:26 PM

PROC TRANSPOSE? Use ID, Gender, Race as your BY variables.

@Amanda_Lemon wrote:

Hello,

I have a data set that looks like this: each person (ID) was tested twice: at Time 1 and at Time 2 (so there are two rows for each ID). Each person also has its gender and race (these values will not differ for the same person).

ID Time TestScore Gender Race

1 1 5 1 2

1 2 7 1 2

2 1 3 2 4

2 2 8 2 4

3 1 1 1 3

3 2 4 1 3

Now, I want my data set to look like that:

ID TestScore1 TestScore2 Gender Race

1 5 7 1 2

2 3 8 2 4

3 1 4 1 3

I ran out of ideas to try... Is there an easy way to create a data set in this form?

Thank you in advance.

novinosrin · Posted 06-20-2018 04:31 PM

data have;
input ID  Time  TestScore Gender Race;
cards;
1       1            5              1        2
1       2            7              1        2
2       1            3              2        4
2       2            8              2        4
3       1            1              1        3
3       2            4              1        3
;
run;

proc transpose data=have out=want(drop=_:) prefix=testscore;
by id gender race;
var TestScore;
run;

Amanda_Lemon · Posted 07-03-2018 09:45 AM

So, this code works only if I have 1 test. But now I have many tests... So, I thought maybe I can do this code for all of my test variables separately and then just merge the data sets.

Here is what I came up with so far. The problem is that I don't know how to assign names to the "want" data set so that each newly created want has its unique name, e.g., want_Test1, want_Test2, etc. (So that I can merge them at the end by the ID and demographic variables). The same problem with COL1 and COL2 that are in each want data set -- I need to rename COL1 into Test1_1 and COL2 into Test1_2 in the first data set, COL1 into Test2_1 and COL2 into Test2_2 in the second data set, etc. Could someone please help me with that?

data varset_test;
length test_name $40;
do test_name = 'Test1', 'Test2', 'Test3';
     output;
end;
run;
%macro trans(test=);
proc transpose data=have out=want(drop=_:);
by ID Race Gender Age;
var &test;
run;
%mend trans;
data new;
set varset_test;
call execute('%trans(test='||strip(test_name)||');');
run;

Thank you in advance.

P.S. Maybe there is an easier way to do transposing for multiple test variables than what I am trying to do?

novinosrin · Posted 07-03-2018 10:03 AM

Can you please provide us a full comprehensive sample of your dataset and an output sample to that input sample that is great representative plz? This would help in focusing on getting to develop a solution in one go and in full.

Amanda_Lemon · Posted 07-03-2018 10:19 AM

Sure! So, the first table below is what I have: People were tested twice, and for each testing time there is a separate row for each person. Test scores are different between rows within a person but demographic variables are the same.

ID	Time	Gender	Race	Age	Test1	Test2	Test3
1	1	0	1	22	50	70	50
1	2	0	1	22	60	75	45
2	1	1	3	31	35	60	30
2	2	1	3	31	80	65	60
3	1	1	5	18	25	85	50
3	2	1	5	18	45	90	40
4	1	1	1	42	10	20	80
4	2	1	1	42	60	70	50
5	1	0	2	21	50	60	60
5	2	0	2	21	55	40	80
6	1	1	2	36	70	40	15
6	2	1	2	36	80	45	30

The second table is what I want: to have one row per person and to have two variables for each test -- one variable for the test at Time 1 and another variable for the test at Time 2. And like that for each test.

ID	Gender	Race	Age	Test1_1	Test1_2	Test2_1	Test2_2	Test3_1	Test3_2
1	0	1	22	50	60	70	75	50	45
2	1	3	31	35	80	60	65	30	60
3	1	5	18	25	45	85	90	50	40
4	1	1	42	10	60	20	70	80	50
5	0	2	21	50	55	60	40	60	80
6	1	2	36	70	80	40	45	15	30

novinosrin · Posted 07-03-2018 10:50 AM

@Amanda_Lemon Sorry for the little delay as i was away for my morning coffee. Hmm Thank you for better sample and clear requirement. Very straight forward indeed with a double transpose-->

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3;
datalines;
1	1	0	1	22	50	70	50
1	2	0	1	22	60	75	45
2	1	1	3	31	35	60	30
2	2	1	3	31	80	65	60
3	1	1	5	18	25	85	50
3	2	1	5	18	45	90	40
4	1	1	1	42	10	20	80
4	2	1	1	42	60	70	50
5	1	0	2	21	50	60	60
5	2	0	2	21	55	40	80
6	1	1	2	36	70	40	15
6	2	1	2	36	80	45	30
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

Amanda_Lemon · Posted 07-03-2018 11:00 AM

Oh WOW! So easy! Thank you so much!!!

novinosrin · Posted 07-03-2018 11:03 AM

@Amanda_Lemon There you go, the more clearer the INPUT sample and the WANTED output sample, it makes it easy for all of us. If you are satisfied with answer, may i request to mark the answer that you accept and close the thread plz. Thank you!

Amanda_Lemon · Posted 07-04-2018 11:20 PM

Ran into a problem... It appears that if at least one of my test variables is a character variable, then the output of proc transpose for ALL test variables would be character. Is there any way to prevent that from happening? I need my numeric test variables to continue being numeric in the output. Below is the kind of input data set that I am talking about:

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3 Test4 $;
datalines;
1	1	0	1	22	50	70	50  A
1	2	0	1	22	60	75	45  B
2	1	1	3	31	35	60	30  A
2	2	1	3	31	80	65	60  C
3	1	1	5	18	25	85	50  C
3	2	1	5	18	45	90	40  A
4	1	1	1	42	10	20	80  C
4	2	1	1	42	60	70	50  D
5	1	0	2	21	50	60	60  F
5	2	0	2	21	55	40	80  B
6	1	1	2	36	70	40	15  F
6	2	1	2	36	80	45	30  C
;
proc transpose data=have out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

proc transpose data=temp out=want(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;

novinosrin · Posted 07-04-2018 11:26 PM

Hi @Amanda_Lemon Can i respond to you soon as get to my college lab tomorrow morning chicago time. I'm afraid i dont have SAS software at home and it is 10:25pm here. Meanwhile, Please try and make sure you provide us the best sample you can.

Good night for now and will look into it first thing in the morning.

Amanda_Lemon · Posted 07-05-2018 08:31 AM

Absolutely! Thank you for your help!

The sample above is quite representative, I think. The problem is in adding character variables -- without them, numeric variables transpose fine, but with them, numeric variables become character...

novinosrin · Posted 07-05-2018 10:12 AM

@Amanda_Lemon Good morning, Please find the below fix

data have;
input ID	Time	Gender	Race	Age	Test1	Test2	Test3 Test4 $;
datalines;
1	1	0	1	22	50	70	50  A
1	2	0	1	22	60	75	45  B
2	1	1	3	31	35	60	30  A
2	2	1	3	31	80	65	60  C
3	1	1	5	18	25	85	50  C
3	2	1	5	18	45	90	40  A
4	1	1	1	42	10	20	80  C
4	2	1	1	42	60	70	50  D
5	1	0	2	21	50	60	60  F
5	2	0	2	21	55	40	80  B
6	1	1	2	36	70	40	15  F
6	2	1	2	36	80	45	30  C
;
/*separate the needed the vars by read and splitting into 2 sets chars and nums with by group vars*/
data chars(keep=id  gender race age time _char_) nums(keep=id  gender race age time _numeric_);
set have(keep=id  gender race age time test:);
run;
/*parse the nums */
proc transpose data=nums out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want1(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;
/*parse the chars */
proc transpose data=chars out=temp;
by id  gender race age time;
var test:;
run;
proc sort data=temp;
by id gender race age _name_ time;
run;
proc transpose data=temp out=want2(drop=_name_) delim=_;
by id gender race age  ;
var col1;
id _name_ time;
run;
/*combine the tranposed chars and nums into one*/
data final_want;
merge want1 want2;
run;

Reeza · Posted 07-05-2018 10:52 AM

http://www.sascommunity.org/wiki/A_Better_Way_to_Flip_(Transpose)_a_SAS_Dataset

The macro on this page will do what you want, and it's pre-packaged. Worth learning how to use.

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away