Solved: Re: Data transpose and rearrange the columns/values

Bintang18 · Posted 08-16-2019 12:11 PM

Hello,

My data set :

studentid test_missed districtcode reason_not_tested

2 English 56 22

2 Algebra 56 22

2 History 56 22

2 Biology 56 22

2 Writing 71 22

3 History 89 19

4 English 62 17

5 English 77 20

5 Biology 77 20

5 Algebra 79 20

6 Writing 80 22

I transposed the data so each student only has 1 row. I have many more variables, but only 3 vars need to be transposed. I used the method presented in Paper 2785-2015 "Transpose Dataset by MERGE" by Shan, Kastin, Tabachneck, which is a 'life saver' for me. I was able to transpose but I just could not rearrange the test_missed (number).

data snt19;
set snt19;
by studentid ;
if first.studentid then n=0;
n+1;
run;

proc sql noprint ;
select distinct catt ('snt19 (where=(n=',left(put(n,8.)),
') rename= (districtcode=districtcode',left(put(n,8.)),
' reason_not_tested=reason_not_tested',left(put(n,8.)),
' test_missed=test_missed',left(put(n,8.)),

'))')
into :mer separated by ' '
from snt19 ;
quit;

data want;
merge &mer ;
by studentid ;
drop n;

run;

id districtcode1 reason_not_tested1 test_missed1 districtcode2 reason_not_tested2 test_missed2 districtcode3 ..... test_3

2 56 22 English 56 22 Algebra 56 History

5 77 20 English 77 20 Biology 79 Algebra

and so on.

what I need is something like this:

id test_missed1 test_missed2 test_missed3 test_missed4 test_missed5

2 Algebra English Biology History Writing

3 History

4 English

5 Algebra English Biology

6 Writing

I hope this makes sense. Thanks all.

BScholtz

Tom · Posted 08-16-2019 03:41 PM

Does not look like you want alphabetical order because you have English before Biology.

You will need to create a new variable you can use to sort.

Do you have the order already in a dataset?

If the list is known you might want to make a format (or perhaps an INformat).

data have ;
  input studentid $ test_missed :$20. districtcode reason_not_tested ;
cards;
 2  English  56  22
 2  Algebra  56  22
 2  History  56  22
 2  Biology  56  22
 2  Writing  71  22
 3  History  89  19
 4  English  62  17
 5  English  77  20
 5  Biology  77  20
 5  Algebra  79  20
 6  Writing  80  22
;

proc format ;
invalue test_order 
 'Algebra'=1
 'English'=2
 'Biology'=3
 'History'=4
 'Writing'=5
 other=999
;
run;

data middle;
 set have;
 test_order=input(test_missed,test_order.);
run;
proc sort data=middle;
  by studentid test_order;
run;

proc transpose data=middle out=want(drop=_name_) prefix=test_missed ;
  by studentid;
  var test_missed;
run;

View solution in original post

Reeza · Posted 08-16-2019 12:29 PM

Did you try a straight PROC TRANSPOSE? That's what it looks like to me.

Transposing data tutorials:
Long to Wide:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-long-to-wide-using-proc-transpose/

https://stats.idre.ucla.edu/sas/modules/reshaping-data-long-to-wide-using-the-data-step/

Wide to Long:
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/

https://stats.idre.ucla.edu/sas/modules/reshaping-data-wide-to-long-using-a-data-step/

And sometimes a double transpose is needed for extra wide data sets:
https://gist.github.com/statgeek/2321b6f62ab78d5bf2b0a5a8626bd7cd

novinosrin · Posted 08-16-2019 12:30 PM

data have;
input studentid      test_missed  : $20.     districtcode    reason_not_tested;
cards;
2                  English                56                           22
2                   Algebra               56                            22
2                   History                  56                          22
2                     Biology               56                           22
2                     Writing                 71                          22
3                     History                 89                          19
4                     English                  62                          17
5                   English                    77                          20
5                    Biology                   77                          20
5                    Algebra                  79                          20
6                   Writing                    80                            22
;

proc transpose data=have out=want prefix=Test_missed;
by studentid;
var test_missed;
run;

ballardw · Posted 08-16-2019 12:51 PM

I don't think you are clearly stating your objective.

It the idea is to have the names of the tests missed in alphabetical order I think you may be doing way more code than is needed.

It helps to provide data in the form of a data step so we can test things.

For your example data this provides the desired result

data have;
  input studentid test_missed :$10. districtcode reason_not_tested;
datalines;
2 English 56 22
2 Algebra 56 22
2 History 56 22
2 Biology 56 22
2 Writing 71 22
3 History 89 19
4 English 62 17
5 English 77 20
5 Biology 77 20
5 Algebra 79 20
6 Writing 80 22
;
proc sort data=have;
   by  studentid test_missed;
run;
proc transpose data=have out=temp
   prefix=test_missed;
by  studentid;
var test_missed;
run;

If by chance your student ID duplicates across districtcode values then use

BY DISTRICTCODE STUDENTID;

I the Sort and Transpose steps.

art297 · Posted 08-16-2019 01:05 PM

Like others have said, I don't think you stated exactly what you want. If you simply want to reorder the WANT file so that studentid is followed by districtcode1-districtcode5, reason_not_tested1-reason_not_tested5, and test_missed1-test_missed5, then you could do it by adding a proc sql and data step to the end of your current code. i.e.,

data snt19;
  input studentid test_missed $ districtcode reason_not_tested;
  cards;
2                  English                56                           22
2                   Algebra               56                            22
2                   History                  56                          22
2                     Biology               56                           22
2                     Writing                 71                          22
3                     History                 89                          19
4                     English                  62                          17
5                   English                    77                          20
5                    Biology                   77                          20
5                    Algebra                  79                          20
6                   Writing                    80                            22
;

data snt19;
set snt19;
by studentid ;
if first.studentid then n=0;
n+1;
run;

proc sql noprint ;
select distinct catt ('snt19 (where=(n=',left(put(n,8.)),
') rename= (districtcode=districtcode',left(put(n,8.)),
' reason_not_tested=reason_not_tested',left(put(n,8.)),
' test_missed=test_missed',left(put(n,8.)),
'))')
into :mer separated by ' '
from snt19 ;
quit;
 
data want;
merge &mer ;
by studentid ;
drop n;
run; 

proc sql noprint;
    select name
      into :retain separated by ' '
        from dictionary.columns
          where libname='WORK' and
                memname='WANT' and
                name ne 'studentid'
            order by name
  ;
quit;

data want;
  retain studentid &retain.;
  set want;
run;

Art, CEO, AnalystFinder.com

Bintang18 · Posted 08-16-2019 03:25 PM

I'll try to make myself clearer. The main objective is to rearrange test_missed so, like studentid 2, after transposed his row should be :

test_missed1 test_missed2 test_missed3 test_missed4 test_missed5

Algebra English Biology History Writing

and NOT

test_missed1 test_missed2 test_missed3 test_missed4 test_missed5

English Algebra History Biology Writing

For some reasons, the agency wants test_missed to follow that order. It has nothing to do with alphabetical order, studentid or anything else. If student A missed only History then his test_missed1 is History. If student B missed English and Writing then his test_missed1 is English and test_missed2 is Writing.

I tried if..then..method but it was so convoluted, I got nowhere.

thank you again!

Reeza · Posted 08-16-2019 03:27 PM

So what are the rules for the ordering?

Bintang18 · Posted 08-16-2019 03:46 PM

Apparently they've been doing this for a while, follow a pattern (?) or maybe each course has different weigh. First time doing this.

Tom · Posted 08-16-2019 03:41 PM

Does not look like you want alphabetical order because you have English before Biology.

You will need to create a new variable you can use to sort.

Do you have the order already in a dataset?

If the list is known you might want to make a format (or perhaps an INformat).

data have ;
  input studentid $ test_missed :$20. districtcode reason_not_tested ;
cards;
 2  English  56  22
 2  Algebra  56  22
 2  History  56  22
 2  Biology  56  22
 2  Writing  71  22
 3  History  89  19
 4  English  62  17
 5  English  77  20
 5  Biology  77  20
 5  Algebra  79  20
 6  Writing  80  22
;

proc format ;
invalue test_order 
 'Algebra'=1
 'English'=2
 'Biology'=3
 'History'=4
 'Writing'=5
 other=999
;
run;

data middle;
 set have;
 test_order=input(test_missed,test_order.);
run;
proc sort data=middle;
  by studentid test_order;
run;

proc transpose data=middle out=want(drop=_name_) prefix=test_missed ;
  by studentid;
  var test_missed;
run;

Bintang18 · Posted 08-16-2019 04:08 PM

Thanks Tom, this is exactly what I need.

Ready to join fellow brilliant minds for the SAS Hackathon?