Solved: Re: Transposing multiple columns dataset into one row by subject ID

Rahim_221 · Posted 03-23-2023 10:26 AM

Hello,

I have a dataset that looks like this (sorted by ID and startdate):

ID Startdate Stopdate Result

1 02/03/2011 02/05/2011 1

1 03/22/2016 . 3

1 05/13/2016 05/13/2016 4

2 11/11/2011 11/13/2011 2

2 03/23/2012 03/23/2012 1

2 09/10/2016 . 4

2 02/02/2019 02/02/2019 2

I want the data transposed like this, so each Id has only one row:

ID startdate1 stopdate1 result1 startdate2 stopdate2 result2 startdate3 stopdate3 result3 startdate4 etc.

1 02/03/2011 02/05/2011 1 03/22/2016 . 3 05/13/2016 05/13/2016 4

2 11/11/2011 11/13/2011 2 03/23/2012 03/23/2012 1 09/10/2016 . 4 02/02/2019

I tried using data step for transposing but I end up having several transposed datasets that confuse me. I need help in creating a code that can get me the above dataset smoothly as I have to repeat this code in several different datasets.

Thanks in advance

ballardw · Posted 03-23-2023 10:46 AM

The first question to ask is what do you expect to do with that data set that you cannot do with the current structure?

The varying number of resulting variables means that any code using such a data structure is hard to maintain if you have more data later.

If the only purpose is for a person to read that data then perhaps a report is what you actually want such as:

data have;
  input ID   Startdate :mmddyy10.  Stopdate :mmddyy10.  Result;
  format startdate stopdate mmddyy10.;
datalines;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                     3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011            2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                     4
2            02/02/2019          02/02/2019            2
;

proc sort data=have;
   by id startdate.;
run;

data need;
   set have;
   by id;
   retain resultorder;
   if first.id then resultorder=1;
   else resultorder+1;
run;
proc report data=need;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

Please note the first data step is the preferred way to show example data. If you provide a data step then you have controlled the variable properties which are important.

Whatever you want you likely need to add an order variable, which I called ResultOrder, to identify the order.

View solution in original post

ballardw · Posted 03-23-2023 10:46 AM

The first question to ask is what do you expect to do with that data set that you cannot do with the current structure?

The varying number of resulting variables means that any code using such a data structure is hard to maintain if you have more data later.

If the only purpose is for a person to read that data then perhaps a report is what you actually want such as:

data have;
  input ID   Startdate :mmddyy10.  Stopdate :mmddyy10.  Result;
  format startdate stopdate mmddyy10.;
datalines;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                     3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011            2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                     4
2            02/02/2019          02/02/2019            2
;

proc sort data=have;
   by id startdate.;
run;

data need;
   set have;
   by id;
   retain resultorder;
   if first.id then resultorder=1;
   else resultorder+1;
run;
proc report data=need;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

Please note the first data step is the preferred way to show example data. If you provide a data step then you have controlled the variable properties which are important.

Whatever you want you likely need to add an order variable, which I called ResultOrder, to identify the order.

Rahim_221 · Posted 03-23-2023 11:10 AM

Thank you so much for replying. I tried your code and it worked as a report. The code creates what I wanted in a different way (which I personally like) but for data management purposes, I want the data "Need" itself to be transposed exactly as I described in my post. I have tried different codes but I end up getting 3 transposed columns each in one row, which is not really what I want to have as an outcome.

Thank you

ballardw · Posted 03-23-2023 11:23 AM

@Rahim_221 wrote:

Thank you so much for replying. I tried your code and it worked as a report. The code creates what I wanted in a different way (which I personally like) but for data management purposes, I want the data "Need" itself to be transposed exactly as I described in my post. I have tried different codes but I end up getting 3 transposed columns each in one row, which is not really what I want to have as an outcome.

Thank you

You can create data set with proc report:

proc report data=need out=reporttable;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

HOWEVER the variable names in this type of out put will be _C2_ _C3_ _C4_ . You could rename them

One of the obnoxious problems with wanting specific column orders is that you have to spend a lot of time futzing with names.

You could transpose each variable like this:

proc transpose data=need out=start (drop=_name_)
      prefix=Startdate;
   by id;
   id resultorder;
   var startdate;
run;

(an exercise for the interested reader to do the other variables)

and then combine the sets. However the easy way to combine the sets by MERGE will not have the column order you want.

Solutions to get unspecified numbers of created variables in order get into the macro language which often means that minor changes in the data can make a lot of work updating code.

Rahim_221 · Posted 03-23-2023 11:10 PM

Thank you 🙏

Ksharp · Posted 03-24-2023 07:57 AM

data have; 
input ID          (Startdate             Stopdate             Result) (:$40.);
cards;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                             3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011             2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                             4
2            02/02/2019          02/02/2019             2
;

data temp;
 set have;
 by id;
 if first.id then n=0;
 n+1;
run;
proc sql noprint;
select distinct catt('temp(where=(n=',n,') rename=(Startdate=Startdate',n,' Stopdate=Stopdate',n,' Result=Result',n,'))')
 into : merge separated by ' ' 
 from temp;
quit;

data want;
 merge &merge.;
 by id;
 drop n;
run;

Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Re: Transposing multiple columns dataset into one row by subject ID

Register Today!