BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Rahim_221
Calcite | Level 5

Hello,

 

I have a dataset that looks like this (sorted by ID and startdate):

ID          Startdate             Stopdate             Result

1           02/03/2011           02/05/2011            1

1           03/22/2016           .                             3

1           05/13/2016          05/13/2016             4

2           11/11/2011           11/13/2011             2

2           03/23/2012           03/23/2012            1

2           09/10/2016           .                             4

2            02/02/2019          02/02/2019             2

I want the data transposed like this, so each Id has only one row:

 

ID      startdate1    stopdate1   result1       startdate2    stopdate2  result2   startdate3 stopdate3  result3  startdate4  etc.

1        02/03/2011    02/05/2011   1          03/22/2016           .            3        05/13/2016  05/13/2016     4

2        11/11/2011   11/13/2011     2       03/23/2012    03/23/2012     1        09/10/2016           .             4  02/02/2019        

 

I tried using data step for transposing but I end up having several transposed datasets that confuse me. I need help in creating a code that can get me the above dataset smoothly as I have to repeat this code in several different datasets.

 

Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

The first question to ask is what do you expect to do with that data set that you cannot do with the current structure?

 

The varying number of resulting variables means that any code using such a data structure is hard to maintain if you have more data later.

 

If the only purpose is for a person to read that data then perhaps a report is what you actually want such as:

data have;
  input ID   Startdate :mmddyy10.  Stopdate :mmddyy10.  Result;
  format startdate stopdate mmddyy10.;
datalines;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                     3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011            2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                     4
2            02/02/2019          02/02/2019            2
;

proc sort data=have;
   by id startdate.;
run;

data need;
   set have;
   by id;
   retain resultorder;
   if first.id then resultorder=1;
   else resultorder+1;
run;
proc report data=need;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

Please note the first data step is the preferred way to show example data. If you provide a data step then you have controlled the variable properties which are important.

Whatever you want you likely need to add an order variable, which I called ResultOrder, to identify the order.

View solution in original post

5 REPLIES 5
ballardw
Super User

The first question to ask is what do you expect to do with that data set that you cannot do with the current structure?

 

The varying number of resulting variables means that any code using such a data structure is hard to maintain if you have more data later.

 

If the only purpose is for a person to read that data then perhaps a report is what you actually want such as:

data have;
  input ID   Startdate :mmddyy10.  Stopdate :mmddyy10.  Result;
  format startdate stopdate mmddyy10.;
datalines;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                     3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011            2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                     4
2            02/02/2019          02/02/2019            2
;

proc sort data=have;
   by id startdate.;
run;

data need;
   set have;
   by id;
   retain resultorder;
   if first.id then resultorder=1;
   else resultorder+1;
run;
proc report data=need;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

Please note the first data step is the preferred way to show example data. If you provide a data step then you have controlled the variable properties which are important.

Whatever you want you likely need to add an order variable, which I called ResultOrder, to identify the order.

Rahim_221
Calcite | Level 5

Thank you so much for replying. I tried your code and it worked as a report. The code creates what I wanted in a different way (which I personally like) but for data management purposes, I want the data "Need" itself to be transposed exactly as I described in my post. I have tried different codes but I end up getting 3 transposed columns each in one row, which is not really what I want to have as an outcome.

 

Thank you

ballardw
Super User

@Rahim_221 wrote:

Thank you so much for replying. I tried your code and it worked as a report. The code creates what I wanted in a different way (which I personally like) but for data management purposes, I want the data "Need" itself to be transposed exactly as I described in my post. I have tried different codes but I end up getting 3 transposed columns each in one row, which is not really what I want to have as an outcome.

 

Thank you


You can create data set with proc report:

 

proc report data=need out=reporttable;
   columns id resultorder,( startdate stopdate result);
   define id /group;
   define resultorder/across "Result order";
run;

HOWEVER the variable names in this type of out put will be _C2_ _C3_ _C4_ . You could rename them

 

One of the obnoxious problems with wanting specific column orders is that you have to spend a lot of time futzing with names.

You could transpose each variable like this:

proc transpose data=need out=start (drop=_name_)
      prefix=Startdate;
   by id;
   id resultorder;
   var startdate;
run;

(an exercise for the interested reader to do the other variables)

and then combine the sets. However the easy way to combine the sets by MERGE will not have the column order you want.

Solutions to get unspecified numbers of created variables in order get into the macro language which often means that minor changes in the data can make a lot of work updating code.

Ksharp
Super User
data have; 
input ID          (Startdate             Stopdate             Result) (:$40.);
cards;
1           02/03/2011           02/05/2011            1
1           03/22/2016           .                             3
1           05/13/2016          05/13/2016             4
2           11/11/2011           11/13/2011             2
2           03/23/2012           03/23/2012            1
2           09/10/2016           .                             4
2            02/02/2019          02/02/2019             2
;

data temp;
 set have;
 by id;
 if first.id then n=0;
 n+1;
run;
proc sql noprint;
select distinct catt('temp(where=(n=',n,') rename=(Startdate=Startdate',n,' Stopdate=Stopdate',n,' Result=Result',n,'))')
 into : merge separated by ' ' 
 from temp;
quit;

data want;
 merge &merge.;
 by id;
 drop n;
run;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1232 views
  • 1 like
  • 3 in conversation