DATA Step, Macro, Functions and more

Longitudinal to wide

Accepted Solution Solved
Reply
Super Contributor
Posts: 647
Accepted Solution

Longitudinal to wide

For  data shown below, I need to convert the data based on FSDT. A contract can have multiple HCCs.But on the same FSDT if it has more than one HCC that needs to be displayed as shown in second data layout.Any help is appreciated.

contract        FSDT              DxDEN          HCC
1001        01/01/2010            43002          HCC-108
1001        01/01/2010            43004          HCC-109
1001        03/01/2010            43004          HCC-109
1002        03/01/2010            43002          HCC-108
1003        04/01/2010            43004          HCC-109
1003        04/01/2010            43006          HCC-90
1003        04/01/2010            43002          HCC-108
1003        05/01/2010            43002          HCC-108

Contract    FSDT                  Dxden           HCC1           HCC2          HCC3
1001        01/01/2010            43002          HCC-108       HCC-109
1001        03/01/2010            43004          HCC-109
1002        03/01/2010            43002          HCC-108
1003        04/01/2010            43006          HCC-90        HCC-108        HCC-109
1003        05/01/2010            43002          HCC-108


Accepted Solutions
Solution
‎02-08-2012 05:47 PM
Super User
Super User
Posts: 6,502

Longitudinal to wide

Proc Transpose can do that. Getting to pick only the first value of DXDEN is a little tricky though.

proc transpose prefix=hcc out=want(where=(_name_ ne ' ')) ;

  by contract fsdt ;

  copy dxden;

  var hcc;

run;

View solution in original post


All Replies
Solution
‎02-08-2012 05:47 PM
Super User
Super User
Posts: 6,502

Longitudinal to wide

Proc Transpose can do that. Getting to pick only the first value of DXDEN is a little tricky though.

proc transpose prefix=hcc out=want(where=(_name_ ne ' ')) ;

  by contract fsdt ;

  copy dxden;

  var hcc;

run;

Super Contributor
Posts: 1,636

Longitudinal to wide

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

;

data temp;

  set have;

   by contract FSDT;

  count+first.fsdt;

proc sort;by count;

data want(drop=hcc count ct);

set temp;

by count;

array h{*}$ hcc1-hcc3;

retain  contract FSDT DxDEN hcc:;

if first.count then ct=0;

ct+1;

h{ct}=hcc;

if last.count then do;output; call missing(of h{*});

end;

run;

proc print;run;

Obs          fsdt             contract    DxDEN     hcc1          hcc2            hcc3

       1     01/01/2010      1001      43004    HCC-108    HCC-109

       2     03/01/2010      1001      43004    HCC-109

       3     03/01/2010      1002      43002    HCC-108

       4     04/01/2010      1003      43002    HCC-109    HCC-90     HCC-108

       5     05/01/2010      1003      43002    HCC-108

Contributor
Posts: 23

Re: Longitudinal to wide

Hello SASPhile, As Tom indicated proc transpose plus a data step should get the job done as well.

data contract;  

input contract FSDT mmddyy10. DxDEN HCC $;   format fsdt mmddyy10.;

datalines;

1001 01/01/2010 43002 HCC-108

1001 01/01/2010 43004 HCC-109

1001 03/01/2010 43004 HCC-109

1002 03/01/2010 43002 HCC-108

1003 04/01/2010 43004 HCC-109

1003 04/01/2010 43006 HCC-90

1003 04/01/2010 43002 HCC-108

1003 05/01/2010 43002 HCC-108

;

run;

proc transpose data = contract out=twide;  

by contract fsdt;  

copy dxden;  

var hcc;

run;

data wide (drop=_name_);  

set twide;  

if _name_ = '' then delete;  

rename col1 = HCC1 col2 = HCC2 col3 = HCC3;

run;

HTH,

Rich

Respected Advisor
Posts: 3,124

Longitudinal to wide

Like Tom said, choose the right Dxden can be tricky.

Here is another version based on LinLin's code, it managed to meet op's need. One thing I don't like is that it is still involving some hard coding.

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

;

data have;

set have;

order=input(substr(hcc,anyalpha(hcc,-10)+2),3.);

run;

proc sort data=have;

by contract fsdt order;

run;

data want (drop=hcc dxden order rename=dx=DxDen);

  do _n_=1 by 1 until (last.fsdt);

      set have;

      by contract fsdt order;

      array h $ hcc1-hcc3;

      if first.fsdt then dx=dxden;

      h(_n_)=hcc;

  end;

  output;

run;

proc print;run;

Super Contributor
Posts: 647

Longitudinal to wide

Thanks Hai.Kuo

Super Contributor
Posts: 647

Longitudinal to wide

Hi Hai.Kuo,

For the data below HCC is missing for the last record.If I apply the same logic as yours,I'm getting an error "

Array subscript out of range".

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1004        05/01/2010            43008 

;

Super User
Super User
Posts: 6,502

Longitudinal to wide

The only way to get an array out of bound would be if there were more than 3 observations for the same CONTRACT/FSDT combination.

You might want to change the upperbound and the ARRAY statement if 3 is not enough.

This is why PROC TRANSPOSE is more convenient.  You do not need to count in advance how many possible values there are.

Super Contributor
Posts: 647

Longitudinal to wide

I wish there is a wildcard kind of operator to define upper bound.

Respected Advisor
Posts: 3,124

Longitudinal to wide

That is odd. I can't repeat your problem. You need to make sure that there is no empty rows in your 'have' table which may require you to expand  array range.

Run this one to decide how many elements you are going to need in your array:

proc sql;

select max(ct) from

(select (count(*)) as ct from have

group by contract,fsdt)

;

quit;

HTH,

Haikuo

Super Contributor
Posts: 1,636

Re: Longitudinal to wide

I don't have any problem to run your code with the new dataset:

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1004        05/01/2010            43008

;

data have;

set have;

order=input(substr(hcc,anyalpha(hcc,-10)+2),3.);

run;

proc sort data=have;

by contract fsdt order;

run;

data want (drop=hcc dxden order rename=dx=DxDen);

  do _n_=1 by 1 until (last.fsdt);

      set have;

      by contract fsdt order;

      array h $ hcc1-hcc3;

      if first.fsdt then dx=dxden;

      h(_n_)=hcc;

  end;

  output;

run;

proc print;run;

Obs         fsdt   contract    hcc1      hcc2      hcc3     DxDen

1    01/01/2010     1001     HCC-108   HCC-109             43002

2    03/01/2010     1001     HCC-109                       43004

3    03/01/2010     1002     HCC-108                       43002

4    04/01/2010     1003     HCC-90    HCC-108   HCC-109   43006

5    05/01/2010     1003     HCC-108                       43002

6    05/01/2010     1004                                   43008

Super Contributor
Posts: 647

Longitudinal to wide

This works fine. As Tom mentioned, if there are more than three records per FSDT then it will go out of range.For instance the following gives an error.

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1004        05/01/2010            43008

;

Super Contributor
Posts: 1,636

Longitudinal to wide

The modified code takes care of the "out of range" problem.

data have;

informat fsdt mmddyy10.;

format fsdt mmddyy10.;

input contract  :FSDT : DxDEN : HCC $8.;

cards;

1001        01/01/2010            43002  HCC-108

1001        01/01/2010            43004  HCC-109

1001        03/01/2010            43004  HCC-109

1002        03/01/2010            43002  HCC-108

1003        04/01/2010            43004  HCC-109

1003        04/01/2010            43006  HCC-90

1003        04/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1003        05/01/2010            43002  HCC-108

1004        05/01/2010            43008

;

proc sql noprint;

  select max(groups) into: maxgroup

    from (select count(*) as groups from have group by contract,fsdt);

quit;

%let maxgroup=&maxgroup;

data have;

set have;

order=input(substr(hcc,anyalpha(hcc,-10)+2),3.);

run;

proc sort data=have;

by contract fsdt order;

run;

data want (drop=hcc dxden order rename=dx=DxDen);

  do _n_=1 by 1 until (last.fsdt);

      set have;

      by contract fsdt order;

      array h $ hcc1-hcc&maxgroup;

      if first.fsdt then dx=dxden;

      h(_n_)=hcc;

  end;

  output;

run;

proc print;run;

PROC Star
Posts: 7,364

Longitudinal to wide

Since this thread continues to live, I'll propose another alternative:

proc sql noprint;

   select max(obs) into: obs

     from (select count(*) as obs

       from have group by contract,fsdt)

  ;

quit;

proc summary data=have nway;

   class contract fsdt;

   output out=want(drop=_Smiley Happy

      idgroup(out[&obs](DxDEN HCC)=)

      ;

run;

data want (drop=dxDEN_Smiley Happy;

  set want (rename=(DxDEN_1=DxDen));

run;

Respected Advisor
Posts: 3,777

Longitudinal to wide

You can use as many IDGROUP options as necessary.  The default is to pick the FIRST value, as is desired here.

proc summary data=have nway;

   class contract fsdt;

   output out=want(drop=_Smiley Happy

      idgroup(out[&obs](HCC)=)

     idgroup(out(dxden)=)

      ;

   run;

I think Tom's use of COPY statement is really nice.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 17 replies
  • 372 views
  • 10 likes
  • 8 in conversation