DATA Step, Macro, Functions and more

Collapse multiple observations into few by id: retaining all information

Accepted Solution Solved
Reply
N/A
Posts: 1
Accepted Solution

Collapse multiple observations into few by id: retaining all information

I Have data like this

data=bookings

ID   Arrest_charge        Date

1         PC187             3/20/15

1         WIC 122.4        6/15/15

2         PC 71               6/30/12 

2         PC988              6/30/14

2         PC141              7/14/15

3         VC340               8/9/10

4         PC251.1(A)        8/17/11

4         VC241(B)           9/23/11

I want to collapse the data so that each unique ID is one observation, but retain all the information creating new variables that extend horizontally depending on how many arrest records there are for each unique ID. Like this.

ID    Arrest_charge1    Date1        arrest_charge2       date2        arrest_charge3    Date3

1       PC187              3/20/15        WIC 122.4          6/15/15              .                     .

2        PC71               6/30/12          PC988              6/30/14           PC141          7/14/15    

3        VC340             8/9/10               .                         .                   .                  .

4      PC251.1(A)        8/17/11          VC241(B)          9/23/11               .                  .

. or dot indicates a value that is missing because there was not that many charges and dates per booking (i know . is for numerical values and im working with codes.

Please help me im lost. Do i need to write a macro if i dont want to create variables for the max number of charges and dates? I tried a bunch of things but none seem to retain all info just one of the observations.

thanks.

-Matt


Accepted Solutions
Solution
‎07-17-2015 06:37 AM
PROC Star
Posts: 1,759

Re: Collapse multiple observations into few by id: retaining all information

data BOOKINGS;

input ID   ARREST_CHARGE :& $10.        ARREST_DATE : mmddyy.;   

cards;

1         PC187             3/20/15

1         WIC 122.4        6/15/15

2         PC 71               6/30/12

2         PC988              6/30/14

2         PC141              7/14/15

3         VC340               8/9/10

4         PC251.1(A)        8/17/11

4         VC241(B)           9/23/11

run;

data WANT;

  array DATE   [10] 8;

  array CHARGE [10] $10;

  retain DATE: CHARGE:;

  set BOOKINGS;

  by ID;

  if first.ID then call missing(I, of DATE

  • , of CHARGE
  • );
  •   I+1 ;

      DATE  =ARREST_DATE;

      CHARGE=ARREST_CHARGE;

      if last.ID then output;

    run;        

    proc print ;

      var ID DATE1 CHARGE1 DATE2 CHARGE2 DATE3 CHARGE3;

      format DATE: date9.;

      run;

              
    1120MAR2015PC18715JUN2015WIC 122.4.
    2230JUN2012PC 7130JUN2014PC98814JUL2015PC141
    3309AUG2010VC340. .
    4417AUG2011PC251.1(A)23SEP2011VC241(B).


    View solution in original post


    All Replies
    Super User
    Posts: 11,338

    Re: Collapse multiple observations into few by id: retaining all information

    Here's a recent example of something similar:

    https://communities.sas.com/thread/81958

    But HOW will you use the format? It may be easier to leave as original.

    Respected Advisor
    Posts: 4,919

    Re: Collapse multiple observations into few by id: retaining all information

    I agree with . Keep the data in long format and transform last for reporting purposes, if needed

    data report;

    length Charges $80;

    do until(last.id);

        set bookings; by id;

        charges = catx("/ ", charges,

            catt(Arrest_Charge, " [", put(date, yymmdd10.), "]"));

        end;

    keep id charges;

    run;

    proc print data=report noobs; var id charges; run;

    PG

    PG
    Solution
    ‎07-17-2015 06:37 AM
    PROC Star
    Posts: 1,759

    Re: Collapse multiple observations into few by id: retaining all information

    data BOOKINGS;

    input ID   ARREST_CHARGE :& $10.        ARREST_DATE : mmddyy.;   

    cards;

    1         PC187             3/20/15

    1         WIC 122.4        6/15/15

    2         PC 71               6/30/12

    2         PC988              6/30/14

    2         PC141              7/14/15

    3         VC340               8/9/10

    4         PC251.1(A)        8/17/11

    4         VC241(B)           9/23/11

    run;

    data WANT;

      array DATE   [10] 8;

      array CHARGE [10] $10;

      retain DATE: CHARGE:;

      set BOOKINGS;

      by ID;

      if first.ID then call missing(I, of DATE

  • , of CHARGE
  • );
  •   I+1 ;

      DATE  =ARREST_DATE;

      CHARGE=ARREST_CHARGE;

      if last.ID then output;

    run;        

    proc print ;

      var ID DATE1 CHARGE1 DATE2 CHARGE2 DATE3 CHARGE3;

      format DATE: date9.;

      run;

              
    1120MAR2015PC18715JUN2015WIC 122.4.
    2230JUN2012PC 7130JUN2014PC98814JUL2015PC141
    3309AUG2010VC340. .
    4417AUG2011PC251.1(A)23SEP2011VC241(B).


    🔒 This topic is solved and locked.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 3 replies
    • 254 views
    • 3 likes
    • 4 in conversation