Help using Base SAS procedures

Proc transpose, possibly other solution

Accepted Solution Solved
Reply
Valued Guide
Posts: 858
Accepted Solution

Proc transpose, possibly other solution

It took me a while to get this data together and now it seems the clients display preference is going to be a huge undertaking.  I'm not sure if transpose will put this together easily or if there is another way to get this but here it goes, below is a small portion of a dataset, power_id is a unique identifier and it's sorted by power_id record_date.

Here is what I have:

POWER_IDRECORD_DATEpid_default_reasoncure_countACCT_DISCLOSURE_STATUS_FLGPD_CDDEL_DAYSACCT_CURRENT_CHARGE_OFF_AMTpass_date
000019616206/30/20133P**00.00
000019616207/31/20134P**00.00
000019616208/31/20135P**00.00
000019616209/30/20136P**00.0009/30/2013

I've attached what I want because it didn't look very nice pasted.  I had to use a csv because the excel format was too large.  You'll notice there are different columns before and after September 2013.  Any help will be appreciated.

Attachment

Accepted Solutions
Solution
‎02-27-2015 12:20 PM
PROC Star
Posts: 7,366

Re: Proc transpose, possibly other solution

Something like the following comes close:

data have;

  infile cards dlm='09'x;

  informat power_id $10.;

  informat record_date mmddyy10.;

  format record_date mmddyy10.;

  informat pid_default_reason $1.;

  informat cure_count 1.;

  informat ACCT_DISCLOSURE_STATUS_FLG $1.;

  informat PD_CD $2.;

  informat DEL_DAYS 8.;

  informat ACCT_CURRENT_CHARGE_OFF_AMT 8.;

  informat pass_date mmddyy10.;

  format pass_date mmddyy10.;

  input POWER_ID RECORD_DATE default_reason cure_count

        ACCT_DISCLOSURE_STATUS_FLG PD_CD DEL_DAYS

        ACCT_CURRENT_CHARGE_OFF_AMT pass_date;

  cards;

0000196162 06/30/2013 . 3 P ** 0 0.00

0000196162 07/31/2013 . 4 P ** 0 0.00

0000196162 08/31/2013 . 5 P ** 0 0.00

0000196162 09/30/2013 . 6 P ** 0 0.00 09/30/2013

;

data have;

  retain ACCT_DISCLOSURE_STATUS_FLG PD_CD DEL_DAYS

         ACCT_CURRENT_CHARGE_OFF_AMT

         default_reason cure_count;

  set have;

run;

data have (drop=_Smiley Happy;

  format pass_date mmddyy10.;

  do until (last.power_id);

    set have (rename=(pass_date=_pass_date));

    by power_id;

    if first.power_id then call missing(pass_date);

    if not missing(_pass_date) then pass_date=_pass_date;

  end;

  do until (last.power_id);

    set have (rename=(pass_date=_pass_date));

    by power_id;

    output;

  end;

run;

proc transpose data=have out=tall ;

by power_id record_date;

var ACCT_DISCLOSURE_STATUS_FLG--cure_count;

copy pass_date;

run;

proc transpose data=tall out=want (where=(not missing(_name_)));

by power_id;

id _name_ record_date;

var col1;

copy pass_date;

run;

data want;

  set want (drop=_Smiley Happy;

run;

data need;

  set want (keep=pd_cd: rename=(pd_cd06_30_2013=pd_cd_06_30_2013

    pd_cd07_31_2013=pd_cd_07_31_2013

    pd_cd08_31_2013=pd_cd_08_31_2013

    pd_cd09_30_2013=pd_cd_09_30_2013 ));

run;

data want;

  set want;

  set need;

run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,430

Re: Proc transpose, possibly other solution

Hi,

I would question the reasoning behind transposing the data.  In your example you have four or five dates, what happens when you have 50, that would be 50 * 20 or so variables.  You are then talking about working with thousands of variables.  Whats the data to be used for?  If its for reporting, does anyone seriously scroll the window across in Excel to see all those 1000's of columns, I somehow doubt it.  If its for further analysis then the normalised structure would presumably be easier to work with.  There are several ways of reporting the data in chunks, which may be preferable, for instance having a sheet for each quarter with the data normalised.  Perhaps just blocking it out.

If however you have looked at the above and it is still required, then you could transpose each variable using proc transpose, but its a bit long winded.  You could use arrays:

SAS Learning Module: Reshaping data long to wide using the data step

Which would shrink your code somewhat.

So its technically possible, but do you really want to go there?

Valued Guide
Posts: 858

Re: Proc transpose, possibly other solution

The dates will only go from June to December of the same year, at the longest.  I agree that this is not the best way to display the data but that is what is requested.  I would like to use arrays but am not very familiar.  I am updating the info with proc transpose but not getting it quite right, that's why I posted here.

Super User
Super User
Posts: 7,430

Re: Proc transpose, possibly other solution

Well, for the arrays, here is a small example.  abcd all get transposed up, {1} references the month 06, {2} references 07.  (am finished for the week now, so hope this helps.)

data have;

  infile datalines dlm=",";

  input POWER_ID $ RECORD_DATE $ A B $ C D;

datalines;

0000196162,06/30/2013,3,P **,0,0.00

0000196162,07/31/2013,4,P **,0,0.00

;

run;

   

data want (drop=a b c d);

  set have;

  by power_id;

  retain _a1-_a2 _b1-_b2 _c1-_c2 _d1-_d2;

  array _a{2} 8.;

  array _b{2} $8.;

  array _c{2} 8.;

  array _d{2} 8.;

  select(substr(record_date,1,2));

    when("06") ind=1;

    when("07") ind=2;

  end;

  _a{ind}=a;

  _b{ind}=b;

  _c{ind}=c;

  _d{ind}=d;

  if last.power_id then output;

run;

Valued Guide
Posts: 858

Re: Proc transpose, possibly other solution

how about with the one ID variable per line and everything leading with date out to the right?  I'm trying that with multiple transposes but still not quite getting there.

Solution
‎02-27-2015 12:20 PM
PROC Star
Posts: 7,366

Re: Proc transpose, possibly other solution

Something like the following comes close:

data have;

  infile cards dlm='09'x;

  informat power_id $10.;

  informat record_date mmddyy10.;

  format record_date mmddyy10.;

  informat pid_default_reason $1.;

  informat cure_count 1.;

  informat ACCT_DISCLOSURE_STATUS_FLG $1.;

  informat PD_CD $2.;

  informat DEL_DAYS 8.;

  informat ACCT_CURRENT_CHARGE_OFF_AMT 8.;

  informat pass_date mmddyy10.;

  format pass_date mmddyy10.;

  input POWER_ID RECORD_DATE default_reason cure_count

        ACCT_DISCLOSURE_STATUS_FLG PD_CD DEL_DAYS

        ACCT_CURRENT_CHARGE_OFF_AMT pass_date;

  cards;

0000196162 06/30/2013 . 3 P ** 0 0.00

0000196162 07/31/2013 . 4 P ** 0 0.00

0000196162 08/31/2013 . 5 P ** 0 0.00

0000196162 09/30/2013 . 6 P ** 0 0.00 09/30/2013

;

data have;

  retain ACCT_DISCLOSURE_STATUS_FLG PD_CD DEL_DAYS

         ACCT_CURRENT_CHARGE_OFF_AMT

         default_reason cure_count;

  set have;

run;

data have (drop=_Smiley Happy;

  format pass_date mmddyy10.;

  do until (last.power_id);

    set have (rename=(pass_date=_pass_date));

    by power_id;

    if first.power_id then call missing(pass_date);

    if not missing(_pass_date) then pass_date=_pass_date;

  end;

  do until (last.power_id);

    set have (rename=(pass_date=_pass_date));

    by power_id;

    output;

  end;

run;

proc transpose data=have out=tall ;

by power_id record_date;

var ACCT_DISCLOSURE_STATUS_FLG--cure_count;

copy pass_date;

run;

proc transpose data=tall out=want (where=(not missing(_name_)));

by power_id;

id _name_ record_date;

var col1;

copy pass_date;

run;

data want;

  set want (drop=_Smiley Happy;

run;

data need;

  set want (keep=pd_cd: rename=(pd_cd06_30_2013=pd_cd_06_30_2013

    pd_cd07_31_2013=pd_cd_07_31_2013

    pd_cd08_31_2013=pd_cd_08_31_2013

    pd_cd09_30_2013=pd_cd_09_30_2013 ));

run;

data want;

  set want;

  set need;

run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 200 views
  • 0 likes
  • 3 in conversation