BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hui_lee
Calcite | Level 5

Hi Experts,

 

Been trying to transpose (but by using arrays) my dataset below;

 

acctdateseqpaidcost
123Jan-10000
123Feb-10000
123Mar-10100
123Apr-10110010
123May-10115020

 

to the dataset below;

acctdatepaid_Jan2010paid_Feb2010paid_Mar2010paid_Apr2010paid_May2010cost_Jan2010cost_Feb2010cost_Mar2010cost_Apr2010cost_May2010
123Mar-100001001500001020

 

whereby only the row where the seq becomes 1 the first time is taken in the resulting dataset but with all the paid and cost values in it as well.

 

Is this achievable? Understand that using proc transpose might not work and have been playing around by using arrays but have been unsuccessful so far.

 

Any advise or help is much appreciated!

 

Thank you! 🙂

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I'm not going to question whether your desired result is or isn't a good idea, as we are all often asked to produce output in forms that we might not consider analytically ideal.

 

The answer to your question of "can it be done using arrays", the answer is yes. My solution uses an undocumented proc sql function, monotonic, thus SAS wouldn't support it always working as expected. In short, while it works, I wouldn't use the following code in a production environment. It would be safer to assign the format numbers in a data step .. but I leave that to you. I simply wrote some code to show that, yes, it can definitely be done:

 

data have;
  infile cards dlm=',';
  input acct	date monyy7.	seq	paid	cost;
  format date monyy7.;
  cards;
123,Jan-10,0,0,0
123,Feb-10,0,0,0
123,Mar-10,1,0,0
123,Apr-10,1,100,10
123,May-10,1,150,20
223,Jan-10,0,0,0
223,Feb-10,0,0,0
223,Mar-10,0,20,5
223,Apr-10,1,30,10
223,May-10,1,40,15
;

proc sql noprint;
  select catt('cost_',put(date,monyy7.)),
         catt('paid_',put(date,monyy7.)),
         catt(date,'=',monotonic())
    into :costs separated by ' ',
         :paids separated by ' ',
         :for4mat separated by ' '
      from (select distinct date from have)
        order by date
  ;
quit;

proc format;
  value dates
  &for4mat.
  ;
run;

data want (drop=_: seq paid cost);
  format acct 8.;
  format date monyy7.;
  set have (rename=(date=_date));
  by acct seq;
  array costs(*) &costs.;
  array paids(*) &paids.;
  retain date costs paids;
  if first.acct then do;
    call missing(date);
    call missing(of costs(*));
    call missing(of paids(*));
  end;
  if seq eq 1 then date=coalesce(date,_date);;
  costs(put(_date,dates.))=cost;
  paids(put(_date,dates.))=paid;
  if last.acct then output;
run;

Art, CEO, AnalystFinder.com

View solution in original post

7 REPLIES 7
Astounding
PROC Star

Advice:  don't do it.  The resulting data is not useful for reporting purposes, since it will be extremely wide when different ACCT values have different date ranges.  And it isn't particularly useful for analysis purposes either.  There are plenty of techniques for working with your data as is.  If you post more about where you are headed, and what you would like to ultimately accomplish with this data, there are sure to be other approaches available.

Ksharp
Super User

Check the MERGE skill proposed by Me,Arthur.T and Matt.

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

 

data have;
infile cards truncover expandtabs;
input acct	date : monyy7.	seq	paid	cost;
format date monyy7.;
cards;
123	Jan-10	0	0	0
123	Feb-10	0	0	0
123	Mar-10	1	0	0
123	Apr-10	1	100	10
123	May-10	1	150	20
223	Jan-10	0	0	0
223	Feb-10	0	0	0
223	Mar-10	1	0	0
223	Apr-10	1	100	10
223	May-10	1	150	20
;
run;
proc sql;
create table temp as
 select distinct date from have;
quit;
data _null_;
 set temp end=last;
 if _n_=1 then call execute('data want;merge ');
 call execute(catt('have(rename=(paid=paid_',put(date,monyy7.),
 ' cost=cost_',put(date,monyy7.),') where=(date=',date,'))'));
 if last then call execute(';by acct; drop date seq;run;');
run;
Reeza
Super User

I second @Astounding answer. This format will be more cumbersome to deal with in the long run. 

art297
Opal | Level 21

I'm not going to question whether your desired result is or isn't a good idea, as we are all often asked to produce output in forms that we might not consider analytically ideal.

 

The answer to your question of "can it be done using arrays", the answer is yes. My solution uses an undocumented proc sql function, monotonic, thus SAS wouldn't support it always working as expected. In short, while it works, I wouldn't use the following code in a production environment. It would be safer to assign the format numbers in a data step .. but I leave that to you. I simply wrote some code to show that, yes, it can definitely be done:

 

data have;
  infile cards dlm=',';
  input acct	date monyy7.	seq	paid	cost;
  format date monyy7.;
  cards;
123,Jan-10,0,0,0
123,Feb-10,0,0,0
123,Mar-10,1,0,0
123,Apr-10,1,100,10
123,May-10,1,150,20
223,Jan-10,0,0,0
223,Feb-10,0,0,0
223,Mar-10,0,20,5
223,Apr-10,1,30,10
223,May-10,1,40,15
;

proc sql noprint;
  select catt('cost_',put(date,monyy7.)),
         catt('paid_',put(date,monyy7.)),
         catt(date,'=',monotonic())
    into :costs separated by ' ',
         :paids separated by ' ',
         :for4mat separated by ' '
      from (select distinct date from have)
        order by date
  ;
quit;

proc format;
  value dates
  &for4mat.
  ;
run;

data want (drop=_: seq paid cost);
  format acct 8.;
  format date monyy7.;
  set have (rename=(date=_date));
  by acct seq;
  array costs(*) &costs.;
  array paids(*) &paids.;
  retain date costs paids;
  if first.acct then do;
    call missing(date);
    call missing(of costs(*));
    call missing(of paids(*));
  end;
  if seq eq 1 then date=coalesce(date,_date);;
  costs(put(_date,dates.))=cost;
  paids(put(_date,dates.))=paid;
  if last.acct then output;
run;

Art, CEO, AnalystFinder.com

Tom
Super User Tom
Super User

Do you know the list of months in advance? If not it is much harder.

So first make two arrays to represent the two groups of variables that are used to store the COST and PAID values.

Then just use a new variable to store the first date where SEQ=1.

So assuming that all accounts have all months you could do this.

data have ;
  input acct $ date :anydtdte. seq paid cost ;
  format date monyy7. ;
cards;
123 Jan-10 0 0 0
123 Feb-10 0 0 0
123 Mar-10 1 0 0
123 Apr-10 1 100 10
123 May-10 1 150 20
;
proc print;
run;

data want ;
  do i=1 by 1 until(last.acct);
     set have ;
     by acct ;
     if seq=1 then first_date=coalesce(first_date,date);
     array p PAID_JAN2010 PAID_FEB2010 PAID_MAR2010 PAID_APR2010 PAID_MAY2010 ;
     array c COST_JAN2010 COST_FEB2010 COST_MAR2010 COST_APR2010 COST_MAY2010 ;
     p(i) = paid;
     c(i) = cost;
  end;
  format first_date monyy7.;
  keep acct first_date paid_: cost_:;
run;
proc print;
run;

 

art297
Opal | Level 21

@Tom: We both took similar appoaches (I modified mine to include the way you handled date as your method was better). The difference between the approaches is that mine doesn't require one to know array size (or values) in advance, and not all dates have to exist for all records. If they all do exist then, yes, your approach is more parsimonious. However, if either of those conditions don't hold, then I'd prefer my approach.

 

Regardless, both provide methods that should help folks learn how do to such things.

 

Art, CEO, AnalystFinder.com

hui_lee
Calcite | Level 5

Dear all,

 

Thank you very much for all the responses. It has been very helpful! 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 4519 views
  • 3 likes
  • 6 in conversation