Solved: Re: Columns Transposition in SAS

Question · Posted 04-21-2017 06:17 AM

Hi All,

Hope you are well.

I have a quick question on how to program this :

I would like to have one row per customer, and have in columns, for example what was the value of the variable Number before the fisrt gap, during the first gap , after the first gap, before the second gap, during the second gap, after the second gap etc.. Basically I am trying to find out how big or small is this number when they transact and when they don't and does the value of the Number brings them back to transact etc...

Hope it makes sense, if not please do not hesitate to shout.

Thank you for your help

Have

customer_id	trans_date	week_start_date	week_end_date	WEEK_NO	no_trans	Number
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	1	30,000
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	.	21,000
50026	04-Jan-17	01-Jan-17	07-Jan-17	1	.	2,000
50026	06-Jan-17	01-Jan-17	07-Jan-17	1	1	31,000
50026	07-Jan-17	01-Jan-17	07-Jan-17	1	2	5,500
50026	10-Jan-17	08-Jan-17	14-Jan-17	2	2	40,000
50026	11-Jan-17	08-Jan-17	14-Jan-17	2	.	7,300
50026	13-Jan-17	08-Jan-17	14-Jan-17	2	1	53,000
50026	14-Jan-17	08-Jan-17	14-Jan-17	2	.	10,400

Want

customer_id	Number_ during_First_Gap	Number_ before_First_Gap	Number_ after_First_Gap	Number_ before_Second_Gap	Number_ during_Second_Gap	Number_ after_Second_Gap	Gap 3 etc…
50026	30,000	21,000 & 2000	31,000	40,000	7,300	53,000

Ksharp · Posted 04-22-2017 03:34 AM

Really not easy.

data have;
infile cards truncover expandtabs;
input (customer_id	trans_date	week_start_date	week_end_date) (:$20.)	WEEK_NO	no_trans	Number : comma32.;
cards;
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	1	30,000
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	.	21,000
50026	04-Jan-17	01-Jan-17	07-Jan-17	1	.	2,000
50026	06-Jan-17	01-Jan-17	07-Jan-17	1	1	31,000
50026	07-Jan-17	01-Jan-17	07-Jan-17	1	2	5,500
50026	10-Jan-17	08-Jan-17	14-Jan-17	2	2	40,000
50026	11-Jan-17	08-Jan-17	14-Jan-17	2	.	7,300
50026	13-Jan-17	08-Jan-17	14-Jan-17	2	1	53,000
50026	14-Jan-17	08-Jan-17	14-Jan-17	2	.	10,400
50027	03-Jan-17	01-Jan-17	07-Jan-17	1	.	21,000
50027	04-Jan-17	01-Jan-17	07-Jan-17	1	.	2,000
50027	06-Jan-17	01-Jan-17	07-Jan-17	1	1	31,000
;
run;
data temp1;
 set have;
 by customer_id no_trans notsorted;
 group+first.no_trans;
 if missing(no_trans);
run;
data temp2;
 set temp1;
 by customer_id group;
 if first.customer_id then n=0;
 n+first.group;
run;
data temp3;
length during $ 200;
 do until(last.n);
  set temp2;
  by customer_id n;
  during=catx('&',during,number);
 end;
keep during customer_id n;
run;


data temp;
 merge have have(firstobs=2 keep=customer_id number no_trans
 rename=(customer_id=_id number=_number no_trans=_trans));
 length name $ 20;
 if customer_id=_id and not missing(no_trans) and missing(_trans) then do;
  before=number;name='before';
 end;
 if customer_id=_id and missing(no_trans) and not missing(_trans) then do;
  after=_number;name='after';
 end;
 value=coalesce(before,after);
 if not missing(name);
 keep customer_id value name;
run;

data temp;
 set temp;
 by customer_id name notsorted;
 if first.customer_id then n=0;
 if name='before' then n+1;
run;
proc transpose data=temp out=temp33(keep=customer_id before after);
by customer_id n;
var value;
id name;
run;
data temp33;
 set temp33;
 by customer_id;
 if first.customer_id then n=0;
 n+1;
run;
data want;
 merge temp3 temp33;
 by customer_id n;
run;
proc sql noprint;
select max(n) into : n from want;
quit;
proc summary data=want ;
by customer_id;
output out=final_want idgroup(out[&n] (before during after)=);
run;

View solution in original post

RW9 · Posted 04-21-2017 07:10 AM

Hi,

First off, post test data in the form of a datastep.

Now sedonly I don't think that a transposed format is a good method, yes it may work on this small subset, but if there are thousands of records, The during list willl quickly run out of space, and there will be so many columns it wont be usable.

The format you have should be fine to work with, what is the problem? Just assign a code column, before, during, after, then you can proc freq/means or other calculation based on the flag. (can't post example as copy paste what you have provided is Excel or something - hence why post datastep).

Jim_G · Posted 04-21-2017 08:31 AM

Your report can be done in SAS no doubt. Some areas that are unusual in programming are: The number_before_first_gap is a variable length field depending on the number of times a gap occurs. Then the whole record is variable depending on the number of gaps per customer.

Some code that does not address these areas would be somthing like this;

proc sort; by customer week;

data; set; by customer week; retain hold1 hold2 hold3 seq;

gap=no_trans=.;

If first.week then do;

hold1=number; hold2=0; hold3=0; seq=1; end;

if gap and seq=1 then do;

during=hold1; seq+1; end;

if gap and seq gt 1 then do;

if seq=2 then hold2=number;

if seq=3 then hold3=number; end;

if not gap and seq gt 1 then do;

before= (string of hold2 and hold3); after=number; end;

Ksharp · Posted 04-22-2017 03:34 AM

Really not easy.

data have;
infile cards truncover expandtabs;
input (customer_id	trans_date	week_start_date	week_end_date) (:$20.)	WEEK_NO	no_trans	Number : comma32.;
cards;
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	1	30,000
50026	03-Jan-17	01-Jan-17	07-Jan-17	1	.	21,000
50026	04-Jan-17	01-Jan-17	07-Jan-17	1	.	2,000
50026	06-Jan-17	01-Jan-17	07-Jan-17	1	1	31,000
50026	07-Jan-17	01-Jan-17	07-Jan-17	1	2	5,500
50026	10-Jan-17	08-Jan-17	14-Jan-17	2	2	40,000
50026	11-Jan-17	08-Jan-17	14-Jan-17	2	.	7,300
50026	13-Jan-17	08-Jan-17	14-Jan-17	2	1	53,000
50026	14-Jan-17	08-Jan-17	14-Jan-17	2	.	10,400
50027	03-Jan-17	01-Jan-17	07-Jan-17	1	.	21,000
50027	04-Jan-17	01-Jan-17	07-Jan-17	1	.	2,000
50027	06-Jan-17	01-Jan-17	07-Jan-17	1	1	31,000
;
run;
data temp1;
 set have;
 by customer_id no_trans notsorted;
 group+first.no_trans;
 if missing(no_trans);
run;
data temp2;
 set temp1;
 by customer_id group;
 if first.customer_id then n=0;
 n+first.group;
run;
data temp3;
length during $ 200;
 do until(last.n);
  set temp2;
  by customer_id n;
  during=catx('&',during,number);
 end;
keep during customer_id n;
run;


data temp;
 merge have have(firstobs=2 keep=customer_id number no_trans
 rename=(customer_id=_id number=_number no_trans=_trans));
 length name $ 20;
 if customer_id=_id and not missing(no_trans) and missing(_trans) then do;
  before=number;name='before';
 end;
 if customer_id=_id and missing(no_trans) and not missing(_trans) then do;
  after=_number;name='after';
 end;
 value=coalesce(before,after);
 if not missing(name);
 keep customer_id value name;
run;

data temp;
 set temp;
 by customer_id name notsorted;
 if first.customer_id then n=0;
 if name='before' then n+1;
run;
proc transpose data=temp out=temp33(keep=customer_id before after);
by customer_id n;
var value;
id name;
run;
data temp33;
 set temp33;
 by customer_id;
 if first.customer_id then n=0;
 n+1;
run;
data want;
 merge temp3 temp33;
 by customer_id n;
run;
proc sql noprint;
select max(n) into : n from want;
quit;
proc summary data=want ;
by customer_id;
output out=final_want idgroup(out[&n] (before during after)=);
run;

Question · Posted 04-22-2017 05:46 AM

Wow Ksharp,

Thank you so much for your help! It works perfecty 😊

##- Please type your reply above this line. Simple formatting, no
attachments. -##

Columns Transposition in SAS

Re: Columns Transposition in SAS

Re: Columns Transposition in SAS

Re: Columns Transposition in SAS

Re: Columns Transposition in SAS

Re: Columns Transposition in SAS

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away