Hi All,
Hope you are well.
I have a quick question on how to program this :
I would like to have one row per customer, and have in columns, for example what was the value of the variable Number before the fisrt gap, during the first gap , after the first gap, before the second gap, during the second gap, after the second gap etc.. Basically I am trying to find out how big or small is this number when they transact and when they don't and does the value of the Number brings them back to transact etc...
Hope it makes sense, if not please do not hesitate to shout.
Thank you for your help
Have
customer_id | trans_date | week_start_date | week_end_date | WEEK_NO | no_trans | Number |
50026 | 03-Jan-17 | 01-Jan-17 | 07-Jan-17 | 1 | 1 | 30,000 |
50026 | 03-Jan-17 | 01-Jan-17 | 07-Jan-17 | 1 | . | 21,000 |
50026 | 04-Jan-17 | 01-Jan-17 | 07-Jan-17 | 1 | . | 2,000 |
50026 | 06-Jan-17 | 01-Jan-17 | 07-Jan-17 | 1 | 1 | 31,000 |
50026 | 07-Jan-17 | 01-Jan-17 | 07-Jan-17 | 1 | 2 | 5,500 |
50026 | 10-Jan-17 | 08-Jan-17 | 14-Jan-17 | 2 | 2 | 40,000 |
50026 | 11-Jan-17 | 08-Jan-17 | 14-Jan-17 | 2 | . | 7,300 |
50026 | 13-Jan-17 | 08-Jan-17 | 14-Jan-17 | 2 | 1 | 53,000 |
50026 | 14-Jan-17 | 08-Jan-17 | 14-Jan-17 | 2 | . | 10,400 |
Want
customer_id | Number_ during_First_Gap |
Number_ before_First_Gap |
Number_ after_First_Gap |
Number_ before_Second_Gap |
Number_ during_Second_Gap |
Number_ after_Second_Gap |
Gap 3 etc… |
50026 | 30,000 | 21,000 & 2000 | 31,000 | 40,000 | 7,300 | 53,000 |
Really not easy.
data have;
infile cards truncover expandtabs;
input (customer_id trans_date week_start_date week_end_date) (:$20.) WEEK_NO no_trans Number : comma32.;
cards;
50026 03-Jan-17 01-Jan-17 07-Jan-17 1 1 30,000
50026 03-Jan-17 01-Jan-17 07-Jan-17 1 . 21,000
50026 04-Jan-17 01-Jan-17 07-Jan-17 1 . 2,000
50026 06-Jan-17 01-Jan-17 07-Jan-17 1 1 31,000
50026 07-Jan-17 01-Jan-17 07-Jan-17 1 2 5,500
50026 10-Jan-17 08-Jan-17 14-Jan-17 2 2 40,000
50026 11-Jan-17 08-Jan-17 14-Jan-17 2 . 7,300
50026 13-Jan-17 08-Jan-17 14-Jan-17 2 1 53,000
50026 14-Jan-17 08-Jan-17 14-Jan-17 2 . 10,400
50027 03-Jan-17 01-Jan-17 07-Jan-17 1 . 21,000
50027 04-Jan-17 01-Jan-17 07-Jan-17 1 . 2,000
50027 06-Jan-17 01-Jan-17 07-Jan-17 1 1 31,000
;
run;
data temp1;
set have;
by customer_id no_trans notsorted;
group+first.no_trans;
if missing(no_trans);
run;
data temp2;
set temp1;
by customer_id group;
if first.customer_id then n=0;
n+first.group;
run;
data temp3;
length during $ 200;
do until(last.n);
set temp2;
by customer_id n;
during=catx('&',during,number);
end;
keep during customer_id n;
run;
data temp;
merge have have(firstobs=2 keep=customer_id number no_trans
rename=(customer_id=_id number=_number no_trans=_trans));
length name $ 20;
if customer_id=_id and not missing(no_trans) and missing(_trans) then do;
before=number;name='before';
end;
if customer_id=_id and missing(no_trans) and not missing(_trans) then do;
after=_number;name='after';
end;
value=coalesce(before,after);
if not missing(name);
keep customer_id value name;
run;
data temp;
set temp;
by customer_id name notsorted;
if first.customer_id then n=0;
if name='before' then n+1;
run;
proc transpose data=temp out=temp33(keep=customer_id before after);
by customer_id n;
var value;
id name;
run;
data temp33;
set temp33;
by customer_id;
if first.customer_id then n=0;
n+1;
run;
data want;
merge temp3 temp33;
by customer_id n;
run;
proc sql noprint;
select max(n) into : n from want;
quit;
proc summary data=want ;
by customer_id;
output out=final_want idgroup(out[&n] (before during after)=);
run;
Hi,
First off, post test data in the form of a datastep.
Now sedonly I don't think that a transposed format is a good method, yes it may work on this small subset, but if there are thousands of records, The during list willl quickly run out of space, and there will be so many columns it wont be usable.
The format you have should be fine to work with, what is the problem? Just assign a code column, before, during, after, then you can proc freq/means or other calculation based on the flag. (can't post example as copy paste what you have provided is Excel or something - hence why post datastep).
Your report can be done in SAS no doubt. Some areas that are unusual in programming are: The number_before_first_gap is a variable length field depending on the number of times a gap occurs. Then the whole record is variable depending on the number of gaps per customer.
Some code that does not address these areas would be somthing like this;
proc sort; by customer week;
data; set; by customer week; retain hold1 hold2 hold3 seq;
gap=no_trans=.;
If first.week then do;
hold1=number; hold2=0; hold3=0; seq=1; end;
if gap and seq=1 then do;
during=hold1; seq+1; end;
if gap and seq gt 1 then do;
if seq=2 then hold2=number;
if seq=3 then hold3=number; end;
if not gap and seq gt 1 then do;
before= (string of hold2 and hold3); after=number; end;
Really not easy.
data have;
infile cards truncover expandtabs;
input (customer_id trans_date week_start_date week_end_date) (:$20.) WEEK_NO no_trans Number : comma32.;
cards;
50026 03-Jan-17 01-Jan-17 07-Jan-17 1 1 30,000
50026 03-Jan-17 01-Jan-17 07-Jan-17 1 . 21,000
50026 04-Jan-17 01-Jan-17 07-Jan-17 1 . 2,000
50026 06-Jan-17 01-Jan-17 07-Jan-17 1 1 31,000
50026 07-Jan-17 01-Jan-17 07-Jan-17 1 2 5,500
50026 10-Jan-17 08-Jan-17 14-Jan-17 2 2 40,000
50026 11-Jan-17 08-Jan-17 14-Jan-17 2 . 7,300
50026 13-Jan-17 08-Jan-17 14-Jan-17 2 1 53,000
50026 14-Jan-17 08-Jan-17 14-Jan-17 2 . 10,400
50027 03-Jan-17 01-Jan-17 07-Jan-17 1 . 21,000
50027 04-Jan-17 01-Jan-17 07-Jan-17 1 . 2,000
50027 06-Jan-17 01-Jan-17 07-Jan-17 1 1 31,000
;
run;
data temp1;
set have;
by customer_id no_trans notsorted;
group+first.no_trans;
if missing(no_trans);
run;
data temp2;
set temp1;
by customer_id group;
if first.customer_id then n=0;
n+first.group;
run;
data temp3;
length during $ 200;
do until(last.n);
set temp2;
by customer_id n;
during=catx('&',during,number);
end;
keep during customer_id n;
run;
data temp;
merge have have(firstobs=2 keep=customer_id number no_trans
rename=(customer_id=_id number=_number no_trans=_trans));
length name $ 20;
if customer_id=_id and not missing(no_trans) and missing(_trans) then do;
before=number;name='before';
end;
if customer_id=_id and missing(no_trans) and not missing(_trans) then do;
after=_number;name='after';
end;
value=coalesce(before,after);
if not missing(name);
keep customer_id value name;
run;
data temp;
set temp;
by customer_id name notsorted;
if first.customer_id then n=0;
if name='before' then n+1;
run;
proc transpose data=temp out=temp33(keep=customer_id before after);
by customer_id n;
var value;
id name;
run;
data temp33;
set temp33;
by customer_id;
if first.customer_id then n=0;
n+1;
run;
data want;
merge temp3 temp33;
by customer_id n;
run;
proc sql noprint;
select max(n) into : n from want;
quit;
proc summary data=want ;
by customer_id;
output out=final_want idgroup(out[&n] (before during after)=);
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.