Hello SAS-ers,
I am trying to find the best way to get my desired output after I performed a proc transpose function.
My data consists of Member IDs with multiple dates(date of call by the member), service_date
I transposed this data so that I get one row of rcord for each member, with the multiple dates in different columns.
Sample data:
ID service_date call_date1 call_date2 call_date3 call_date4 call_date5 Call_Total
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 3
6789 10/17/2019 10/20/2019 11/01/2019 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
Now, if the Call_Total is more than 3, I want to update the first three call_daten variables to the most recent 3 dates(last 3 call_date).
My output should look something like this:
ID service_date call_date1 call_date2 call_date3 call_date4 call_date5 Call_Total
1234 11/20/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 3
6789 10/17/2019 10/20/2019 11/01/2019 2
7890 11/29/2019 12/08/2019 12/12/2019 12/19/2019 5
I can use the drop= function to later remove the call_date4 and call_date5 variables since I would expect those variables to be empoty after I acheive this result.
I'm guessing I may have to use arrays?
Any help is appreciated. if more information is needed, please let me know.
FWIW
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
data _null_;
set have;
array c call_date:;
call symputx('n',dim(c));
stop;
run;
data want;
set have;
array c call_date:;
if call_total>3 then do;
_k=call_total-3+1;
do _n_=_k to _k+2;
_i=sum(_i,1);
c(_i)=c(_n_);
end;
end;
drop call_date4-call_date&n _:;
run;
I wouldn't do the transpose at all.
I would work on the UNtransposed data, run PROC RANK with BY MEMBER_ID and with the DESCENDING option, and then for cases where call_total>3, keep only those records with rank<=3. Here's an example: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=proc&docsetTarget=p12aek9...
But if you really want to work on the transposed data, CALL SORTN will do what you want, no arrays needed.
I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.
@sas1011 wrote:
I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.
As I said, if you need to do this after transposing, CALL SORTN works in a data step, no arrays needed.
FWIW
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
data _null_;
set have;
array c call_date:;
call symputx('n',dim(c));
stop;
run;
data want;
set have;
array c call_date:;
if call_total>3 then do;
_k=call_total-3+1;
do _n_=_k to _k+2;
_i=sum(_i,1);
c(_i)=c(_n_);
end;
end;
drop call_date4-call_date&n _:;
run;
This worked for me. Thank you
Hi @sas1011 What the genie @PaigeMiller meant was your dataset before transpose must have had the below structure
data have;
infile cards expandtabs;
input id call_date :mmddyy10.;
format call_date mmddyy10.;
cards;
1234 11/20/2019
1234 11/22/2019
1234 11/24/2019
1234 11/30/2019
1234 12/3/2019
2345 11/18/2019
2345 11/19/2019
2345 11/29/2019
2345 12/6/2019
2345 12/10/2019
4567 11/7/2019
4567 11/11/2019
4567 11/23/2019
4567 .
4567 .
6789 10/20/2019
6789 11/1/2019
7890 12/3/2019
7890 12/7/2019
7890 12/8/2019
7890 12/12/2019
7890 12/19/2019
;
The long and narrow structure is what SAS works at best or is dealt comfortably by users/procs and datastep too. I agree the rhythm to think outside of the box from a design standpoint upstream envisioning downstream would require that experience of Paige et al. The Proc rank and filter that Paige suggests is the below
proc rank data=have out=temp(where=(call_date_rank<=3)) ties=low descending;
by id;
var call_date;
ranks call_date_rank;
run;
proc transpose data=temp(drop=call_date_rank) out=want prefix=call_date;
by id ;
var call_date;
run;
Hmm I think Miller Coors lite should make it Miller Coors strong for blokes like to have that powerful thinking :). HTH
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
run;
data want;
set have;
array dt call_date1-call_date5;
if Call_Total > 3 then
do;
do i = 1 to 3;
dt[i]=max(dt[i],dt[i+2]);
end;
end;
drop call_date4 call_date5 i;
run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.