Hello SAS-ers,
I am trying to find the best way to get my desired output after I performed a proc transpose function.
My data consists of Member IDs with multiple dates(date of call by the member), service_date
I transposed this data so that I get one row of rcord for each member, with the multiple dates in different columns.
Sample data:
ID service_date call_date1 call_date2 call_date3 call_date4 call_date5 Call_Total
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 3
6789 10/17/2019 10/20/2019 11/01/2019 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
Now, if the Call_Total is more than 3, I want to update the first three call_daten variables to the most recent 3 dates(last 3 call_date).
My output should look something like this:
ID service_date call_date1 call_date2 call_date3 call_date4 call_date5 Call_Total
1234 11/20/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 3
6789 10/17/2019 10/20/2019 11/01/2019 2
7890 11/29/2019 12/08/2019 12/12/2019 12/19/2019 5
I can use the drop= function to later remove the call_date4 and call_date5 variables since I would expect those variables to be empoty after I acheive this result.
I'm guessing I may have to use arrays?
Any help is appreciated. if more information is needed, please let me know.
FWIW
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
data _null_;
set have;
array c call_date:;
call symputx('n',dim(c));
stop;
run;
data want;
set have;
array c call_date:;
if call_total>3 then do;
_k=call_total-3+1;
do _n_=_k to _k+2;
_i=sum(_i,1);
c(_i)=c(_n_);
end;
end;
drop call_date4-call_date&n _:;
run;
I wouldn't do the transpose at all.
I would work on the UNtransposed data, run PROC RANK with BY MEMBER_ID and with the DESCENDING option, and then for cases where call_total>3, keep only those records with rank<=3. Here's an example: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=proc&docsetTarget=p12aek9...
But if you really want to work on the transposed data, CALL SORTN will do what you want, no arrays needed.
I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.
@sas1011 wrote:
I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.
As I said, if you need to do this after transposing, CALL SORTN works in a data step, no arrays needed.
FWIW
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
data _null_;
set have;
array c call_date:;
call symputx('n',dim(c));
stop;
run;
data want;
set have;
array c call_date:;
if call_total>3 then do;
_k=call_total-3+1;
do _n_=_k to _k+2;
_i=sum(_i,1);
c(_i)=c(_n_);
end;
end;
drop call_date4-call_date&n _:;
run;
This worked for me. Thank you
Hi @sas1011 What the genie @PaigeMiller meant was your dataset before transpose must have had the below structure
data have;
infile cards expandtabs;
input id call_date :mmddyy10.;
format call_date mmddyy10.;
cards;
1234 11/20/2019
1234 11/22/2019
1234 11/24/2019
1234 11/30/2019
1234 12/3/2019
2345 11/18/2019
2345 11/19/2019
2345 11/29/2019
2345 12/6/2019
2345 12/10/2019
4567 11/7/2019
4567 11/11/2019
4567 11/23/2019
4567 .
4567 .
6789 10/20/2019
6789 11/1/2019
7890 12/3/2019
7890 12/7/2019
7890 12/8/2019
7890 12/12/2019
7890 12/19/2019
;
The long and narrow structure is what SAS works at best or is dealt comfortably by users/procs and datastep too. I agree the rhythm to think outside of the box from a design standpoint upstream envisioning downstream would require that experience of Paige et al. The Proc rank and filter that Paige suggests is the below
proc rank data=have out=temp(where=(call_date_rank<=3)) ties=low descending;
by id;
var call_date;
ranks call_date_rank;
run;
proc transpose data=temp(drop=call_date_rank) out=want prefix=call_date;
by id ;
var call_date;
run;
Hmm I think Miller Coors lite should make it Miller Coors strong for blokes like to have that powerful thinking :). HTH
data have;
input ID (service_date call_date1 call_date2 call_date3 call_date4 call_date5) (:mmddyy10.) Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234 11/20/2019 11/20/2019 11/22/2019 11/24/2019 11/30/2019 12/03/2019 5
2345 11/12/2019 11/18/2019 11/19/2019 11/29/2019 12/06/2019 12/10/2019 5
4567 10/30/2019 11/07/2019 11/11/2019 11/23/2019 . . 3
6789 10/17/2019 10/20/2019 11/01/2019 . . . 2
7890 11/29/2019 12/03/2019 12/07/2019 12/08/2019 12/12/2019 12/19/2019 5
;
run;
data want;
set have;
array dt call_date1-call_date5;
if Call_Total > 3 then
do;
do i = 1 to 3;
dt[i]=max(dt[i],dt[i+2]);
end;
end;
drop call_date4 call_date5 i;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.