BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sas1011
Calcite | Level 5

Hello SAS-ers,

 

I am trying to find the best way to get my desired output after I performed a proc transpose function.

 

My data consists of Member IDs with multiple dates(date of call by the member), service_date 

 

I transposed this data so that I get one row of rcord for each member, with the multiple dates in different columns.

 

Sample data:

 

ID         service_date           call_date1           call_date2         call_date3        call_date4         call_date5       Call_Total

1234       11/20/2019              11/20/2019           11/22/2019        11/24/2019        11/30/2019        12/03/2019               5

2345       11/12/2019              11/18/2019           11/19/2019        11/29/2019        12/06/2019        12/10/2019               5

4567       10/30/2019              11/07/2019           11/11/2019        11/23/2019                                                                   3

6789       10/17/2019              10/20/2019           11/01/2019                                                                                            2

7890       11/29/2019              12/03/2019           12/07/2019        12/08/2019        12/12/2019        12/19/2019              5

 

 

Now, if the Call_Total is more than 3, I want to update the first three call_datevariables to the most recent 3 dates(last 3 call_date). 

 

My output should look something like this: 

 

 ID        service_date        call_date1           call_date2         call_date3        call_date4         call_date5       Call_Total

1234      11/20/2019           11/24/2019            11/30/2019        12/03/2019                                                                  5

2345      11/12/2019           11/29/2019            12/06/2019        12/10/2019                                                                  5

4567      10/30/2019           11/07/2019            11/11/2019        11/23/2019                                                                   3

6789      10/17/2019           10/20/2019            11/01/2019                                                                                            2

7890      11/29/2019           12/08/2019            12/12/2019        12/19/2019                                                                  5

 

 

I can use the drop= function to later remove the call_date4 and call_date5 variables since I would expect those variables to be empoty after I acheive this result. 

 

I'm guessing I may have to use arrays? 

 

Any help is appreciated. if more information is needed, please let me know.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

FWIW 

 


data have;
input ID         (service_date  call_date1           call_date2         call_date3        call_date4         call_date5) (:mmddyy10.)       Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234       11/20/2019              11/20/2019           11/22/2019        11/24/2019        11/30/2019        12/03/2019               5
2345       11/12/2019              11/18/2019           11/19/2019        11/29/2019        12/06/2019        12/10/2019               5
4567       10/30/2019              11/07/2019           11/11/2019        11/23/2019              . .                                                     3
6789       10/17/2019              10/20/2019           11/01/2019          .                .        .                                                          2
7890       11/29/2019              12/03/2019           12/07/2019        12/08/2019        12/12/2019        12/19/2019              5
;
data _null_;
 set have;
 array c  call_date:;
 call symputx('n',dim(c));
 stop;
run;
data want;
set have;
array c  call_date:;
if call_total>3 then do;
 _k=call_total-3+1;
 do _n_=_k to _k+2;
  _i=sum(_i,1);
  c(_i)=c(_n_);
 end;
end;
drop call_date4-call_date&n _:;
run;

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

I wouldn't do the transpose at all.

 

I would work on the UNtransposed data, run PROC RANK with BY MEMBER_ID and with the DESCENDING option, and then for cases where call_total>3, keep only those records with rank<=3. Here's an example: https://documentation.sas.com/?cdcId=pgmsascdc&cdcVersion=9.4_3.4&docsetId=proc&docsetTarget=p12aek9...

--
Paige Miller
PaigeMiller
Diamond | Level 26

But if you really want to work on the transposed data, CALL SORTN will do what you want, no arrays needed.

--
Paige Miller
sas1011
Calcite | Level 5

I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.

PaigeMiller
Diamond | Level 26

@sas1011 wrote:

I needed the dates to be in one row. My initial dataset had multiple rows for the same ID, with the various call_dates.


As I said, if you need to do this after transposing, CALL SORTN works in a data step, no arrays needed.

--
Paige Miller
novinosrin
Tourmaline | Level 20

FWIW 

 


data have;
input ID         (service_date  call_date1           call_date2         call_date3        call_date4         call_date5) (:mmddyy10.)       Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234       11/20/2019              11/20/2019           11/22/2019        11/24/2019        11/30/2019        12/03/2019               5
2345       11/12/2019              11/18/2019           11/19/2019        11/29/2019        12/06/2019        12/10/2019               5
4567       10/30/2019              11/07/2019           11/11/2019        11/23/2019              . .                                                     3
6789       10/17/2019              10/20/2019           11/01/2019          .                .        .                                                          2
7890       11/29/2019              12/03/2019           12/07/2019        12/08/2019        12/12/2019        12/19/2019              5
;
data _null_;
 set have;
 array c  call_date:;
 call symputx('n',dim(c));
 stop;
run;
data want;
set have;
array c  call_date:;
if call_total>3 then do;
 _k=call_total-3+1;
 do _n_=_k to _k+2;
  _i=sum(_i,1);
  c(_i)=c(_n_);
 end;
end;
drop call_date4-call_date&n _:;
run;
sas1011
Calcite | Level 5

This worked for me. Thank you

novinosrin
Tourmaline | Level 20

Hi @sas1011   What the genie @PaigeMiller  meant was your dataset before transpose must have had the below structure

 


data have;
infile cards expandtabs;
input id call_date :mmddyy10.;
format call_date mmddyy10.;
cards;
1234	11/20/2019
1234	11/22/2019
1234	11/24/2019
1234	11/30/2019
1234	12/3/2019
2345	11/18/2019
2345	11/19/2019
2345	11/29/2019
2345	12/6/2019
2345	12/10/2019
4567	11/7/2019
4567	11/11/2019
4567	11/23/2019
4567	.
4567	.
6789	10/20/2019
6789	11/1/2019
7890	12/3/2019
7890	12/7/2019
7890	12/8/2019
7890	12/12/2019
7890	12/19/2019
;


 

The long and narrow structure is what SAS works at best or is dealt comfortably by users/procs and datastep too. I agree the rhythm to think outside of the box from a design standpoint upstream envisioning downstream would require that experience of Paige et al. The Proc rank and filter that Paige suggests is the below

 


proc rank data=have out=temp(where=(call_date_rank<=3)) ties=low descending;
   by id;
   var call_date;
   ranks call_date_rank;
run;

proc transpose data=temp(drop=call_date_rank) out=want prefix=call_date;
by id ;
var call_date;
run;

Hmm I think Miller Coors lite should make it Miller Coors strong for blokes like to have that powerful thinking :). HTH

r_behata
Barite | Level 11
data have;
input ID         (service_date  call_date1           call_date2         call_date3        call_date4         call_date5) (:mmddyy10.)       Call_Total;
format service_date--call_date5 mmddyy10.;
cards;
1234       11/20/2019              11/20/2019           11/22/2019        11/24/2019        11/30/2019        12/03/2019               5
2345       11/12/2019              11/18/2019           11/19/2019        11/29/2019        12/06/2019        12/10/2019               5
4567       10/30/2019              11/07/2019           11/11/2019        11/23/2019              . .                                                     3
6789       10/17/2019              10/20/2019           11/01/2019          .                .        .                                                          2
7890       11/29/2019              12/03/2019           12/07/2019        12/08/2019        12/12/2019        12/19/2019              5
;
run;

data want;
	set have;
	array dt call_date1-call_date5;

	if Call_Total > 3 then
		do;
			do i = 1 to 3;
				dt[i]=max(dt[i],dt[i+2]);
			end;
		end;

		drop call_date4 call_date5 i;
run;