Hello Community,
Is it something that could be done using SAS?
I have a table like this:
Customer  | Start_Date  | Rental_Date  | Grace_Period  | MovieTitle  | Movie1  | Movie2  | Movies_List  | 
12345  | 13-Mar-13  | 13-Mar-13  | 30  | Gone Girl  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 13-Mar-13  | 30  | Titanic  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 8-Apr-13  | 30  | Gone Girl  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 8-Apr-13  | 30  | Titanic  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 4-May-13  | 30  | Gone Girl  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 4-May-13  | 30  | Titanic  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 11-Feb-15  | 30  | Hunger Games  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
12345  | 13-Mar-13  | 11-Feb-15  | 30  | Titanic  | Gone Girl  | Titanic  | Gone Girl,Titanic  | 
67890  | 10-Aug-14  | 10-Aug-14  | 30  | Pitch Perfect  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 8-Sep-14  | 30  | Pitch Perfect  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 17-Oct-14  | 30  | Pitch Perfect  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 29-Nov-14  | 30  | Pitch Perfect  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 31-Dec-14  | 30  | Booksmart  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 10-Feb-15  | 30  | Booksmart  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 22-May-15  | 30  | Booksmart  | Pitch Perfect  | 
  | Pitch Perfect  | 
67890  | 10-Aug-14  | 29-Jun-15  | 30  | Booksmart  | Pitch Perfect  | 
  | Pitch Perfect  | 
… where there are many rows per 1 customer and the table records the dates (Rental_Date) and the movies (MovieTitle) the customer rented on those dates. Sometimes a customer rents more than 1 movie, ex: customer 12345. This customer rented two movies on March 13, 2013.
What I want to do is to create a new column called “Switch_Status” that records whether the customer switched movies after whatever their initial movie/movies selection was. For example, customer 12345 initially rented 2 movies: Gone Girl and Titanic. I want the new column to record “Switch_Status” = “yes” because in the time Rental_Date+Grace_Period, they eventually stopped renting Gone Girl and Titanic, and instead rented Hunger Games and Titanic (February 11, 2015).
The second example customer 67890 is easier to record. Here, this customer initially rented Pitch Perfect and then after Rental_Date+Grace Period, there is evidence that they switched the movie they initially rented (Pitch Perfect), and instead rented Booksmart. So for this customer also, “Switch_Status” would be “yes”.
And obviously, if there is no evidence of switch, I want “Switch_Status” to equal “no”.
Is there any easy way to do this?
Thank you!
Sure, this is doable but I'd say 2 years out is really not within any grace period...
It would help if you showed a smaller, realistic example and the expected outcome. A description goes only so far, whereas showing the expected output for an input data set really helps to illustrate what you're trying to accomplish.
@iced_tea wrote:
Hello Community,
Is it something that could be done using SAS?
I have a table like this:
Customer
Start_Date
Rental_Date
Grace_Period
MovieTitle
Movie1
Movie2
Movies_List
12345
13-Mar-13
13-Mar-13
30
Gone Girl
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
13-Mar-13
30
Titanic
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
8-Apr-13
30
Gone Girl
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
8-Apr-13
30
Titanic
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
4-May-13
30
Gone Girl
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
4-May-13
30
Titanic
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
11-Feb-15
30
Hunger Games
Gone Girl
Titanic
Gone Girl,Titanic
12345
13-Mar-13
11-Feb-15
30
Titanic
Gone Girl
Titanic
Gone Girl,Titanic
67890
10-Aug-14
10-Aug-14
30
Pitch Perfect
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
8-Sep-14
30
Pitch Perfect
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
17-Oct-14
30
Pitch Perfect
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
29-Nov-14
30
Pitch Perfect
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
31-Dec-14
30
Booksmart
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
10-Feb-15
30
Booksmart
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
22-May-15
30
Booksmart
Pitch Perfect
Pitch Perfect
67890
10-Aug-14
29-Jun-15
30
Booksmart
Pitch Perfect
Pitch Perfect
… where there are many rows per 1 customer and the table records the dates (Rental_Date) and the movies (MovieTitle) the customer rented on those dates. Sometimes a customer rents more than 1 movie, ex: customer 12345. This customer rented two movies on March 13, 2013.
What I want to do is to create a new column called “Switch_Status” that records whether the customer switched movies after whatever their initial movie/movies selection was. For example, customer 12345 initially rented 2 movies: Gone Girl and Titanic. I want the new column to record “Switch_Status” = “yes” because in the time Rental_Date+Grace_Period, they eventually stopped renting Gone Girl and Titanic, and instead rented Hunger Games and Titanic (February 11, 2015).
The second example customer 67890 is easier to record. Here, this customer initially rented Pitch Perfect and then after Rental_Date+Grace Period, there is evidence that they switched the movie they initially rented (Pitch Perfect), and instead rented Booksmart. So for this customer also, “Switch_Status” would be “yes”.
And obviously, if there is no evidence of switch, I want “Switch_Status” to equal “no”.
Is there any easy way to do this?
Thank you!
I'm wouldn't say this is easy, but it can be done. Here are the results, below, and at the very bottom, the SAS code. Note: I changed the last two rental dates for customer 12345 so that they would be within the 30 day grace period.
Is this what you were looking for?
Jim
DATA	Cust_Movie_Data_Switch;
	DROP	_:;
	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;
	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;
	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;
	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;
	Switch_Status					=	'No ';
	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];
	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;
	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);
	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					<	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
					
				
			
			
				
			
			
			
			
			
			
			
		Thank you Jim. I really appreciate your taking a look.
The only thing is that - in the case of the last two rental dates for customer 12345 being way outside the 30 days, I still would want this customer to be flagged as "Switch_Status" = "yes".
My apologies. I might have not made it clear. So, if the customer has a different movie/movies within Rental_Date+Grace_Period, then it's no switch. It is a switch if they have different movie/movies AFTER Rental_Date+Grace_Period. Does that make sense?
Ah, ah, I see. I misunderstood. I assumed the opposite, that within the grace period was considered a switch.
Let me reverse the logic.
Jim
Yes, all I had to do was reverse the less than/greater than symbol. Now record 13 also shows as a switch ("Yes"). Record 13 is a day or two past the grace period.
Results and code are below.
Jim
DATA	Cust_Movie_Data;
	INFILE	DATALINES
		DSD	DLM='09'x	MISSOVER;
	INPUT	Customer		$
			Start_Date		:	ANYDTDTE10.
			Rental_Date		:	ANYDTDTE10.
			Grace_Period	
			MovieTitle		:	$32.
			Movie1			:	$32.
			Movie2			:	$32.
			Movies_List		:	$64.
			;
	FORMAT	Start_Date			DATE9.;
	FORMAT	Rental_Date			DATE9.;
DATALINES;
12345	13-Mar-13	13-Mar-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	13-Mar-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	11-Feb_15	30	Hunger Games	Hunger Games	Titanic	Hunger Games,Titanic
12345	13-Mar-13	11-Feb_15	30	Titanic	Hunger Games	Titanic	Hunger Games,Titanic
67890	10-Aug-14	10-Aug-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	8-Sep-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	17-Oct-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	29-Nov-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	31-Dec-14	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	10-Feb-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	22-May-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	29-Jun-15	30	Booksmart	Booksmart	Booksmart
;
RUN;
PROC	SORT		DATA=Cust_Movie_Data;
	BY	Customer	Rental_Date;
RUN;
DATA	Cust_Movie_Data_Switch;
	DROP	_:;
	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;
	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;
	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;
	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;
	Switch_Status					=	'No ';
	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];
	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;
	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);
	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					>	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
					
				
			
			
				
			
			
			
			
			
			
			
		
Yes, probably the better option. With such a small dataset, I just threw in a SORT, but perhaps that's a bit lazy on my part. 😳
Jim
Interesting how it's working for you. I see the output you got. That's exactly what I'd like. But for me, even running the code to create Cust_Movie_Data only produces an empty table.
I tried reworking your code for my real data, and everyone ends up being Switch_Status = No, even though it is obvious some should be 'Yes', just by looking at their data row by row.
That's very very strange. You cut in paste every line of code even the DATALINES?
Are you using SAS Studio? Display Manager? Enterprise Guide? Which one are you using?
Try closing your session, opening a brand new session and cutting and pasting in the below code. I changed the DATALINES back to February 2015, so it's important to get these DATALINES, not the ones from before where I had the requirements backwards.
Jim
DATA	Cust_Movie_Data;
	INFILE	DATALINES
		DSD	DLM='09'x	MISSOVER;
	INPUT	Customer		$
			Start_Date		:	ANYDTDTE10.
			Rental_Date		:	ANYDTDTE10.
			Grace_Period	
			MovieTitle		:	$32.
			Movie1			:	$32.
			Movie2			:	$32.
			Movies_List		:	$64.
			;
	FORMAT	Start_Date			DATE9.;
	FORMAT	Rental_Date			DATE9.;
DATALINES;
12345	13-Mar-13	13-Mar-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	13-Mar-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	11-Feb_15	30	Hunger Games	Hunger Games	Titanic	Hunger Games,Titanic
12345	13-Mar-13	11-Feb_15	30	Titanic	Hunger Games	Titanic	Hunger Games,Titanic
67890	10-Aug-14	10-Aug-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	8-Sep-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	17-Oct-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	29-Nov-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	31-Dec-14	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	10-Feb-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	22-May-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	29-Jun-15	30	Booksmart	Booksmart	Booksmart
;
RUN;
PROC	SORT		DATA=Cust_Movie_Data;
	BY	Customer	Rental_Date;
RUN;
DATA	Cust_Movie_Data_Switch;
	DROP	_:;
	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;
	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;
	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;
	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;
	Switch_Status					=	'No ';
	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];
	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;
	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);
	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					>	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
					
				
			
			
				
			
			
			
			
			
			
			
		hmmm interesting... I commented out the 3rd line:
DATA Cust_Movie_Data; INFILE DATALINES ; /* DSD DLM='09'x MISSOVER;*/
... and Cust_Movie_Data produced properly.
I am using SAS 9.4 (not SAS Studio, not Enterprise Guide).
@iced_tea wrote:
hmmm interesting... I commented out the 3rd line:
DATA Cust_Movie_Data; INFILE DATALINES ; /* DSD DLM='09'x MISSOVER;*/... and Cust_Movie_Data produced properly.
I am using SAS 9.4 (not SAS Studio, not Enterprise Guide).
Well, for your next SAS trivia challenge: You're using Display Manager. Display Manager is the Windows front end to SAS. SAS runs on mainframes, Unix, Linux, Posix (and probably others), so while a lot of people describe Display Manager as "SAS" or "Base SAS" really SAS is a much broader thing. OK, enough SAS trivia. 🙂
I think that makes sense that you would have to comment out that line. I think what's happening is that when I cut and paste into the code window that tabs are getting converted to spaces. When I copied your data over to my system, there were tabs in the data, so I added the DSD and DLM parameters.
Jim
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.