BookmarkSubscribeRSS Feed
iced_tea
Obsidian | Level 7

Hello Community,

Is it something that could be done using SAS?

I have a table like this:

Customer

Start_Date

Rental_Date

Grace_Period

MovieTitle

Movie1

Movie2

Movies_List

12345

13-Mar-13

13-Mar-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

13-Mar-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

8-Apr-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

8-Apr-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

4-May-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

4-May-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

11-Feb-15

30

Hunger Games

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

11-Feb-15

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

67890

10-Aug-14

10-Aug-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

8-Sep-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

17-Oct-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

29-Nov-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

31-Dec-14

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

10-Feb-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

22-May-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

29-Jun-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

… where there are many rows per 1 customer and the table records the dates (Rental_Date) and the movies (MovieTitle) the customer rented on those dates. Sometimes a customer rents more than 1 movie, ex: customer 12345. This customer rented two movies on March 13, 2013.

What I want to do is to create a new column called “Switch_Status” that records whether the customer switched movies after whatever their initial movie/movies selection was. For example, customer 12345 initially rented 2 movies: Gone Girl and Titanic. I want the new column to record “Switch_Status” = “yes” because in the time Rental_Date+Grace_Period, they eventually stopped renting Gone Girl and Titanic, and instead rented Hunger Games and Titanic (February 11, 2015).

The second example customer 67890 is easier to record. Here, this customer initially rented Pitch Perfect and then after Rental_Date+Grace Period, there is evidence that they switched the movie they initially rented (Pitch Perfect), and instead rented Booksmart. So for this customer also, “Switch_Status” would be “yes”.

And obviously, if there is no evidence of switch, I want “Switch_Status” to equal “no”.

Is there any easy way to do this?

Thank you!

17 REPLIES 17
Reeza
Super User

Sure, this is doable but I'd say 2 years out is really not within any grace period...

 

It would help if you showed a smaller, realistic example and the expected outcome. A description goes only so far, whereas showing the expected output for an input data set really helps to illustrate what you're trying to accomplish. 

 


@iced_tea wrote:

Hello Community,

Is it something that could be done using SAS?

I have a table like this:

Customer

Start_Date

Rental_Date

Grace_Period

MovieTitle

Movie1

Movie2

Movies_List

12345

13-Mar-13

13-Mar-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

13-Mar-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

8-Apr-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

8-Apr-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

4-May-13

30

Gone Girl

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

4-May-13

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

11-Feb-15

30

Hunger Games

Gone Girl

Titanic

Gone Girl,Titanic

12345

13-Mar-13

11-Feb-15

30

Titanic

Gone Girl

Titanic

Gone Girl,Titanic

67890

10-Aug-14

10-Aug-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

8-Sep-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

17-Oct-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

29-Nov-14

30

Pitch Perfect

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

31-Dec-14

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

10-Feb-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

22-May-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

67890

10-Aug-14

29-Jun-15

30

Booksmart

Pitch Perfect

 

Pitch Perfect

… where there are many rows per 1 customer and the table records the dates (Rental_Date) and the movies (MovieTitle) the customer rented on those dates. Sometimes a customer rents more than 1 movie, ex: customer 12345. This customer rented two movies on March 13, 2013.

What I want to do is to create a new column called “Switch_Status” that records whether the customer switched movies after whatever their initial movie/movies selection was. For example, customer 12345 initially rented 2 movies: Gone Girl and Titanic. I want the new column to record “Switch_Status” = “yes” because in the time Rental_Date+Grace_Period, they eventually stopped renting Gone Girl and Titanic, and instead rented Hunger Games and Titanic (February 11, 2015).

The second example customer 67890 is easier to record. Here, this customer initially rented Pitch Perfect and then after Rental_Date+Grace Period, there is evidence that they switched the movie they initially rented (Pitch Perfect), and instead rented Booksmart. So for this customer also, “Switch_Status” would be “yes”.

And obviously, if there is no evidence of switch, I want “Switch_Status” to equal “no”.

Is there any easy way to do this?

Thank you!


 

jimbarbour
Meteorite | Level 14

I'm wouldn't say this is easy, but it can be done.  Here are the results, below, and at the very bottom, the SAS code.  Note:  I changed the last two rental dates for customer 12345 so that they would be within the 30 day grace period.

 

Is this what you were looking for?

 

Jim

 

jimbarbour_0-1602559148035.png

 

DATA	Cust_Movie_Data_Switch;
	DROP	_:;

	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;

	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;

	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;

	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;

	Switch_Status					=	'No ';

	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];

	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;

	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);

	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					<	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
iced_tea
Obsidian | Level 7

Thank you Jim. I really appreciate your taking a look.

The only thing is that - in the case of the last two rental dates for customer 12345 being way outside the 30 days, I still would want this customer to be flagged as "Switch_Status" = "yes".

My apologies. I might have not made it clear. So, if the customer has a different movie/movies within Rental_Date+Grace_Period, then it's no switch. It is a switch if they have different movie/movies AFTER Rental_Date+Grace_Period. Does that make sense?

jimbarbour
Meteorite | Level 14

Ah, ah, I see.  I misunderstood.  I assumed the opposite, that within the grace period was considered a switch.

 

Let me reverse the logic.

 

Jim

jimbarbour
Meteorite | Level 14

Yes, all I had to do was reverse the less than/greater than symbol.  Now record 13 also shows as a switch ("Yes").  Record 13 is a day or two past the grace period.

 

Results and code are below.

 

Jim

 

jimbarbour_0-1602601046280.png

 

DATA	Cust_Movie_Data;
	INFILE	DATALINES
		DSD	DLM='09'x	MISSOVER;

	INPUT	Customer		$
			Start_Date		:	ANYDTDTE10.
			Rental_Date		:	ANYDTDTE10.
			Grace_Period	
			MovieTitle		:	$32.
			Movie1			:	$32.
			Movie2			:	$32.
			Movies_List		:	$64.
			;

	FORMAT	Start_Date			DATE9.;
	FORMAT	Rental_Date			DATE9.;

DATALINES;
12345	13-Mar-13	13-Mar-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	13-Mar-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	11-Feb_15	30	Hunger Games	Hunger Games	Titanic	Hunger Games,Titanic
12345	13-Mar-13	11-Feb_15	30	Titanic	Hunger Games	Titanic	Hunger Games,Titanic
67890	10-Aug-14	10-Aug-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	8-Sep-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	17-Oct-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	29-Nov-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	31-Dec-14	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	10-Feb-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	22-May-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	29-Jun-15	30	Booksmart	Booksmart	Booksmart
;
RUN;

PROC	SORT		DATA=Cust_Movie_Data;
	BY	Customer	Rental_Date;
RUN;

DATA	Cust_Movie_Data_Switch;
	DROP	_:;

	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;

	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;

	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;

	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;

	Switch_Status					=	'No ';

	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];

	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;

	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);

	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					>	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
Reeza
Super User
I think the NOTSORTED option would be useful here.

jimbarbour
Meteorite | Level 14

@Reeza,

 

Yes, probably the better option.  With such a small dataset, I just threw in a SORT, but perhaps that's a bit lazy on my part.  😳

 

Jim

iced_tea
Obsidian | Level 7

Interesting how it's working for you. I see the output you got. That's exactly what I'd like. But for me, even running the code to create Cust_Movie_Data only produces an empty table. 

 

I tried reworking your code for my real data, and everyone ends up being Switch_Status = No, even though it is obvious some should be 'Yes', just by looking at their data row by row.

 

 

 
 

 

 

 

 
 

 

 

jimbarbour
Meteorite | Level 14

That's very very strange.  You cut in paste every line of code even the DATALINES?

 

Are you using SAS Studio?  Display Manager?  Enterprise Guide?  Which one are you using?

 

Try closing your session, opening a brand new session and cutting and pasting in the below code.  I changed the DATALINES back to February 2015, so it's important to get these DATALINES, not the ones from before where I had the requirements backwards.

 

Jim

 

DATA	Cust_Movie_Data;
	INFILE	DATALINES
		DSD	DLM='09'x	MISSOVER;

	INPUT	Customer		$
			Start_Date		:	ANYDTDTE10.
			Rental_Date		:	ANYDTDTE10.
			Grace_Period	
			MovieTitle		:	$32.
			Movie1			:	$32.
			Movie2			:	$32.
			Movies_List		:	$64.
			;

	FORMAT	Start_Date			DATE9.;
	FORMAT	Rental_Date			DATE9.;

DATALINES;
12345	13-Mar-13	13-Mar-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	13-Mar-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	8-Apr-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Gone Girl	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	4-May-13	30	Titanic	Gone Girl	Titanic	Gone Girl,Titanic
12345	13-Mar-13	11-Feb_15	30	Hunger Games	Hunger Games	Titanic	Hunger Games,Titanic
12345	13-Mar-13	11-Feb_15	30	Titanic	Hunger Games	Titanic	Hunger Games,Titanic
67890	10-Aug-14	10-Aug-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	8-Sep-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	17-Oct-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	29-Nov-14	30	Pitch Perfect	Pitch Perfect	Pitch Perfect
67890	10-Aug-14	31-Dec-14	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	10-Feb-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	22-May-15	30	Booksmart	Booksmart	Booksmart
67890	10-Aug-14	29-Jun-15	30	Booksmart	Booksmart	Booksmart
;
RUN;

PROC	SORT		DATA=Cust_Movie_Data;
	BY	Customer	Rental_Date;
RUN;

DATA	Cust_Movie_Data_Switch;
	DROP	_:;

	SET		Cust_Movie_Data;
		BY	Customer	Rental_Date;

	ARRAY	Prior_Movies	[*]		$32	_Prior_Movie1	_Prior_Movie2;

	FORMAT	_Prior_Rental_Date		DATE9.;
	FORMAT	_Prior2_Rental_Date		DATE9.;
	FORMAT	_Temp_Date				DATE9.;

	LENGTH	_Prior_Movie1			$32;
	LENGTH	_Prior_Movie2			$32;
	LENGTH	_Prior2_Movie1			$32;
	LENGTH	_Prior2_Movie2			$32;
	LENGTH	_Temp1					$32;
	LENGTH	_Temp2					$32;

	Switch_Status					=	'No ';

	_Prior_Customer					=	LAG(Customer);
	_Prior2_Customer				=	LAG2(Customer);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Prior2_Rental_Date				=	LAG2(Rental_Date);
	_Prior_Movie1					=	LAG(Movie1);
	_Prior_Movie2					=	LAG(Movie2);
	_Prior2_Movie1					=	LAG2(Movie1);
	_Prior2_Movie2					=	LAG2(Movie2);
	_Prior_Rental_Date				=	LAG(Rental_Date);
	_Temp1							=	Prior_Movies[1];
	_Temp2							=	Prior_Movies[2];

	IF	FIRST.Customer																	THEN
		DO;
			CALL	MISSING(		_Prior_Customer
									,_Prior_Rental_Date
									,_Prior2_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2
									,_Prior_Rental_Date);
		END;
	
	IF	Rental_Date					=	_Prior_Rental_Date								THEN
		IF	_Prior2_Customer		=	Customer										THEN
			DO;
				_Prior_Rental_Date	=	_Prior2_Rental_Date;
				_Prior_Movie1		=	_Prior2_Movie1;
				_Prior_Movie2		=	_Prior2_Movie2;
			END;
		ELSE
			DO;
				CALL	MISSING(	_Prior_Rental_Date
									,_Prior_Movie1
									,_Prior_Movie2);
			END;

	IF	NOT	MISSING(_Prior_Rental_Date)													THEN
		_Temp_Date					=	INTNX('DAYS', _Prior_Rental_Date, Grace_Period);

	IF	NOT	MISSING(_Prior_Rental_Date)													AND
		Rental_Date					>	INTNX('DAYS', _Prior_Rental_Date, Grace_Period)	THEN
		IF	Movie1				NOT	IN	Prior_Movies									OR
			Movie2				NOT	IN	Prior_Movies									THEN
			DO;
				Switch_Status		=	'Yes';
			END;
RUN;
iced_tea
Obsidian | Level 7

hmmm interesting... I commented out the 3rd line:

DATA	Cust_Movie_Data;
	INFILE	DATALINES ;
/*		DSD	DLM='09'x	MISSOVER;*/

... and Cust_Movie_Data produced properly.

I am using SAS 9.4 (not SAS Studio, not Enterprise Guide).

Reeza
Super User
You shouldn't need that step at all, that creates the example data. You should be replacing the data with your own input data set.
iced_tea
Obsidian | Level 7
Yes Reeza, I agree. I just wanted to see Jim's example and how it is playing out for myself.
jimbarbour
Meteorite | Level 14

@iced_tea wrote:

hmmm interesting... I commented out the 3rd line:

DATA	Cust_Movie_Data;
	INFILE	DATALINES ;
/*		DSD	DLM='09'x	MISSOVER;*/

... and Cust_Movie_Data produced properly.

I am using SAS 9.4 (not SAS Studio, not Enterprise Guide).


Well, for your next SAS trivia challenge:  You're using Display Manager.  Display Manager is the Windows front end to SAS.  SAS runs on mainframes, Unix, Linux, Posix (and probably others), so while a lot of people describe Display Manager as "SAS" or "Base SAS" really SAS is a much broader thing.  OK, enough SAS trivia.  🙂

 

I think that makes sense that you would have to comment out that line.  I think what's happening is that when I cut and paste into the code window that tabs are getting converted to spaces.  When I copied your data over to my system, there were tabs in the data, so I added the DSD and DLM parameters.

 

Jim

iced_tea
Obsidian | Level 7
Thank you for educating me. There is so much to learn!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1138 views
  • 7 likes
  • 3 in conversation