BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PhatRam33
Fluorite | Level 6

This is a sample of my dataset.  I want to identify any ID where they purchased the same fruit in consecutive months at any time during a given date range.  We can use 01/01/2017 - 12/31/2017 as an example (this date range will be expanded at a later time).

 

In this example, ID 111 and 222 should be returned only or flagged in some way that I can identify as such.  Looking to code this in PROC SQL, however, it its easier to use a data step instead that will work too.

 

Thanks in advance.

 

ID PURCHASE_DATE FRUIT
111 01/20/2017 Oranges
111 02/01/2017 Oranges
222 04/24/2017 Apples
222 05/12/2017 Apples
222 08/28/2017 Apples
222 10/10/2017 Apples
333 03/14/2016 Pears
333 09/23/2016 Pears
1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Use

 

duration = intck("month", lag(purchase_date), purchase_date);
if not first.fruit and duration <= 1 then flag = 1;

instead.

 

PG

View solution in original post

11 REPLIES 11
Reeza
Super User

Whats your expected output from the input data set?

Astounding
PROC Star

It would be easier if you would reconsider the definition of "consecutive".  How about within 30 days?  After all, consecutive months would treat these differently:

 

03/01/2016 to 04/30/2016 = consecutive

 

01/31/2016 to 03/01/2016 = not consecutive, even though the duration between purchases is smaller

PhatRam33
Fluorite | Level 6

Yes, I would define as consecutive months would be within 30 days.

Astounding
PROC Star

OK, here's a way to set the flags:

 

proc sort data=have;

by id fruit purchase_date;

run;

data want;

set have;

by id fruit;

duration = dif(purchase_date);

if first.fruit = 0 and duration <= 30 then flag=1;

drop duration;

run;

 

Once the observations have been flagged, you still have to decide what should happen next.  And your dates must be valid SAS dates, not character strings.

PhatRam33
Fluorite | Level 6

Thank you very much @Astounding

 

I was able to get this to work and formatted my date field as such with a separate data step:

purchase_date=datepart(purchase_date);

format purchase_date date9.;

 

111 returned with value of 1 for the second row and 222 with a value of 1 for the 4th row.

 

For now I was just planning on creating a proc sql  and query on any Ids where they were flagged with a value of 1 and pull in any additional detail from another data set.  This is a work in progress, however wanted to get this first foundational step complete.

 

I did have a question though, if say the date fell on 01/01/2017 and the next date occurred on 02/25/2017 would fall greater than 30 days, would I need to expand my criteria greater than 30 or is there a way that sas can format the dates as months and then look for consecutive in that manner.  Sorry for the loaded question.

SuryaKiran
Meteorite | Level 14

Hi,

 

try this,

 

DATA HAVE ;
format PURCHASE_DATE mmddyy10.;
INFILE DATALINES DLM=",";
INPUT ID :3. PURCHASE_DATE : mmddyy10. FRUIT $;
DATALINES;
111,01/20/2017,Oranges
111,02/01/2017,Oranges
222,04/24/2017,Apples
222,05/12/2017,Apples
222,08/28/2017,Apples
222,10/10/2017,Apples
333,03/14/2016,Pears
333,09/23/2016,Pears
;
RUN;

PROC SORT DATA=HAVE;
BY ID FRUIT PURCHASE_DATE ;
RUN;
%let Start_Date=01JAN2017;
%let End_Date=31DEC2017;

DATA WANT;
FORMAT Lag_Date mmddyy10.;
SET HAVE;
BY ID FRUIT PURCHASE_DATE ;
Lag_Date=LAG(PURCHASE_DATE);
VAL=(MONTH(PURCHASE_DATE)+YEAR(PURCHASE_DATE))-(MONTH(LAG(PURCHASE_DATE))+YEAR(LAG(PURCHASE_DATE)));
IF FIRST.FRUIT=0 and "&Start_Date"D<=LAG(PURCHASE_DATE)<="&End_Date"D and val=1;
run;
Thanks,
Suryakiran
PhatRam33
Fluorite | Level 6

Thank you @SuryaKiran.

 

Would your code still work if say the end_date occurred in a different year than the start_date as say 01/31/2018?

 

 

PGStats
Opal | Level 21

Use

 

duration = intck("month", lag(purchase_date), purchase_date);
if not first.fruit and duration <= 1 then flag = 1;

instead.

 

PG
PhatRam33
Fluorite | Level 6

Thank you @PGStats.  Similar question to what I asked @SuryaKiran, would this work if my end date fell under a different year such as one in 2018?

 

I appreciate all the help everyone! I am truly grateful!

 

PGStats
Opal | Level 21

Don't be afraid to do your own testing, it is an excellent learning opportunity. Yes, SAS date functions work across years.

PG
PhatRam33
Fluorite | Level 6

Many thanks will do just that and mark solution later today.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 2043 views
  • 1 like
  • 5 in conversation