BookmarkSubscribeRSS Feed
user1
Calcite | Level 5

Hi All,

I'd like to ask if you can help about this. Thanks a lot!!!

data new;

input ID store_number visit_date yymmdd10.;

format visit_date yymmdd10.;

datalines;

1 21 20050615

1 21 20050708

1 21 20051023

1 28 20041203

2 10 20060123

2 60 20050709

2 19 20060825

3 25 20041010

3 91 20051201

3 64 20040609

4 13 20050606

5 26 20040809

5 26 20030911

;

run;

I’d like to get a new dataset to show that if a consumer has regular store to visit. That is, if he/she visited the same store(store number only between 10-29)more than 50% of his/her total store visits within 2 years of period, then he/she had regular store to visit. (But no including the person who only had one store to visit during the period, such as consumer #4). The new dataset should be like the following table.

Id

Regular_store_to_visit

(yes=1, no=0)

1

1

2

0

3

0

4

.

5

1

3 REPLIES 3
Haikuo
Onyx | Level 15

Ok, this is my try, it is really mouthful, and potentially slow,  but seems working on your data as is:

data new;

input ID store_number visit_date yymmdd10.;

format visit_date yymmdd10.;

datalines;

1 21 20050615

1 21 20050708

1 21 20051023

1 28 20041203

2 10 20060123

2 60 20050709

2 19 20060825

3 25 20041010

3 91 20051201

3 64 20040609

4 13 20050606

5 26 20040809

5 26 20030911

;

proc sql;

select a.id, case when missing(d) then .

                  else sum(case when sv/d>0.5 then 1 when sv/d<=0.5 then 0 else . end)>0

                  end  as regular_visit  from

  (select id, store_number, count(*) as sv from new group by id, store_number having

          intck('year',min(visit_date),max(visit_date))<=2) a

              left join

(select id,  count(*) as d from new group by id having count(*)>1) b

on a.id=b.id group by a.id;

quit;

Haikuo


Astounding
PROC Star

user1,

This is a little bit of a shot in the dark, since you omitted some key requirements.  You haven't really told us about how to limit the data to 2 years per customer.  And you haven't told us whether store_numbers outside the range of 10-29 should contribute to the denominator of our percent calculations.

This program is easily adjustable, depending on what the assumptions ought to be.  So assuming you have already performed your subsetting based on the dates, this is a reasonable approach.

proc sort data=new out=have;

   by id;

   where (10 <= store_number <= 29);

run;

proc freq data=have order=freq;

   by id;

   tables store_number / noprint out=counts;

run;

data want;

   set counts;

   by id;
   if first.id;

   if last.id then regular_store=.;

   else regular_store = (percent > 50);

run;

Note that this also keeps other information you may find interesting ... the most frequently visited store number, the number of visits, and the percent.  Also note that customers with zero visits to the stores of interest will not appear in the final data set.

Good luck.

user1
Calcite | Level 5

Thanks very much for your help!

Best regards.

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1342 views
  • 0 likes
  • 3 in conversation