Solved: Find repeating users between set weeks

darklord · Posted 09-11-2023 01:43 PM

Hello!

I want to find repeated users who always do cash withdrawals between set weeks.

Right now I have it set up in a way where I have created tables for each week containing userid, transaction type and dates.

Tables: W1,W2,W3

Now to find the same users present every week I figure an inner join between the tables would do the job

PROC SQL;
Select a.id
FROM W1 A JOIN W2 B ON.. JOIN W3 C ON..
WHERE transaction type = ;
QUIT;

I assume this would give the common users each week?

But what if I have created one big table with all the dates included. How would I go about in finding the repeated users each week?

Would keeping the tables separate make more sense?

ballardw · Posted 09-11-2023 04:07 PM

IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.

Proc summary data=have nway;
   class idvariable datevariable;
   format datevariable weeku5. ;
   output out=want (drop=_type_ where=(_freq_>1));
run;

There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.

The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".

If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.

View solution in original post

ballardw · Posted 09-11-2023 04:07 PM

IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.

Proc summary data=have nway;
   class idvariable datevariable;
   format datevariable weeku5. ;
   output out=want (drop=_type_ where=(_freq_>1));
run;

There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.

The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".

If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.

darklord · Posted 09-12-2023 04:47 AM

Hello,
Thanks for this. It did help in the frequency of the users. Which helped in giving a nice picture of the repeat ones.

I decided not to put the date in since it wasn't very much needed 🙂

HB · Posted 09-11-2023 04:38 PM

To answer the SQL/Design part of it, no. I would have one big table with the value of a week field differentiating between weeks.

I would try to go down the road @ballardw is suggesting.

If you don't want to do that or don't understand it:

data transactions;
   input week customerid;
datalines;
1 1
2 1
3 1
4 1
5 1
6 1
1 2
3 2
6 2
;
run;


proc sql;
select 
customerid
from 
transactions
where
week >=2 and week<=4
group by customerid
having count(customerid) = 3;
quit;

shows that

customerid

1

Customer 1 is your loyal weekly with this data.

darklord · Posted 09-12-2023 04:48 AM

Hello,
Thanks for the help, I did end up using what
@ballardw suggested and it worked well!
Should keep summary in mind for next time use!

Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

Re: Find repeating users between set weeks

SAS Innovate 2025: Call for Content

Click image to register for webinar

Classroom Training Available!