Hello!
I want to find repeated users who always do cash withdrawals between set weeks.
Right now I have it set up in a way where I have created tables for each week containing userid, transaction type and dates.
Tables: W1,W2,W3
Now to find the same users present every week I figure an inner join between the tables would do the job
PROC SQL; Select a.id FROM W1 A JOIN W2 B ON.. JOIN W3 C ON.. WHERE transaction type = ; QUIT;
I assume this would give the common users each week?
But what if I have created one big table with all the dates included. How would I go about in finding the repeated users each week?
Would keeping the tables separate make more sense?
IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.
Proc summary data=have nway; class idvariable datevariable; format datevariable weeku5. ; output out=want (drop=_type_ where=(_freq_>1)); run;
There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.
The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".
If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.
IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.
Proc summary data=have nway; class idvariable datevariable; format datevariable weeku5. ; output out=want (drop=_type_ where=(_freq_>1)); run;
There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.
The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".
If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.
To answer the SQL/Design part of it, no. I would have one big table with the value of a week field differentiating between weeks.
I would try to go down the road @ballardw is suggesting.
If you don't want to do that or don't understand it:
data transactions;
input week customerid;
datalines;
1 1
2 1
3 1
4 1
5 1
6 1
1 2
3 2
6 2
;
run;
proc sql;
select
customerid
from
transactions
where
week >=2 and week<=4
group by customerid
having count(customerid) = 3;
quit;
shows that
customerid |
1 |
Customer 1 is your loyal weekly with this data.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.