Hello!
I want to find repeated users who always do cash withdrawals between set weeks.
Right now I have it set up in a way where I have created tables for each week containing userid, transaction type and dates.
Tables: W1,W2,W3
Now to find the same users present every week I figure an inner join between the tables would do the job
PROC SQL; Select a.id FROM W1 A JOIN W2 B ON.. JOIN W3 C ON.. WHERE transaction type = ; QUIT;
I assume this would give the common users each week?
But what if I have created one big table with all the dates included. How would I go about in finding the repeated users each week?
Would keeping the tables separate make more sense?
IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.
Proc summary data=have nway; class idvariable datevariable; format datevariable weeku5. ; output out=want (drop=_type_ where=(_freq_>1)); run;
There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.
The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".
If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.
IF you have actual SAS date values then Proc Summary and a careful selection of format should work for a long data set as described to find repeated users for a given week. This is the approach I would take instead of working with multiple tables.
Proc summary data=have nway; class idvariable datevariable; format datevariable weeku5. ; output out=want (drop=_type_ where=(_freq_>1)); run;
There are 3 different week formats, WeekU WeekV and WeekW with the differences being how the weeks crossing the calendar year are treated and the start day of a week.
The output data set will have the two class variables and the automatic variable _freq_ for the count of the number of observations that have the class variable combination. The date variable will look something like 19W03 (week 3 of 2019) but is still a date variable and changing the format will make it appear "nicer".
If you don't like the format approach to creating group you could take another step in a data step or sql to use the INTNX function to add a variable that represents the start or end day of a week and use that to group on.
To answer the SQL/Design part of it, no. I would have one big table with the value of a week field differentiating between weeks.
I would try to go down the road @ballardw is suggesting.
If you don't want to do that or don't understand it:
data transactions;
input week customerid;
datalines;
1 1
2 1
3 1
4 1
5 1
6 1
1 2
3 2
6 2
;
run;
proc sql;
select
customerid
from
transactions
where
week >=2 and week<=4
group by customerid
having count(customerid) = 3;
quit;
shows that
customerid |
1 |
Customer 1 is your loyal weekly with this data.
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.