Hi all
I am trying to find the number of females who have entered the shop within the past month with the data I have. I have written the code but somehow it doesn't seem to work. Your help is greatly appreciated.
Let me describe my data. I have 2 datasets:
Data1: this data has the shop id and date and looks something like this
Shop_id Date
1 23/3/2015
1 1/4/2015
2 1/3/2015
2 1/5/2015
Data2: this data has the shop id, the date of customer visits, the customer id and the gender of customer (0 for male and 1 for female)
Shop_id Date Cust_id Gender
1 1/3/2015 113 1
1 1/4/2015 115 0
2 3/3/2015 116 1
I want to create a new dataset, data3, which has the shop_id, the date and the number of female customers who have visited the shop in the past 1 month (prior to the date). For example, in row 1, i want to insert a new column, no_female, which is the number of females who have visited the shop 1 from 23/2/2015 to 22/3/2015.
This is the code I write, but it doesn't seem to work:
Proc sql;
Create table data3 as
Select a.*,sum(b.gender) as no_female
from data1 as a, data2 as b
where a.shop_id = b.shop_id and -30 <= intck('day',a.date,b.date) < 0
group by a.shop_id, a.date;
quit;
Your help is greatly appreciated!! Thanks!!!
Something like the below, note that it is not optimal coding, it is more to describe the thinking behind it:
data have1;
informat date date9.;
input Shop_id Date;
datalines;
1 23MAR2015
1 01APR2015
2 01MAR2015
2 01MAY2015
;
run;
data have2;
informat date date9.;
input Shop_id Date Cust_id Gender;
datalines;
1 01MAR2015 113 1
1 01APR2015 115 0
2 03MAR2015 116 1
;
run;
proc sql;
create table WANT as
select distinct SHOP_ID
,DATE
,count(CUST_ID) as RESULT
from (select A.*
,B.CUST_ID
from HAVE1 A
left join (select * from HAVE2 where GENDER=1) B
on A.SHOP_ID=B.SHOP_ID
and intnx('month',A.DATE,-1) <= B.DATE <= A.DATE)
group by SHOP_ID,DATE;
quit;
Ok, so in the subquery, the from (), I join all female records back to the main dataset based on date being between date-1 month and date. Then in the main query I count these using a by group.
First thing: make sure your dates as SAS date valued variables.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.