Hi all
I am trying to find the number of females who have entered the shop within the past month with the data I have. I have written the code but somehow it doesn't seem to work. Your help is greatly appreciated.
Let me describe my data. I have 2 datasets:
Data1: this data has the shop id and date and looks something like this
Shop_id Date
1 23/3/2015
1 1/4/2015
2 1/3/2015
2 1/5/2015
Data2: this data has the shop id, the date of customer visits, the customer id and the gender of customer (0 for male and 1 for female)
Shop_id Date Cust_id Gender
1 1/3/2015 113 1
1 1/4/2015 115 0
2 3/3/2015 116 1
I want to create a new dataset, data3, which has the shop_id, the date and the number of female customers who have visited the shop in the past 1 month (prior to the date). For example, in row 1, i want to insert a new column, no_female, which is the number of females who have visited the shop 1 from 23/2/2015 to 22/3/2015.
This is the code I write, but it doesn't seem to work:
Proc sql;
Create table data3 as
Select a.*,sum(b.gender) as no_female
from data1 as a, data2 as b
where a.shop_id = b.shop_id and -30 <= intck('day',a.date,b.date) < 0
group by a.shop_id, a.date;
quit;
Your help is greatly appreciated!! Thanks!!!
Something like the below, note that it is not optimal coding, it is more to describe the thinking behind it:
data have1;
informat date date9.;
input Shop_id Date;
datalines;
1 23MAR2015
1 01APR2015
2 01MAR2015
2 01MAY2015
;
run;
data have2;
informat date date9.;
input Shop_id Date Cust_id Gender;
datalines;
1 01MAR2015 113 1
1 01APR2015 115 0
2 03MAR2015 116 1
;
run;
proc sql;
create table WANT as
select distinct SHOP_ID
,DATE
,count(CUST_ID) as RESULT
from (select A.*
,B.CUST_ID
from HAVE1 A
left join (select * from HAVE2 where GENDER=1) B
on A.SHOP_ID=B.SHOP_ID
and intnx('month',A.DATE,-1) <= B.DATE <= A.DATE)
group by SHOP_ID,DATE;
quit;
Ok, so in the subquery, the from (), I join all female records back to the main dataset based on date being between date-1 month and date. Then in the main query I count these using a by group.
First thing: make sure your dates as SAS date valued variables.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.