BookmarkSubscribeRSS Feed
Truc
Calcite | Level 5

Hi all

 

I am trying to find the number of females who have entered the shop within the past month with the data I have. I have written the code but somehow it doesn't seem to work. Your help is greatly appreciated.

 

Let me describe my data. I have 2 datasets:

Data1: this data has the shop id and date  and looks something like this

Shop_id             Date

1                        23/3/2015

1                        1/4/2015

2                        1/3/2015

2                         1/5/2015

 

Data2: this data has the shop id, the date of customer visits, the customer id and the gender of customer (0 for male and 1 for female)

Shop_id             Date                     Cust_id              Gender

1                         1/3/2015               113                     1

1                         1/4/2015               115                     0

2                         3/3/2015               116                     1

 

I want to create a new dataset, data3, which has the shop_id, the date and the number of female customers who have visited the shop in the past 1 month (prior to the date). For example, in row 1, i want to insert a new column, no_female, which is the number of females who have visited the shop 1 from 23/2/2015 to 22/3/2015. 

 

This is the code I write, but it doesn't seem to work:

 

Proc sql;

Create table data3 as

Select a.*,sum(b.gender) as no_female

from data1 as a, data2 as b

where a.shop_id = b.shop_id and -30 <= intck('day',a.date,b.date) < 0

group by a.shop_id, a.date;

quit;

 

Your help is greatly appreciated!! Thanks!!!

2 REPLIES 2
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like the below, note that it is not optimal coding, it is more to describe the thinking behind it:

data have1;
  informat date date9.;
  input Shop_id Date;
datalines;
1 23MAR2015
1 01APR2015
2 01MAR2015
2 01MAY2015
;
run;

data have2;
  informat date date9.;
  input Shop_id Date Cust_id Gender;
datalines;
1 01MAR2015 113 1
1 01APR2015 115 0
2 03MAR2015 116 1
;
run;

proc sql;
  create table WANT as
  select  distinct  SHOP_ID
                    ,DATE
                    ,count(CUST_ID) as RESULT
  from  (select A.*
                ,B.CUST_ID
         from   HAVE1 A
         left join (select * from HAVE2 where GENDER=1) B
         on     A.SHOP_ID=B.SHOP_ID
         and    intnx('month',A.DATE,-1) <= B.DATE <= A.DATE)
  group by SHOP_ID,DATE;
quit;

Ok, so in the subquery, the from (), I join all female records back to the main dataset based on date being between date-1 month and date.  Then in the main query I count these using a by group.

ballardw
Super User

First thing: make sure your dates as SAS date valued variables.

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 631 views
  • 0 likes
  • 3 in conversation