BookmarkSubscribeRSS Feed
Truc
Calcite | Level 5

Hi all

 

I am trying to find the number of females who have entered the shop within the past month with the data I have. I have written the code but somehow it doesn't seem to work. Your help is greatly appreciated.

 

Let me describe my data. I have 2 datasets:

Data1: this data has the shop id and date  and looks something like this

Shop_id             Date

1                        23/3/2015

1                        1/4/2015

2                        1/3/2015

2                         1/5/2015

 

Data2: this data has the shop id, the date of customer visits, the customer id and the gender of customer (0 for male and 1 for female)

Shop_id             Date                     Cust_id              Gender

1                         1/3/2015               113                     1

1                         1/4/2015               115                     0

2                         3/3/2015               116                     1

 

I want to create a new dataset, data3, which has the shop_id, the date and the number of female customers who have visited the shop in the past 1 month (prior to the date). For example, in row 1, i want to insert a new column, no_female, which is the number of females who have visited the shop 1 from 23/2/2015 to 22/3/2015. 

 

This is the code I write, but it doesn't seem to work:

 

Proc sql;

Create table data3 as

Select a.*,sum(b.gender) as no_female

from data1 as a, data2 as b

where a.shop_id = b.shop_id and -30 <= intck('day',a.date,b.date) < 0

group by a.shop_id, a.date;

quit;

 

Your help is greatly appreciated!! Thanks!!!

2 REPLIES 2
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Something like the below, note that it is not optimal coding, it is more to describe the thinking behind it:

data have1;
  informat date date9.;
  input Shop_id Date;
datalines;
1 23MAR2015
1 01APR2015
2 01MAR2015
2 01MAY2015
;
run;

data have2;
  informat date date9.;
  input Shop_id Date Cust_id Gender;
datalines;
1 01MAR2015 113 1
1 01APR2015 115 0
2 03MAR2015 116 1
;
run;

proc sql;
  create table WANT as
  select  distinct  SHOP_ID
                    ,DATE
                    ,count(CUST_ID) as RESULT
  from  (select A.*
                ,B.CUST_ID
         from   HAVE1 A
         left join (select * from HAVE2 where GENDER=1) B
         on     A.SHOP_ID=B.SHOP_ID
         and    intnx('month',A.DATE,-1) <= B.DATE <= A.DATE)
  group by SHOP_ID,DATE;
quit;

Ok, so in the subquery, the from (), I join all female records back to the main dataset based on date being between date-1 month and date.  Then in the main query I count these using a by group.

ballardw
Super User

First thing: make sure your dates as SAS date valued variables.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 388 views
  • 0 likes
  • 3 in conversation