06-13-2016 08:06 AM
Can anyone tell me whether there's an improved way that I could code this in SAS?
Background to my data: my dataset has one row per individual.
For each individual I have their ID, and the date on which they enrolled on a course and the date on which they finished.
Every individual has a JOINDATE, but not every individual has a LEFTDATE.
Each individual can only appear once in the dataset, however they do not leave in the same order in which they joined, e.g. data like this:
Person1, JOINDATE=01may16 LEFTDATE=30may16
Person2, JOINDATE=02may16 LEFTDATE=.
Person3, JOINDATE=03may16 LEFTDATE=25may16
I want to create an output table showing for every Friday during the past year, how many people were on the course on that date.
At the moment I have some macro code which creates a variable for every Friday in the year.
For example for Fri6th June my variable is called PRESENT_06Jun16
I then have set of If-then statement to check whether PRESENT_06Jun16 is true or false for each individual:
e.g. if JOINDATE le '06jun16'd and (LEFTDATE > '06jun16'd OR LEFTDATE=.) then PRESENT_06Jun16=true; else PRESENT_06Jun16=false.
So I'm creating 52 extra variables in this dataset, one for every week, which I'm then tabulating. This seems a really long winded way to get the table that I'm after. Can anyone help with an improved method to do this?
I know that another way to get the same output would be to calculate the cumulative number of people who had joined up to date X and the cumulative number of people who had left up to date X, and subtracting one from the other would give me the number who must be present on date X - but I've not found a handy way to do this method in SAS either. (I want to create my output in SAS and not have to export two tables to Excel and subtract one table from the other in there if you see what I mean.)
Thanks for any help! (I'm using SAS v9.3)
06-13-2016 08:24 AM
As a suggestion, to get a good response to your questions, post test data, in the form of a datastep, and what the output should look like. I have made several assumptions in the code I present below. It basically creates a dataset with a weekly date, then merges your data onto that, and sums up by the weekly date:
data have; Person=1; JOINDATE="01may2016"d; LEFTDATE="30may2016"d; output; Person=2; JOINDATE="02may2016"d; LEFTDATE=.; output; Person=3; JOINDATE="03may2016"d; LEFTDATE="25may2016"d; output; run; data have2 (drop=i); do i=0 to (51*7) by 7; date="06Jun2016"d + i; output; end; format date date9.; run; proc sql; create table WANT as select DATE, count(PERSON) as RESULT from ( select A.DATE, B.PERSON from HAVE2 A left join HAVE B on B.JOINDATE <= A.DATE <= max(today(),B.LEFTDATE) ) group by DATE; quit;