DATA Step, Macro, Functions and more

Hash table calculations?

Reply
Frequent Contributor
Posts: 81

Hash table calculations?

Hi,

I have a table that tracks Users and LoginDates

UserID LoginDate

1 '02Jan2015'

1 '03Jan2015'

1 '22Jan2015' 

...

 

I am trying to create a table where I evaluate at each month end reporting period to calculate the activity of a user into the following categories:

Daily User: 15+ days in a month

Else Weekly User: <15days in a month and once every calendar week

Else Monthly User: atleast once in current month

Else Quarterly User: atleast once in last 3 months

Else Annual User atleast once in last year

Else Inactive - no activity in 1 year

 

Because my fact table only capture activity, i'm not sure how to go about evaluating at each month end period to create a table i can use to report on that may look like:

UserID ReportDate Status

1 '31Jan2015' Weekly

1 '29Feb2015' Daily

1 '31Mar2014' Monthly

etc.

 

Every month the total should reflect my entire User base at that period of time (it is additive so every month it will greater than the previous.

 

Thanks in advance for your help!

Super User
Super User
Posts: 7,407

Re: Hash table calculations?

Should be a simple SQL step, and use intnx() to get the date intervals e.g. (this should start you off, without test data (in the form of a datastep) and the fact that I am leaving now prevents further code):

data have;
  userid=1; logindate="02jan2015"d; output;
  userid=1; logindate="03jan2015"d; output;
  userid=1; logindate="22jan2015"d; output;
run;

proc sql;
  create table TMP as
  select  USERID,
          max(LOGINDATE) as LAST_LOGIN format=date9.,
          case  when CALCULATED LAST_LOGIN < intnx("year",today(),-1) then "Over 1 year"
                when intnx("day",today(),-15) <= CALCULATED LAST_LOGIN <= today() then "Within 15 days"
                ...  /* for current month, last 3 months etc. */
                when intnx("year",today(),-1) <= CALCULATED LAST_LOGIN <= today() then "Within 1 year"
                else "" end as RESULT
  from    HAVE
  group by USERID;
quit;
Frequent Contributor
Posts: 81

Re: Hash table calculations?

Thank you for your quick response.
This solution will give me values where there is activity in the fact table. I would like to evaluate and output a record for a user even if they do not have activity in the month of evaluation.
Super User
Super User
Posts: 7,407

Re: Hash table calculations?

[ Edited ]

Create a list of the values you want then join the other onto that:

proc sql;
  create table TMP as
  select  *
  from    (select distinct USERID from HAVE) A
  left join (
  select  USERID,
          max(LOGINDATE) as LAST_LOGIN format=date9.,
          case  when CALCULATED LAST_LOGIN < intnx("year",today(),-1) then "Over 1 year"
                when intnx("day",today(),-15) <= CALCULATED LAST_LOGIN <= today() then "Within 15 days"
                ...  /* for current month, last 3 months etc. */
                when intnx("year",today(),-1) <= CALCULATED LAST_LOGIN <= today() then "Within 1 year"
                else "" end as RESULT
  from    HAVE
  group by USERID) B
  on      A.USERID=B.USERID;
quit;

Or you could do it in a datastep:

proc sort data=have out=want;
  by userid descending logindate;  /* Note the descending keyword - this puts latest first */
run;

data want;
  set want;
  by userid;
  if first.userid then do;
    if intnx('year',today(),-1) < logindate then result="A year before";
    if intnx('day',....
    ...
    output;
  end;
run;

This will output only the first record for each userid, and the if's will populate the flag.  Probably more resource friendly doing it in the datastep if you have lots of data.

 

 

 

 

 

 

 

Ask a Question
Discussion stats
  • 3 replies
  • 196 views
  • 0 likes
  • 2 in conversation