Solved: Calculate number of logins over days and select records with at least ... - Page 2

PharmlyDoc · Posted 06-18-2021 09:23 AM

Bingo!

proc sort data=want;
by userid;
run;

proc sort data=allrecords;
by userid;
run;

data combine;
merge allrecords(in=a) want(in=b);
by userid;
if a;
if a and b then flag=1;
else flag=0;
run;

proc print data=combine;
run;

tarheel13 · Posted 06-18-2021 09:25 AM

Here is an example of inexact matching using SQL. Could have used logic like this to find the logins within 30 days. I have SQL notes if anyone wants them. Personally, I am a huge fan of SQL and find it advantageous to know it. I think I got the logic wrong in my query but this would have been how to do it. It is nice to be able to think of more than one solution to programming problems.

TITLE 'Example of a match on inexact values
PROC SQL; SELECT a.pt_id, a.admdate LABEL="Admission", b.admdate LABEL="Re-admission within 35 days"
FROM ex.admits AS a, ex.admits AS b
WHERE a.pt_id = b.pt_id AND /* match on patient ID */ a.admdate < b.admdate AND /* only if admit before re-admit */ b.admdate <= (a.admdate+35); /* only if re-admit w/in 35 days */

PharmlyDoc · Posted 06-18-2021 04:40 PM

I'm actually not getting what I believe to be correct from this.

The tables that I'm using have thousands of users. And for some reason when I merge the want table with the allrecords table, the resulting combine table has more distinct userids than the want table, which doesn't make sense. The numbers should be the same or less.

tarheel13 · Posted 06-18-2021 04:42 PM

Well, the code I gave you was assuming that want and all_records are both at 1 row per userid. Are they both not at user-id level?

PharmlyDoc · Posted 06-18-2021 07:39 PM

The final derived want dataset has one userid per row (no duplicate user id's), likewise all_records has 1 row per user id (no duplicates). Both of these tables have multiple columns but each table has only one column that contains userid's.

tarheel13 · Posted 06-18-2021 08:16 PM

I was asking if it was 1 row per userid. If it’s not, then that might be why it’s at the wrong level.

PharmlyDoc · Posted 06-18-2021 09:18 PM

Yes, each userid only has one row associated with it per table.

This SQL code does the same thing as the data step + merge statement, but the SQL takes much longer to run. I'll dig back in at the code tomorrow.

proc sql noprint;
select s.*, (case when r.userid is null then 0 else 1 end) as flag
from allrecords s left join
want r
on r.userid = s.userid;
quit;

tarheel13 · Posted 06-18-2021 09:21 PM

Well, we haven’t seen your program but I tested the proc sort and merge with the datasets in this thread and got the desired results you described. can help better if we see code or you post an example of output that isn’t what you expected.

PharmlyDoc · Posted 06-19-2021 02:38 PM

@andreas_lds

@tarheel13

Everything comes out correctly with my real dataset with the following code:

proc summary data=work.records nway;
   class userid login_datetime;
   format login_datetime dtmonyy7.;
   output out=work.counted(drop=_type_ rename=(_freq_ = count));
run;

For some reason SAS gives me duplicate userids when I use the keep userid; statement with my real dataset.

But the following code appears to work and gives me the results that I got with the test data that I included in this post.

data work.want;
   set work.counted;
   by userid;
   
   retain consec_month last_login;
   
   login_date = intnx('month', datepart(login_datetime), 0, 'b');
   
   if first.userid then do;
      consec_month = 1;
   end;
   else do; 
      if intck('month', last_login, login_date) = 1 then do;
         consec_month = consec_month + 1;
      end;
      else do;
         consec_month = 0;
      end;   
   end;
  
   last_login = login_date;
  
   if consec_month >= 4 ;
/* deleted  keep userid;  */
run;

proc print data=want;
run;

proc sql ;
  create table userid_active as
    select distinct(userid) as USER_ID
      from want
  ;
quit;

andreas_lds · Posted 06-22-2021 12:57 AM

@PharmlyDoc wrote:

@andreas_lds

@tarheel13

Everything comes out correctly with my real dataset with the following code:
proc summary data=work.records nway;
   class userid login_datetime;
   format login_datetime dtmonyy7.;
   output out=work.counted(drop=_type_ rename=(_freq_ = count));
run;
For some reason SAS gives me duplicate userids when I use the keep userid; statement with my real dataset.

But the following code appears to work and gives me the results that I got with the test data that I included in this post.
data work.want;
   set work.counted;
   by userid;
   
   retain consec_month last_login;
   
   login_date = intnx('month', datepart(login_datetime), 0, 'b');
   
   if first.userid then do;
      consec_month = 1;
   end;
   else do; 
      if intck('month', last_login, login_date) = 1 then do;
         consec_month = consec_month + 1;
      end;
      else do;
         consec_month = 0;
      end;   
   end;
  
   last_login = login_date;
  
   if consec_month >= 4 ;
/* deleted  keep userid;  */
run;

The duplicates are a result of the change in the subsetting-if you implemented.

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

Re: Calculate number of logins over days and select records with at least 1 login per month?

SAS Innovate 2025: Register Now

SAS Training: Just a Click Away