I am trying to display the login data for users per day.
I want only the first login for each user on each day (users log in multiple times per day, so I only want their first login for that day). I have been struggling to get it per day; any help would be much appreciated.
Sample data please, in the form of a data step. Followed by the desired result, also in the form of a data step.
Help us help you.
Edited note: Let's assume your data are sorted chronologically, so the records for each userid are scattered throughout the data set. This is a case where the efficiency of the hash object shines:
data _null_;
if 0 then set have;
declare hash h (dataset:'have',ordered:'A');
h.definekey('userid');
h.definedata(all:'Y');
h.definedone();
h.output(dataset:'want');
run;
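This works because the hash object, when loaded with the dataset: argument and no DUPLICATE argument, keeps only the first record it encounters for each key value. With chronologically sorted input, that is the earliest record for each userid.
The ordered:'A' argument tells SAS to maintain the hash object in ascending order by the key (userid), so your want data set will be sorted by userid, with one record per userid. This can be a good deal more efficient than sorting the original unreduced data set by userid.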
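The question asks for the first login per user per day, whereas a key of userid alone keeps one record per user overall. A minimal sketch of the same hash approach with a composite key follows. It assumes the variables are named userid and date (the thread uses both user and userid, so adjust to your data), that date is a SAS date value, and that have is already in ascending chronological order.

```sas
data _null_;
  /* Define the host variables from HAVE at compile time without reading any rows */
  if 0 then set have;

  /* No DUPLICATE argument: the first record loaded for each key is kept */
  declare hash h (dataset:'have', ordered:'A');

  /* Assumed variable names: one hash entry per user per day */
  h.definekey('userid', 'date');
  h.definedata(all:'Y');
  h.definedone();

  /* Write the reduced table, sorted by userid and date */
  h.output(dataset:'want');
  stop;
run;
```

If the input is in descending order instead, the record kept for each key would be the last login of the day, so the sort order of have matters here.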
Sort by user, date and time, then run a data step with BY user date, and use a subsetting if first.date.
For detailed help, supply example data in a data step with datalines, against which we can write the code.
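As an illustration of what such a data step could look like, here is a small sketch followed by the sort and subsetting IF described above. The rows are invented for illustration only and are not the poster's data; the variable names, the day/month/year date layout and the hh:mm:ss time layout are all assumptions.

```sas
/* Hypothetical sample data: values are made up for illustration */
data have;
  input user $ date :ddmmyy10. time :time8.;
  format date yymmdd10. time time8.;
  datalines;
AA 06/05/2020 03:54:07
BB 06/05/2020 03:54:08
AA 06/05/2020 03:55:09
AA 07/05/2020 08:01:00
BB 07/05/2020 09:15:30
;
run;

/* Order each user's logins chronologically within each day */
proc sort data=have;
  by user date time;
run;

/* Keep only the first record of every user/date group */
data want;
  set have;
  by user date;
  if first.date;
run;
```

Posting data this way lets responders run code against exactly the same variable types and formats that you have.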
Date | Time (UTC) | User |
6/05/2020 | 03:57.9 | GG |
6/05/2020 | 03:57.4 | DD |
6/05/2020 | 03:57.2 | AA |
6/05/2020 | 03:57.1 | HH |
6/05/2020 | 03:57.0 | AA |
6/05/2020 | 03:56.8 | II |
6/05/2020 | 03:56.8 | AA |
6/05/2020 | 03:56.7 | BB |
6/05/2020 | 03:56.4 | DD |
6/05/2020 | 03:56.1 | HH |
6/05/2020 | 03:55.9 | AA |
6/05/2020 | 03:55.7 | GG |
6/05/2020 | 03:55.5 | FF |
6/05/2020 | 03:55.4 | CC |
6/05/2020 | 03:55.1 | EE |
6/05/2020 | 03:55.0 | DD |
6/05/2020 | 03:54.8 | CC |
6/05/2020 | 03:54.8 | BB |
6/05/2020 | 03:54.7 | AA |
Please try the code below:
proc sort data=have;
by user date time;
run;
data want;
set have;
by user date;
if first.date;
run;
Date (UTC) | User | Username |
2020-08-31:20:46:49 |
I combined my date and time columns (they were separate columns) to make the column Date (UTC) as above, and ran the code as follows:
data ral.Sigins_dups_Removed1;
set ral.SIGNINS_COMBINED;
by username 'Date (UTC)'n;
if first.username;
run;
It is not giving me the first sign-in by day anymore. Any idea what I need to change?
Since you did not provide your data in a data step as requested, I have to assume that your date and time columns are actually SAS dates and times.
So my previous suggestion turns into this code:
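/* Order each user's logins chronologically within each day */
proc sort data=have;
  by user date time;
run;

/* first.date is true for the first record of each user/date group */
data want;
  set have;
  by user date;
  if first.date;
run;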
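If only the combined column is available, one possible way to recover a per-day BY variable is to extract the date part of the datetime. The sketch below assumes that 'Date (UTC)'n is a numeric SAS datetime value rather than a character string; login_day and the output data set names are hypothetical names introduced for illustration.

```sas
/* Derive a date-only variable from the combined datetime column */
data signins_with_day;
  set ral.SIGNINS_COMBINED;
  login_day = datepart('Date (UTC)'n);   /* hypothetical helper variable */
  format login_day yymmdd10.;
run;

/* Order each user's sign-ins chronologically within each day */
proc sort data=signins_with_day;
  by username login_day 'Date (UTC)'n;
run;

/* Keep the earliest sign-in for each user on each day */
data first_signin_per_day;
  set signins_with_day;
  by username login_day;
  if first.login_day;
run;
```

Using first.username alone, as in the attempt above, keeps one record per user across the whole data set, which is why the per-day result disappeared.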