RALL
Obsidian | Level 7

I am trying to get the login data for users displayed per day.

 

I want only the first login for each user on each day. Users log in multiple times per day, so I need just their first login for that day. I have been struggling to get it per day; any help would be much appreciated.

1 ACCEPTED SOLUTION

Kurt_Bremser
Super User

Since you did not provide your data in a data step as requested, I have to assume that your date and time columns are actually SAS dates and times.

So my previous suggestion turns into this code:

proc sort data=have;
by user date time;
run;

data want;
set have;
by user date;
if first.date;
run;
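
To see the approach on concrete rows, here is a minimal sketch. The data set below is made up for illustration (the user, date and time values are hypothetical, not the poster's data), and it assumes date and time are numeric SAS date and time values, as stated above.

```sas
/* Hypothetical sample data: several logins per user per day */
data have;
  input user $ date :yymmdd10. time :time8.;
  format date yymmdd10. time time8.;
  datalines;
AA 2020-05-06 03:57:12
AA 2020-05-06 03:54:42
BB 2020-05-06 03:56:42
AA 2020-05-07 08:15:00
BB 2020-05-07 09:30:05
BB 2020-05-07 07:45:10
;
run;

/* Order the rows so the earliest login of each user/day comes first */
proc sort data=have;
  by user date time;
run;

data want;
  set have;
  by user date;
  if first.date;   /* keep only the first row of each user/date group */
run;
```

With this input, want should hold four rows, one per user per day: the 03:54:42 and 08:15:00 logins for AA, and the 03:56:42 and 07:45:10 logins for BB.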


6 REPLIES
mkeintz
PROC Star

@RALL wrote:

I am trying to get the login data for users displayed per day.

 

I want only the first login for each user on each day. Users log in multiple times per day, so I need just their first login for that day. I have been struggling to get it per day; any help would be much appreciated.


Sample data please, in the form of a data step.  Followed by the desired result, also in the form of a data step.

 

Help us help you.

 

Edited note: Let's assume your data are sorted chronologically, so the records for each userid are scattered throughout the data set. This is a case where the efficiency of the hash object shines:

 

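data _null_;
  if 0 then set have;                /* define HAVE's variables without reading any rows */
  declare hash h (dataset:'have',ordered:'A');
    h.definekey('userid');           /* one hash entry per userid */
    h.definedata(all:'Y');           /* keep every variable of HAVE */
    h.definedone();
  h.output(dataset:'want');          /* write the hash contents to WANT */
  stop;                              /* end the step explicitly after one pass */
run;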

This works because:

  1. By default, the hash object (think lookup table) keeps only one data item (i.e. one record) per key (i.e. per userid).
  2. Also by default, once a given key is in the hash, it is NOT replaced when another record for the same userid is encountered. That is, it keeps the first instance of each userid.

The ordered:'A' parameter tells SAS to maintain the hash object in ascending order by the key (userid), i.e. your want dataset will be sorted by userid, with one record per userid. This can be a good deal more efficient than sorting the original unreduced dataset by userid.
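
With userid as the only key, the step above returns one row per user across the whole data set, which is the first login overall rather than the first login per day. A minimal sketch of a per-day variant follows. It assumes that have also carries a separate SAS date variable (called date here, a hypothetical name) and that have is already in ascending chronological order. The explicit stop; is an addition that ends the step after a single pass.

```sas
data _null_;
  if 0 then set have;                 /* define HAVE's variables without reading any rows */
  declare hash h (dataset:'have', ordered:'A');
    h.definekey('userid', 'date');    /* one entry per user per day; DATE is an assumed variable name */
    h.definedata(all:'Y');            /* keep every variable of HAVE */
    h.definedone();
  h.output(dataset:'want');           /* WANT comes out ordered by userid, then date */
  stop;                               /* end the step explicitly after one pass */
run;
```

If the data are in descending order instead, as in the sample posted later in this thread, the first row loaded for each key would be the latest login of the day, so the data would need to be sorted first.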

Kurt_Bremser
Super User

Sort by user, date and time, then run a data step with BY user date, and use a subsetting if first.date.

For detailed help, supply example data in a data step with datalines, against which we can write the code.

RALL
Obsidian | Level 7
Date        Time (UTC)  User
6/05/2020   03:57.9     GG
6/05/2020   03:57.4     DD
6/05/2020   03:57.2     AA
6/05/2020   03:57.1     HH
6/05/2020   03:57.0     AA
6/05/2020   03:56.8     II
6/05/2020   03:56.8     AA
6/05/2020   03:56.7     BB
6/05/2020   03:56.4     DD
6/05/2020   03:56.1     HH
6/05/2020   03:55.9     AA
6/05/2020   03:55.7     GG
6/05/2020   03:55.5     FF
6/05/2020   03:55.4     CC
6/05/2020   03:55.1     EE
6/05/2020   03:55.0     DD
6/05/2020   03:54.8     CC
6/05/2020   03:54.8     BB
6/05/2020   03:54.7     AA
Jagadishkatam
Amethyst | Level 16

Please try the code below.

 

proc sort data=have;
by user date time;
run;

data want;
set have;
by user date time;
if first.user;
run;
Thanks,
Jag
RALL
Obsidian | Level 7

Date (UTC)

User Username
2020-08-31:20:46:49

I combined my date and time columns (they were separate columns) to make the column Date (UTC) shown above, and ran the code as follows:

 

data ral.Sigins_dups_Removed1;
set ral.SIGNINS_COMBINED;
by username 'Date (UTC)'n;
if first.username;
run;

 

It is not giving me the first sign-in by day anymore. Any idea what I need to change?

Kurt_Bremser
Super User

Since you did not provide your data in a data step as requested, I have to assume that your date and time columns are actually SAS dates and times.

So my previous suggestion turns into this code:

proc sort data=have;
by user date time;
run;

data want;
set have;
by user date;
if first.date;
run;
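
If the date and time have already been combined into one column, the same idea still applies once the calendar date is split back out. The following is a sketch only: it assumes the combined column is a numeric SAS datetime value, and the names have, username, dt, login_date and have_dated are hypothetical stand-ins for the real ones. A character column would first have to be converted to a numeric datetime.

```sas
/* Derive the calendar date from the datetime so it can serve as a BY variable */
data have_dated;
  set have;
  login_date = datepart(dt);     /* DT is assumed to be a numeric SAS datetime */
  format login_date yymmdd10.;
run;

/* Earliest datetime first within each user/day */
proc sort data=have_dated;
  by username login_date dt;
run;

data want;
  set have_dated;
  by username login_date;
  if first.login_date;           /* first sign-in of each user on each day */
run;
```

Using BY username with the datetime column alone and testing first.username keeps only one row per user overall, which is why the per-day result disappears in that version.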
