
Find the last date for each registration

Contributor
Posts: 51


I have a large dataset that features multiple rows per User. Each row has a date that corresponds to the period in which data was collected from the User. When a User no longer needs to send data to this dataset, the date column records 'END' rather than a date. I don't know why, but that's the way it is.

 

I want to use a computed column to recode the dates: return the date where there is one, and otherwise return the latest date recorded for the appropriate User.

 

I can do the CASE WHEN bit, but I don't know how to pick out the MAX date for each User. I'm sure it's really easy, but I've not done it before and I'm struggling to know what to search for!

 

Any help would be appreciated!

 

Thanks

 

Paul.

Frequent Contributor
Posts: 117

Re: Find the last date for each registration

Can you provide sample input data and the output you require?

Esteemed Advisor
Posts: 6,646

Re: Find the last date for each registration

So the "date" variable is of type character and contains either a valid date or the "END" string.

I'd first convert the date variable into a real SAS date, and for "END" I'd set an artificial high value (9999-12-31).

Then sort by user and date.

Then a data step like this:

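data want;
set have;
by user; /* needed for first.user; have must be sorted by user and date */
retain keep_date; /* holds the date from the previous row */
if not first.user and date = '31dec9999'd then date = keep_date;
keep_date = date;
drop keep_date;
run;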

It is necessary that at least the first entry for a user contains a valid date.
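
The conversion and sort steps described above are not shown in the thread. A minimal sketch of them, assuming the character variable is called date_char and holds dates as mm/dd/yyyy (the dataset name have_raw and both variable names are placeholders, not taken from the original data):

```sas
/* Assumed names: have_raw (input), date_char (character date or 'END'), user. */
data have;
   set have_raw;
   if upcase(strip(date_char)) = 'END' then date = '31dec9999'd; /* artificial high value */
   else date = input(date_char, mmddyy10.); /* assumed mm/dd/yyyy layout */
   format date date9.;
run;

/* Sort so that the 'END' row comes last within each user. */
proc sort data=have;
   by user date;
run;
```

If the real dates use a different layout, the informat in the input function would need to change to match.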

Contributor
Posts: 51

Re: Find the last date for each registration

Hello, thanks, that looks like it should work. I can't follow the code well enough to know that it will definitely work, but it looks like a good starting point.

 

I've actually solved the problem by creating a second Query Builder that takes only the User ID and the MAX of the associated dates, then joining this table back to the original table.  I've then used a CASE WHEN statement to substitute these dates where the date is 'END'.
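
The Query Builder approach described above was not posted as code. A rough PROC SQL equivalent might look like the sketch below. It assumes a table called have, a character date column called date_char in mm/dd/yyyy form, and a user identifier called user_id (named that way here because USER has a special meaning in PROC SQL). All three names are hypothetical.

```sas
proc sql;
   create table want as
   select h.user_id,
          h.date_char,
          case
             /* 'END' rows take the latest real date for the same user */
             when upcase(strip(h.date_char)) = 'END' then m.last_date
             else input(h.date_char, mmddyy10.)
          end as date format=date9.
   from have as h
        left join
        /* one row per user holding the MAX of the real dates */
        (select user_id,
                max(input(date_char, mmddyy10.)) as last_date
         from have
         where upcase(strip(date_char)) ne 'END'
         group by user_id) as m
        on h.user_id = m.user_id;
quit;
```

The dates are converted to numeric SAS dates before MAX is taken, because the maximum of a character column is an alphabetical comparison and would not give the latest date.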

Frequent Contributor
Posts: 117

Re: Find the last date for each registration

I am a little surprised that this date value is 'END'. Is it a character variable in the dataset?

Contributor
Posts: 51

Re: Find the last date for each registration

It's data that comes from a legacy system. I've no idea how it ends up that way, or why, since the final record would still have had a date, which is then overwritten.

 

Presumably someone somewhere in the past didn't specify the ability to record that a record would be the last one and someone else decided it was more important to know that than the date of the record.  In the context of what the data is used for and the age of the system I can sort of understand that, though it's not ideal!

Grand Advisor
Posts: 10,210

Re: Find the last date for each registration

I don't know how you are reading this data into SAS, but if you are using a data step you might consider adjusting the process to use a custom informat to handle this.

 

proc format library=work;
invalue stoopiddate (upcase)
'END' = '31DEC9999'd
other = [mmddyy10.];
run;

data example;
   informat date stoopiddate.;
   input date;
   format date date9.;
datalines;
01/01/2016
02/02/2016
03/03/2016
end
;
run;

This sets the "END" value to a large date. Alternatively, read it as a special missing value such as .E and use a format that displays .E as "END". I don't know how you will use the resulting value, so either approach may be useful.
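
The special-missing alternative is only described in words above. One way it could be set up is sketched here; the informat and format names (enddate, enddisp) are made up for illustration.

```sas
proc format library=work;
   /* Read 'END' (any case) as the special missing value .E */
   invalue enddate (upcase)
      'END' = .E
      other = [mmddyy10.];
   /* Display .E as END and real dates as date9. */
   value enddisp
      .E    = 'END'
      other = [date9.];
run;

data example2;
   informat date enddate.;
   input date;
   format date enddisp.;
datalines;
01/01/2016
02/02/2016
end
;
run;
```

One possible advantage of this variant: summary functions such as MAX ignore missing values, so the latest real date per user can be taken without first filtering out the "END" rows.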

 
