Hi
I Am looking for the highlighted data to be extracted for the following data
Identifier_LHS MAX_of_Date_Loaded_LHS Status_LHS Work_Status_LHS MAX_of_MAX_of_Version
153553 24MAR18:18:31:58 Clear Unworked 1
153553 24MAR18:18:32:16 Clear Unworked 2
153553 24MAR18:18:32:17 Clear Unworked 3
153553 24MAR18:18:32:20 Clear Worked 4
14297488 27FEB18:16:56:53 Suspect Worked 2
14297488 28MAR18:16:39:42 Suspect Unworked 3
Tried using the max of date and max of version ...but not getting the desired o/p
sounds like basic sql exercise
data have;
input Identifier_LHS MAX_of_Date_Loaded_LHS :datetime20. Status_LHS $ Work_Status_LHS :$15. MAX_of_MAX_of_Version;
format MAX_of_Date_Loaded_LHS datetime20.;
datalines;
153553 24MAR18:18:31:58 Clear Unworked 1
153553 24MAR18:18:32:16 Clear Unworked 2
153553 24MAR18:18:32:17 Clear Unworked 3
153553 24MAR18:18:32:20 Clear Worked 4
14297488 27FEB18:16:56:53 Suspect Worked 2
14297488 28MAR18:16:39:42 Suspect Unworked 3
;
proc sql;
create table want as
select *
from have
where Work_Status_LHS='Unworked'
group by Identifier_LHS
having MAX_of_Date_Loaded_LHS=max(MAX_of_Date_Loaded_LHS);
quit;
What is the rule that tells us the data contains the correct record?
Your statement of max/min not working indicates the datatime alone is insufficient (not that the highlighted first record was either a max or min for the identifier). Also "no the status can vary ...to be unworked,clear,worked,etc." does not tell what the selection rule might be.
Would it by any chance be that you want MAX_of_MAX_of_Version=3 only?
Identifier_LHS Date_Loaded_LHS Status_LHS Work_Status_LHS MAX_of_Version Calculation
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
want to fetch the latest record with latest date and version is greater max.
but there are cases like record highlighted in blue where the date and time are the same.
so after using the max function for date and version it dose not work.
why not just apply a simple distinct?
data have;
input Identifier_LHS Date_Loaded_LHS :datetime20. Status_LHS $ Work_Status_LHS :$15. MAX_of_Version Calculation;
format Date_Loaded_LHS datetime20.;
datalines;
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
;
proc sql;
create table want as
select distinct Identifier_LHS ,Date_Loaded_LHS ,Status_LHS, Work_Status_LHS ,MAX_of_Version ,Calculation
from have
group by Identifier_LHS
having Date_Loaded_LHS=max(Date_Loaded_LHS);
quit;
I tested my code with the sample you gave and it seems to work fine
here is the log:
113 proc sql;
114 create table want as
115 select distinct Identifier_LHS ,Date_Loaded_LHS ,Status_LHS,
115! Work_Status_LHS ,MAX_of_Version ,Calculation
116 from have
117 group by Identifier_LHS
118 having Date_Loaded_LHS=max(Date_Loaded_LHS);
NOTE: The query requires remerging summary statistics back with the
original data.
NOTE: Table WORK.WANT created, with 2 rows and 6 columns.
119 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
You apparently want one record per id at latest datetime. Then what if you have this data, with tied records at maximum datetime 24MAR18:18:32:20?
153553 24MAR18:18:32:20 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:20 Clear Unworked 5 1
These records have distinct values in other fields. Then what is your rule for choosing a record? The proc sql "select distinct" approach would yield all three. Is that what you want?
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.