BookmarkSubscribeRSS Feed
anirudhs
Obsidian | Level 7

Hi 

I Am looking for the highlighted data to be extracted for the following data 

 

Identifier_LHS MAX_of_Date_Loaded_LHS Status_LHS Work_Status_LHS MAX_of_MAX_of_Version
153553 24MAR18:18:31:58 Clear Unworked 1
153553 24MAR18:18:32:16 Clear Unworked 2
153553 24MAR18:18:32:17 Clear Unworked 3
153553 24MAR18:18:32:20 Clear Worked 4
14297488 27FEB18:16:56:53 Suspect Worked 2
14297488 28MAR18:16:39:42 Suspect Unworked 3

 

Tried using the max of date and max of version ...but not getting the desired o/p

9 REPLIES 9
novinosrin
Tourmaline | Level 20

sounds like basic sql exercise 

data have;
input Identifier_LHS MAX_of_Date_Loaded_LHS :datetime20. Status_LHS $ Work_Status_LHS :$15. MAX_of_MAX_of_Version;
format MAX_of_Date_Loaded_LHS datetime20.;
datalines;
153553 24MAR18:18:31:58 Clear Unworked 1
153553 24MAR18:18:32:16 Clear Unworked 2
153553 24MAR18:18:32:17 Clear Unworked 3
153553 24MAR18:18:32:20 Clear Worked 4
14297488 27FEB18:16:56:53 Suspect Worked 2
14297488 28MAR18:16:39:42 Suspect Unworked 3
;

proc sql;
create table want as
select *
from have
where Work_Status_LHS='Unworked'
group by Identifier_LHS
having MAX_of_Date_Loaded_LHS=max(MAX_of_Date_Loaded_LHS);
quit;
anirudhs
Obsidian | Level 7
no the status can vary ...to be unworked,clear,worked,etc.
ballardw
Super User

What is the rule that tells us the data contains the correct record?

Your statement of max/min not working indicates the datatime alone is insufficient (not that the highlighted first record was either a max or min for the identifier). Also "no the status can vary ...to be unworked,clear,worked,etc." does not tell what the selection rule might be.

 

Would it by any chance be that you want  MAX_of_MAX_of_Version=3 only?

anirudhs
Obsidian | Level 7

Identifier_LHS Date_Loaded_LHS Status_LHS Work_Status_LHS MAX_of_Version Calculation
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0

 

want to fetch the latest record with latest date and version is greater max.

 

but there are cases like record highlighted in blue where the date and time are the same. 

so after using the max function for date and version it dose not work. 

 

novinosrin
Tourmaline | Level 20

why not just apply a simple distinct?

 

data have;
input Identifier_LHS Date_Loaded_LHS :datetime20. Status_LHS $ Work_Status_LHS :$15. MAX_of_Version Calculation;
format Date_Loaded_LHS datetime20.;
datalines;
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked 4 1
153553 24MAR18:18:32:17 Clear Unworked 4 1
153553 24MAR18:18:32:16 Clear Unworked 4 1
153553 24MAR18:18:31:58 Clear Unworked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 27FEB18:16:56:53 Suspect Worked 4 1
14297488 28MAR18:16:39:42 Suspect Unworked 4 0
;

proc sql;
create table want as
select distinct Identifier_LHS ,Date_Loaded_LHS ,Status_LHS, Work_Status_LHS ,MAX_of_Version ,Calculation
from have
group by Identifier_LHS
having Date_Loaded_LHS=max(Date_Loaded_LHS);
quit; 
anirudhs
Obsidian | Level 7
yes even if i apply distinct then i am getting the following output:
Identifier_LHS MAX_of_Date_Loaded_LHS Status_LHS Work_Status_LHS MAX_of_MAX_of_Version
153553 24MAR18:18:31:58 Clear Unworked 1
153553 24MAR18:18:32:16 Clear Unworked 2
153553 24MAR18:18:32:17 Clear Unworked 3
153553 24MAR18:18:32:20 Clear Worked 4
14297488 27FEB18:16:56:53 Suspect Worked 2
14297488 28MAR18:16:39:42 Suspect Unworked 3

novinosrin
Tourmaline | Level 20

I tested my code with the sample you gave and it seems to work fine

here is the log:

 


113 proc sql;
114 create table want as
115 select distinct Identifier_LHS ,Date_Loaded_LHS ,Status_LHS,
115! Work_Status_LHS ,MAX_of_Version ,Calculation
116 from have
117 group by Identifier_LHS
118 having Date_Loaded_LHS=max(Date_Loaded_LHS);
NOTE: The query requires remerging summary statistics back with the
original data.
NOTE: Table WORK.WANT created, with 2 rows and 6 columns.

119 quit;
NOTE: PROCEDURE SQL used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

mkeintz
PROC Star

You apparently want one record per id at latest datetime.  Then what if you have this data, with tied records at maximum datetime 24MAR18:18:32:20?

 

153553 24MAR18:18:32:20 Clear Unworked 4 1
153553 24MAR18:18:32:20 Clear Worked   4 1
153553 24MAR18:18:32:20 Clear Unworked 5 1

 

These records have distinct values in other fields.  Then what is your rule for choosing a record?  The proc sql "select distinct" approach would yield all three.  Is that what you want?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
anirudhs
Obsidian | Level 7
thats what the querry is so we taking max date a max version would help . i guess

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 890 views
  • 0 likes
  • 4 in conversation