05-15-2018 12:01 PM
I have a table that contains order ID numbers.
When an order is updated, the observation for that order isn't updated. Instead, a new observation is inserted into the dataset so there would be two observations for a single order.
There is a variable that identifies the iteration of the observation for a particular order ID. So, the initial order will be iteration 1. If an order is updated, there will be a 2nd record for that order where the iteration number is 2.
How would I query this table to return only the observations with the highest iteration number for each order ID?
05-15-2018 12:14 PM
Well, wouldn't you want to query and filter on the max of the dates for each order ID? To me it makes sense to get the latest inserted row, since the latest insert would have the most recent date, right?
05-15-2018 02:35 PM
Hi novinosrin. I considered this but didn't think it would work, because it would only return observations whose date matches the newest iteration date in the dataset. Am I correct in assuming this?
For example, if the newest iteration date found in the dataset is 1/12/2018, then I would only get records with an order date of 1/12/2018. This isn't what I want, because the latest iteration of some orders wouldn't necessarily have the same date as the maximum iteration date in the dataset.
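The concern above applies to a maximum taken over the whole table, but not to a maximum taken per order. A minimal sketch of the difference, assuming a dataset named have with an order_id column and a date column called iteration_date (the date column name is hypothetical; the thread does not give it):

```sas
proc sql;
  /* Global maximum: keeps only rows dated on the single newest
     date found anywhere in the table (the case described above). */
  create table newest_overall as
  select *
  from have
  where iteration_date = (select max(iteration_date) from have);

  /* Per-order maximum: with GROUP BY, max() is evaluated within each
     order_id, so every order keeps its own most recent row. */
  create table newest_per_order as
  select *
  from have
  group by order_id
  having iteration_date = max(iteration_date);
quit;
```

In the second query, SAS remerges the group-level maximum back onto the detail rows, so expect a log note about remerging summary statistics.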
05-15-2018 12:17 PM - edited 05-15-2018 12:45 PM
proc sql;
  create table want as
  select *
  from have
  group by order_id
  having iteration_number = max(iteration_number);
quit;
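For comparison, the same result can probably be reached without SQL by sorting and keeping the last row of each BY group. This is only a sketch: the sample rows are made up for illustration, and have_sorted is a hypothetical intermediate dataset name.

```sas
/* Hypothetical sample data; column names follow the query above. */
data have;
  input order_id iteration_number;
  datalines;
101 1
101 2
102 1
103 1
103 2
103 3
;
run;

/* Order the rows so the highest iteration comes last within each order. */
proc sort data=have out=have_sorted;
  by order_id iteration_number;
run;

/* last.order_id is 1 only on the final row of each order_id group. */
data want;
  set have_sorted;
  by order_id;
  if last.order_id;
run;
```

With the sample rows above, want holds three observations: order 101 at iteration 2, order 102 at iteration 1, and order 103 at iteration 3. Unlike the HAVING query, this approach returns exactly one row per order even if the highest iteration number is duplicated.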