I keep getting the following error:
83 data H_A_CASE (keep=key seq_id create_dttm admit_dt dschg_dt ntfy_dttm status_cd);
84 set rda.H_A_CASE_V;
85 where (datepart(create_dttm) between &start. and &end.) and status_cd not in ('5' '6') ;
SYMBOLGEN: Macro variable START resolves to '01JUL2023'd
SYMBOLGEN: Macro variable END resolves to '06JUL2024'd
86 run;
NOTE: There were 19264049 observations read from the data set rda.H_A_CASE_V.
WHERE (DATEPART(create_dttm)>='01JUL2023'D and DATEPART(create_dttm)<='06JUL2024'D) and status_cd not in ('5', '6');
NOTE: The data set WORK.H_A_CASE has 19264049 observations and 7 variables.
NOTE: Compressing data set WORK.H_A_CASE increased size by 10.95 percent.
Compressed is 23540 pages; un-compressed would require 21216 pages.
Block Output Operations 0
87
88
89 data H_A_TRANS_COMM (keep=key trans_id input_dttm);
90 set rda.H_A_TRANS_COMM_V ;
91 where (datepart(input_dttm) between &start. and &end.) ;
SYMBOLGEN: Macro variable START resolves to '01JUL2023'd
SYMBOLGEN: Macro variable END resolves to '06JUL2024'd
92 run;
NOTE: There were 152633956 observations read from the data set RDA.H_A_TRANS_COMM_V.
WHERE (DATEPART(input_dttm)>='01JUL2023'D and DATEPART(input_dttm)<='06JUL2024'D);
NOTE: The data set WORK.H_A_TRANS_COMM has 152633956 observations and 3 variables.
NOTE: Compressing data set WORK.H_A_TRANS_COMM increased size by 31.91 percent.
Compressed is 123366 pages; un-compressed would require 93526 pages.
93
94
95 proc sort data=H_A_TRANS_COMM out=H_A_TRANS_COMM2;
96 by key trans_id descending input_dttm;
97 run;
NOTE: There were 152633956 observations read from the data set WORK.H_A_TRANS_COMM.
NOTE: SAS threaded sort was used.
NOTE: The data set WORK.H_A_TRANS_COMM2 has 152633956 observations and 3 variables.
NOTE: Compressing data set WORK.H_A_TRANS_COMM2 increased size by 31.91 percent.
Compressed is 123366 pages; un-compressed would require 93526 pages.
NOTE: PROCEDURE SORT used (Total process time):
98
99 /*want max input_dttm*/
100 data out.trans_comm ;
101 set H_A_TRANS_COMM2 ;
102 by key trans_id input_dttm;
103 if first.input_dttm;
104
105 run;
ERROR: BY variables are not properly sorted on data set WORK.H_A_TRANS_COMM2.
KEY=H-123456789 INPUT_DTTM=04OCT2023:23:42:53 TRANS_ID=. FIRST.KEY=0 LAST.KEY=1
FIRST.INPUT_DTTM=1 LAST.INPUT_DTTM=1 _ERROR_=1 _N_=4
These two BY statements are not the same
by key trans_id descending input_dttm; /* As seen in PROC SORT */
by key trans_id input_dttm; /* As seen in the DATA step */
These two BY statements are not the same
by key trans_id descending input_dttm; /* As seen in PROC SORT */
by key trans_id input_dttm; /* As seen in the DATA step */
Thank you. How would I pick the first input_dttm for each key? Because I still get multiple input_dttm values listed for the same key values when I just need the latest for each key.
123456789 03JUL2023 12:00:33
123456789 06JUL2023 11:09:45
Thank you
@bhca60 wrote:
Thank you. How would I pick the first input_dttm for each key? Because I still get multiple input_dttm values listed for the same key values when I just need the latest for each key.
123456789 03JUL2023 12:00:33
123456789 06JUL2023 11:09:45
Thank you
What happened to TRANS_ID? Can you provide more details about what you are asking for?
123456789 03JUL2023 12:00:33 9806666
123456789 06JUL2023 11:09:45 8909383
The trans ids are different
@bhca60 wrote:
123456789 03JUL2023 12:00:33 9806666
123456789 06JUL2023 11:09:45 8909383
The trans ids are different
And so WHAT does that mean in terms of your actual problem?
Do you want the last observation per KEY value? If so do you want the values sorted only be the datetime value? Or do you want them sorted by the TRANS_ID value? Or both? If both which takes precedence in the ordering?
Let's make a little dataset where it makes a difference which variables you sort by.
data have;
input key $ input_dttm :datetime. trans_id $ ;
format input_dttm datetime19.;
cards;
A 03JUL2023:12:00:33 ZZZZ
A 06JUL2023:11:09:45 AAAA
;
Now let's try it both ways and see how the results differ.
proc sort data=have;
by key input_dttm trans_id;
run;
data test1;
set have;
by key input_dttm trans_id;
if last.key;
run;
proc sort data=have;
by key trans_id input_dttm;
run;
data test2;
set have;
by key trans_id input_dttm;
if last.key;
run;
proc compare data=test1 compare=test2;
run;
So which observation do you want to select??
@bhca60 wrote:
123456789 03JUL2023 12:00:33 9806666
123456789 06JUL2023 11:09:45 8909383
The trans ids are different
@bhca60 I specifically asked in my previous post "Can you provide more details about what you are asking for?" but you didn't answer that.
So where does that leave us? You asked a question about sorting which was answered. Then you went ahead and gave us a different problem with the same (or similar) data, that has nothing to do with your sorting problem, and you gave us a minimal description of this different problem. It's essentially a new problem, please take some time to describe the problem fully and describe in detail what you want.
Not clear what you are asking for but the code you posted was NOT picking the first datetime value. It was making sure the datetime values where unique.
If you want the maximum datetime value, within some other set of key variables, then do something like:
proc sort data=have;
by a b datetime_var;
run;
data want;
set have;
by a b datetime_var;
if last.b;
run;
So you get one observation per A*B combination and it has the maximum value of DATETIME_VAR.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9.
Lock in the best rate now before the price increases on April 1.
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.