Sorry to bother, but if anyone has a suggestion I'd be very grateful! I know I've received a lot of help here before, and I'm hoping someone can let me know what's happening here.
I have a dataset that looks like the following:
ID, MonthYear_r2, ODED_date1, ODHD_date1, ODOO_date1
1, 1, , ,
1, 2, , ,
1, 3, 3060, ,
1, 4, , ,
2, 1, , 3040,
2, 2, ,
3, 1, , 2945, 2965
3, 2, , ,
I hope this makes sense. There's an ID, a month counter, and then numbers representing the date of an event. I'm looking to flag, with an indicator variable, the row that holds the maximum value across the three event columns (ODED_date1, ODHD_date1, and ODOO_date1). I then want to use this indicator to keep that row and the 24 months that follow the event.
Does anyone have an idea where this code might be having a hiccup? I'm receiving an "expression using equals has components that are of different data types" message in my log. Or does anyone have a better idea on how to achieve my result?
Thank you so much in advance!
PROC SQL;
CREATE TABLE test4 as
SELECT *,
case
when (ODED_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODHD_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODOO_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) then 1
else 0
end as max_OD_date1
from test3
group by ID
;
QUIT;
** keeps only observations 24 months after the initial event **;
DATA test5;
SET test4;
BY ID;
IF FIRST.ID THEN COUNTER=0;
IF COUNTER EQ 0 and max_OD_date1 EQ 1 THEN DO;
COUNTER=1;
OUTPUT;
END;
ELSE IF 1<=COUNTER<=24 THEN DO;
COUNTER+1;
OUTPUT;
END;
RUN;
Please post the log with the code and error message into a code box opened with the forum {i} icon. The error message may provide details of where the error is occurring, and posting in the code box reduces the likelihood of the forum reformatting the text and removing some of the diagnostic information.
A likely issue revolves around this type of code:
max(of ODED_date1 ODHD_date1 ODOO_date1)
I would expect an error message underlining the first variable as the MAX function in SQL is not the same as the max function in a data step that uses "of".
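One possible way to stay in PROC SQL, offered as a sketch rather than a tested solution: drop the "of" and separate the arguments with commas. With more than one argument, PROC SQL treats max() as the row-wise SAS function; with a single argument it is the summary function applied within each GROUP BY group. The extra "is not missing" test is an assumption about what you want for IDs that have no event at all, and the ORDER BY is there because the later DATA step uses BY ID.

```sas
proc sql;
  create table test4 as
  select *,
         case
           /* inner max(a, b, c): row-wise maximum of the three columns      */
           /* outer max(...):     summary function, largest row maximum per ID */
           when max(ODED_date1, ODHD_date1, ODOO_date1) is not missing
            and max(ODED_date1, ODHD_date1, ODOO_date1)
                = max(max(ODED_date1, ODHD_date1, ODOO_date1))
           then 1
           else 0
         end as max_OD_date1
  from test3
  group by ID
  order by ID, MonthYear_r2;
quit;
```

Expect a note in the log about remerging summary statistics with the original data; that is how the per-ID maximum gets compared against every row.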
Hi! Sorry- I'm not sure why this didn't post my reply yesterday... but I think you're right about the "of" statement in the maximum, do you have any ideas how to get around this?
85
86
87   PROC SQL;
88   CREATE TABLE test4 as
89   SELECT *,
90   case
91   when (ODED_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODHD_date1 eq
     ---------- ----------
     22 22
     202 202
91 ! max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODOO_date1 eq max(of ODED_date1 ODHD_date1
91 ! ODOO_date1)) then 1
91   when (ODED_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODHD_date1 eq
91 ! max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODOO_date1 eq max(of ODED_date1 ODHD_date1
     ----------
     22
91 ! ODOO_date1)) then 1
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, (, ), *, **, +, ',', -, '.', /, <, <=, <>, =, >, >=, ?, AND, BETWEEN, CONTAINS, EQ, EQT, GE, GET, GT, GTT, IN, IS, LE, LET, LIKE, LT, LTT, NE, NET, NOT, NOTIN, OR, ^, ^=, |, ||, ~, ~=.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
91   when (ODED_date1 eq max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODHD_date1 eq
91 ! max(of ODED_date1 ODHD_date1 ODOO_date1)) OR (ODOO_date1 eq max(of ODED_date1 ODHD_date1
     ----------
     202
91 ! ODOO_date1)) then 1
ERROR 202-322: The option or parameter is not recognized and will be ignored.
92   else 0
93   end as max_OD_date1
94   from test3
95   group by ID
96   ;
97   QUIT;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
      real time 0.15 seconds
      cpu time 0.07 seconds
98
99
100  ** keeps only observations 24 months after the initial event **;
101  DATA test5;
102  SET test4;
ERROR: File WORK.TEST4.DATA does not exist.
103  BY ID;
104  IF FIRST.ID THEN COUNTER=0;
105  IF COUNTER EQ 0 and max_OD_date1 EQ 1 THEN DO;
106  COUNTER=1;
107  OUTPUT;
108  END;
109  ELSE IF 1<=COUNTER<=24 THEN DO;
110  COUNTER+1;
111  OUTPUT;
112  END;
113  RUN;
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.TEST5 may be incomplete. When this step was stopped there were 0 observations and 2 variables.
NOTE: DATA statement used (Total process time):
      real time 0.04 seconds
      cpu time 0.01 seconds
Are you simply trying to (1) identify the record (within an ID) that has the highest date value for that ID (regardless of whether that value is in ODED_date1, ODHD_date1, or ODOO_date1) and then (2) output that record and the up to 24 records that follow it?
Art, CEO, AnalystFinder.com
I am! Sorry for asking so many questions about it- I thought I could figure it out piecewise but I feel I'm making it more complicated.
Then I think that something like the following does what you want:
/* Sample data; dsd and truncover make blank fields and short rows read as missing */
data test3;
  infile cards dlm=',' dsd truncover;
  input ID MonthYear_r2 ODED_date1 ODHD_date1 ODOO_date1;
  cards;
1, 1, , ,
1, 2, , ,
1, 3, 3060, ,
1, 4, , ,
2, 1, , 3040,
2, 2, ,
3, 1, , 2945, 2965
3, 2, , ,
;

DATA test5 (drop=max counter);
  /* First pass: find the largest date across the three columns for this ID */
  do until (last.id);
    SET test3;
    BY ID;
    if first.id then max=max(of ODED_date1 ODHD_date1 ODOO_date1);
    else max=max(of max ODED_date1 ODHD_date1 ODOO_date1);
  end;
  /* Second pass: output the row holding that date and up to 24 rows after it */
  do until (last.id);
    SET test3;
    BY ID;
    IF FIRST.ID THEN COUNTER=0;
    IF COUNTER eq 0 and max=max(of ODED_date1 ODHD_date1 ODOO_date1) THEN DO;
      COUNTER=1;
      OUTPUT;
    END;
    ELSE IF 1<=COUNTER<=24 THEN DO;
      COUNTER+1;
      OUTPUT;
    END;
  end;
RUN;
Art, CEO, AnalystFinder.com
Wow, I would have never thought to run a do step until a last ID. That worked perfectly!
Seriously, thank you so much!! I've spent the last couple days trying to figure this out.
It's known as a double DOW loop and seemed like a logical way to accomplish what you were trying to do.
Art, CEO, AnalystFinder.com
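For anyone who has not met the pattern before, here is a minimal sketch of a double DOW loop on a different, smaller problem: flagging the tallest student within each sex in SASHELP.CLASS. The data set names class and tallest and the variables max_height and is_tallest are made up for the illustration. Each pass of the DATA step handles one BY group: the first loop reads the group to compute a summary, the second loop reads the same rows again with that summary available.

```sas
/* BY-group processing needs the data sorted by the BY variable */
proc sort data=sashelp.class out=class;
  by sex;
run;

data tallest;
  /* Loop 1: read every row of the group and keep the running maximum.   */
  /* max_height starts as missing at the top of each DATA step iteration, */
  /* and max() ignores missing values, so no explicit reset is needed.    */
  do until (last.sex);
    set class;
    by sex;
    max_height = max(max_height, height);
  end;
  /* Loop 2: read the same rows again and compare each to the group maximum */
  do until (last.sex);
    set class;
    by sex;
    is_tallest = (height = max_height);
    output;
  end;
run;
```

The two SET statements keep independent read positions, which is why the second loop starts again at the first row of the same group.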