I was using the answer to this question, in order to return the column name with the latest date. My data is in this format:
DATA have;
INPUT id away :DATE9. home :DATE9. travel :DATE9.;
FORMAT away home travel DATE9.;
DATALINES;
1 . . 01JAN2020
2 03JAN2020 . 03FEB2020
3 04FEB2020 01JAN2020 12FEB2020
;
Using the solution to the other question I have this code (I didn't want to include the ID in the array, so I had to drop it, then bring it back in):
DATA want;
SET have(drop=id);
ARRAY dates[*] _numeric_;
SET have;
latest = max(of dates[*]);
maxv = vname(dates[whichn(latest, of dates(*))]);
RUN;
Everything runs fun if I leave out the line maxv = vname..., but leaving it in gives me an error:
ERROR: Array subscript out of range at line 1068 column 18.
What is causing this error to come up in the last line of the code?
HI @MB_Analyst Your code successfully ran when i tested. Here is the log
1716 DATA have;
1717 INPUT id away :DATE9. home :DATE9. travel :DATE9.;
1718 FORMAT away home travel DATE9.;
1719 DATALINES;
NOTE: The data set WORK.HAVE has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1723 ;
1724
1725 DATA want;
1726 SET have(drop=id);
1727 ARRAY dates[*] _numeric_;
1728 SET have;
1729 latest = max(of dates[*]);
1730 maxv = vname(dates[whichn(latest, of dates(*))]);
1731 RUN;
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 3 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Btw, this makes me wonder whether your sample data is a good representative of your real?
Also does any of your records have all missings that eventually makes the MAX value also as missing? Please verify
I initially did this so I could use _NUMERIC_ to define the array, without using the ID variable. I still need the ID variable - is there a better way to achieve this? The data set I'm using has many variables for the array, and are likely to change. Should I force the ID as a character instead?
HI @MB_Analyst Your code successfully ran when i tested. Here is the log
1716 DATA have;
1717 INPUT id away :DATE9. home :DATE9. travel :DATE9.;
1718 FORMAT away home travel DATE9.;
1719 DATALINES;
NOTE: The data set WORK.HAVE has 3 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
1723 ;
1724
1725 DATA want;
1726 SET have(drop=id);
1727 ARRAY dates[*] _numeric_;
1728 SET have;
1729 latest = max(of dates[*]);
1730 maxv = vname(dates[whichn(latest, of dates(*))]);
1731 RUN;
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: There were 3 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 3 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Btw, this makes me wonder whether your sample data is a good representative of your real?
Also does any of your records have all missings that eventually makes the MAX value also as missing? Please verify
Good and I am glad. You are not alone. I have had that issue many times. The thumb rule is always to be very cognizant of missing values and its propagation.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.