How can i achieve the value based on last YEAR (previous row)?
I have a set of data with the Key
MOHX_ID | MOHX_YEAR | MOHX_SEQ | MOHX_ENDST |
With the key listed above, i wish to get the new field "PREV_YEAR_NC" based on previous year's NC value.
Sample as below. Notice red color value is what i want to achieve.
Final table:
data mohx_data;
input MOHX_ID $ MOHX_YEAR MOHX_SEQ MOHX_ENDST NC PREV_YEAR_NC;
datalines;
A 2023 1 0 60 .
A 2023 2 0 30 .
A 2024 1 0 60 60
A 2025 1 0 60 60
B 2023 1 0 25 .
B 2023 2 0 35 .
B 2024 1 0 45 25
B 2025 1 0 60 45
;
run;
Hi @StickyRoll, it looks like its not alway the value of the prvious row you are looking for. So what is the complete rule?
Best, Markus
Hello. It is always based on last year + key
MOHX_ID | MOHX_YEAR | MOHX_SEQ | MOHX_ENDST |
Hi @StickyRoll, and what is the rule by which the key combination, for which the value should be selected, is chosen?
data mohx_data;
input MOHX_ID $ MOHX_YEAR MOHX_SEQ MOHX_ENDST NC;
datalines;
A 2023 1 0 60
A 2023 2 0 30
A 2024 1 0 60
A 2025 1 0 60
B 2023 1 0 25
B 2023 2 0 35
B 2024 1 0 45
B 2025 1 0 60
;
run;
data want;
if _n_=1 then do;
if 0 then set mohx_data(rename=(NC=PREV_YEAR_NC));
declare hash h(dataset:'mohx_data(rename=(NC=PREV_YEAR_NC))');
h.definekey('MOHX_ID', 'MOHX_YEAR', 'MOHX_SEQ');
h.definedata('PREV_YEAR_NC');
h.definedone();
end;
set mohx_data;
call missing(PREV_YEAR_NC);
prev_year=MOHX_YEAR-1;
rc=h.find(key:MOHX_ID,key:prev_year,key:MOHX_SEQ);
drop prev_year rc;
run;
This is a good use case for embedding a LAG function inside an IFN (or IFC) function, as in:
data mohx_data;
input MOHX_ID $ MOHX_YEAR MOHX_SEQ MOHX_ENDST NC;
datalines;
A 2023 1 0 60
A 2023 2 0 30
A 2024 1 0 60
A 2025 1 0 60
B 2023 1 0 25
B 2023 2 0 35
B 2024 1 0 45
B 2025 1 0 60
run;
data want;
set mohx_data;
by mohx_id mohx_year;
retain prv_nc .;
if first.mohx_year then prv_nc=ifn(first.mohx_id,.,lag(nc));
run;
I assume when there are multiple obs for a single mohx_year (and therefore multiple possible NC values), that you want to carry forward the NC value for the first of those multiple obs.
When you start a new YEAR remember the previous year's value.
if first.mohx_year then newvar=lag(NC);
retain newvar;
But when you start a new ID forget it again.
if first.mohx_id then call missing(newvar);
So data step will look like this:
data want;
set mohx_data;
by mohx_id mohx_year;
if first.mohx_year then newvar=lag(NC);
if first.mohx_id then call missing(newvar);
retain newvar;
run;
Results
Thanks everyone. All the solution here works. I wish i could mark all the answers as solution.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.