Hello SAS experts,
I have already asked a similar question previously.
My previous question is how to fill in missing year and create a dummy=0 by id.
But if I would like to add one more criteria: for those who only have 1 record id, giving the number variable as "zero" if it didn't have any following record until the observed year.
For example, id=d, it only has 2016 record. I would like to have it has dummy variable until the end of the observed year (2020).
Can anyone give me a hand?
Many thanks in advance.
DATA HAVE; INPUT YEAR ID $ NUMBER; CARDS; 2017 a 1 2018 a 1 2020 a 1 2016 b 1 2020 b 1 2018 c 1 2020 c 1
2016 d 1
2018 e 1 ; RUN;
I wish my final data look like below. Many thanks for the advice in advance!
2017 a 1
2018 a 1
2019 a 0
2020 a 1
2016 b 1
2017 b 0
2018 b 0
2019 b 0
2020 b 1
2018 c 1
2019 c 0
2020 c 1
2016 d 1
2017 d 0
2018 d 0
2019 d 0
2020 d 0
2018 e 1
2019 e 0
2019 e 0
2020 e 0
proc freq data=have noprint;
table year*id / out=want sparse;
weight number;
run;
Does this work or give you too many categories? If it's too much then you need to use PRELOADFMT option instead typically. You can probably find a workaround as it's numeric/year.
https://www.lexjansen.com/nesug/nesug11/cc/cc29.pdf
The pseudo code is this:
1) make a one-variable data set with variable YEAR containing all consecutive years just once per year
2) left join it to your dataset by year (this will create gaps in ID and NUMBER for the added observations)
3) use Last Observation Carry Forward (LOCF) on ID to fill in the gaps
4) if NUMBER is missing set it to zero.
@pink_poodle Thank you so much for the advice. I will try it tomorrow!!
@Reeza , Thank you so much for the advice! The following codes work but create number=0 for those years before index year.
Since I am really at the beginner level of SAS, PRELOADFMT seems hard for me to fit in the solution.
However, what I can think of is: create a series number by id in the have data file, run the program you advice, and then delete those series number are missing.
Again, thank you so much. I feel this SAS forum is super supportive!!
Use a self-merge with lookahead:
DATA HAVE;
INPUT YEAR ID $ NUMBER;
CARDS;
2017 a 1
2018 a 1
2020 a 1
2016 b 1
2020 b 1
2018 c 1
2020 c 1
2016 d 1
2018 e 1
RUN;
data want (drop=nxt_:);
merge have
have (firstobs=2 keep=year id rename=(year=nxt_yr id=nxt_id));
if nxt_id^=id then nxt_yr=2021;
do year=year to nxt_yr-1;
output;
number=0;
end;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.