BookmarkSubscribeRSS Feed
chimei0403
Obsidian | Level 7

Hello SAS experts, 

I have already asked a similar question previously.

My previous question is how to fill in missing year and create a dummy=0 by id.

But if I would like to add one more criteria: for those who only have 1 record id, giving the number variable as "zero" if it didn't have any following record until the observed year.  

For example, id=d, it only has 2016 record. I would like to have it has dummy variable until the end of the observed year (2020).

Can anyone give me a hand?

Many thanks in advance. 

 

 

DATA HAVE;
	INPUT  YEAR ID $ NUMBER;
    CARDS;
 2017 a   1
 2018 a   1
 2020 a   1
 2016 b   1
 2020 b   1
 2018 c   1
 2020 c   1
2016 d 1
2018 e 1 ; RUN;

 

I wish my final data look like below. Many thanks for the advice in advance!

 

2017    a    1
2018    a    1
2019    a    0
2020    a    1
2016    b    1
2017    b    0
2018    b    0
2019    b    0
2020    b    1
2018    c    1
2019    c    0
2020    c    1

2016    d    1

2017    d    0

2018    d    0

2019   d    0 

2020  d     0

2018   e   1

2019   e   0

2019   e   0

2020  e    0   

5 REPLIES 5
Reeza
Super User
proc freq data=have noprint;
table year*id / out=want sparse;
weight number;
run;

Does this work or give you too many categories? If it's too much then you need to use PRELOADFMT option instead typically. You can probably find a workaround as it's numeric/year. 

 

https://www.lexjansen.com/nesug/nesug11/cc/cc29.pdf

 

 

pink_poodle
Barite | Level 11

The pseudo code is this:
1) make a one-variable data set with variable YEAR containing all consecutive years just once per year
2) left join it to your dataset by year (this will create gaps in ID and NUMBER for the added observations)
3) use Last Observation Carry Forward (LOCF) on ID to fill in the gaps
4) if NUMBER is missing set it to zero.

chimei0403
Obsidian | Level 7

@pink_poodle Thank you so much for the advice. I will try it tomorrow!!

chimei0403
Obsidian | Level 7

@Reeza , Thank you so much for the advice! The following codes work but create number=0 for those years before index year.

Since I am really at the beginner level of SAS, PRELOADFMT seems hard for me to fit in the solution.

However, what I can think of is: create a series number by id in the have data file, run the program you advice, and then delete those series number are missing.

Again, thank you so much. I feel this SAS forum is super supportive!!

mkeintz
PROC Star

Use a self-merge with lookahead:

 

DATA HAVE;
	INPUT  YEAR ID $ NUMBER;
    CARDS;
 2017 a   1
 2018 a   1
 2020 a   1
 2016 b   1
 2020 b   1
 2018 c   1
 2020 c   1
 2016 d   1
 2018 e   1
RUN;
data want (drop=nxt_:);
  merge have
        have (firstobs=2 keep=year id rename=(year=nxt_yr id=nxt_id));
  if nxt_id^=id then nxt_yr=2021;
  do year=year to nxt_yr-1;
    output;
    number=0;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1326 views
  • 1 like
  • 4 in conversation