Hi, everyone.
I have a dataset with 3 columns, as shown above. The first column, COD_SEG, represents the internal code for each person. The second column, DT_OCORR, represents the date of a specific event (format yymmd7.). The last column,, CIA_ESCOLHIDA, provides a reference for the corresponding COD_SEG and DT_OCORR.
My date variable (DT_OCORR) is skipping (jumping) months. I need to insert the missing dates for each COD_SEG. When doing this, I need to fill the other 2 columns with the information from the previous row. Like this.
I have:
COD_SEG | DT_OCORR | CIA_ESCOLHIDA |
37192 | 2021-08 | 1806 |
37192 | 2022-02 | 1806 |
37192 | 2022-03 | 1806 |
37192 | 2022-04 | 1806 |
37192 | 2022-06 | 1806 |
37192 | 2022-07 | 1806 |
37192 | 2022-08 | 1806 |
37192 | 2022-09 | 1806 |
37192 | 2022-10 | 2110 |
37192 | 2022-12 | 2110 |
37192 | 2023-01 | 2110 |
I need (I have bolded the lines that need to be inserted):
COD_SEG | DT_OCORR | CIA_ESCOLHIDA |
37192 | 2021-08 | 1806 |
37192 | 2021-09 | 1806 |
37192 | 2021-10 | 1806 |
37192 | 2021-11 | 1806 |
37192 | 2021-12 | 1806 |
37192 | 2022-01 | 1806 |
37192 | 2022-02 | 1806 |
37192 | 2022-03 | 1806 |
37192 | 2022-04 | 1806 |
37192 | 2022-05 | 1806 |
37192 | 2022-06 | 1806 |
37192 | 2022-07 | 1806 |
37192 | 2022-08 | 1806 |
37192 | 2022-09 | 1806 |
37192 | 2022-10 | 2110 |
37192 | 2022-11 | 2110 |
37192 | 2022-12 | 2110 |
37192 | 2023-01 | 2110 |
My database have so many rows. I need an way to do this for a lot of records.
Anyone can help me? I'm using SAS Enterprise Guide.
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc timeseries data = have out = want;
by COD_SEG;
id DT_OCORR interval = month
setmiss = prev;
var CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
run;
Result:
COD_SEG DT_OCORR CIA_ESCOLHIDA 37192 2021-08 1806 37192 2021-09 1806 37192 2021-10 1806 37192 2021-11 1806 37192 2021-12 1806 37192 2022-01 1806 37192 2022-02 1806 37192 2022-03 1806 37192 2022-04 1806 37192 2022-05 1806 37192 2022-06 1806 37192 2022-07 1806 37192 2022-08 1806 37192 2022-09 1806 37192 2022-10 2110 37192 2022-11 2110 37192 2022-12 2110 37192 2023-01 2110
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR :yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc sort data=have out=temp;
by COD_SEG DT_OCORR;
run;
data want;
merge temp temp(firstobs=2 keep=COD_SEG DT_OCORR rename=(COD_SEG=_COD_SEG DT_OCORR=_DT_OCORR));
output;
if COD_SEG=_COD_SEG then do;
do i=1 to intck('month',DT_OCORR,_DT_OCORR)-1;
DT_OCORR=intnx('month',DT_OCORR,1);output;
end;
end;
drop i _COD_SEG _DT_OCORR;
run;
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc timeseries data = have out = want;
by COD_SEG;
id DT_OCORR interval = month
setmiss = prev;
var CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
run;
Result:
COD_SEG DT_OCORR CIA_ESCOLHIDA 37192 2021-08 1806 37192 2021-09 1806 37192 2021-10 1806 37192 2021-11 1806 37192 2021-12 1806 37192 2022-01 1806 37192 2022-02 1806 37192 2022-03 1806 37192 2022-04 1806 37192 2022-05 1806 37192 2022-06 1806 37192 2022-07 1806 37192 2022-08 1806 37192 2022-09 1806 37192 2022-10 2110 37192 2022-11 2110 37192 2022-12 2110 37192 2023-01 2110
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.