Hi, everyone.
I have a dataset with 3 columns, as shown above. The first column, COD_SEG, represents the internal code for each person. The second column, DT_OCORR, represents the date of a specific event (format yymmd7.). The last column,, CIA_ESCOLHIDA, provides a reference for the corresponding COD_SEG and DT_OCORR.
My date variable (DT_OCORR) is skipping (jumping) months. I need to insert the missing dates for each COD_SEG. When doing this, I need to fill the other 2 columns with the information from the previous row. Like this.
I have:
COD_SEG | DT_OCORR | CIA_ESCOLHIDA |
37192 | 2021-08 | 1806 |
37192 | 2022-02 | 1806 |
37192 | 2022-03 | 1806 |
37192 | 2022-04 | 1806 |
37192 | 2022-06 | 1806 |
37192 | 2022-07 | 1806 |
37192 | 2022-08 | 1806 |
37192 | 2022-09 | 1806 |
37192 | 2022-10 | 2110 |
37192 | 2022-12 | 2110 |
37192 | 2023-01 | 2110 |
I need (I have bolded the lines that need to be inserted):
COD_SEG | DT_OCORR | CIA_ESCOLHIDA |
37192 | 2021-08 | 1806 |
37192 | 2021-09 | 1806 |
37192 | 2021-10 | 1806 |
37192 | 2021-11 | 1806 |
37192 | 2021-12 | 1806 |
37192 | 2022-01 | 1806 |
37192 | 2022-02 | 1806 |
37192 | 2022-03 | 1806 |
37192 | 2022-04 | 1806 |
37192 | 2022-05 | 1806 |
37192 | 2022-06 | 1806 |
37192 | 2022-07 | 1806 |
37192 | 2022-08 | 1806 |
37192 | 2022-09 | 1806 |
37192 | 2022-10 | 2110 |
37192 | 2022-11 | 2110 |
37192 | 2022-12 | 2110 |
37192 | 2023-01 | 2110 |
My database have so many rows. I need an way to do this for a lot of records.
Anyone can help me? I'm using SAS Enterprise Guide.
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc timeseries data = have out = want;
by COD_SEG;
id DT_OCORR interval = month
setmiss = prev;
var CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
run;
Result:
COD_SEG DT_OCORR CIA_ESCOLHIDA 37192 2021-08 1806 37192 2021-09 1806 37192 2021-10 1806 37192 2021-11 1806 37192 2021-12 1806 37192 2022-01 1806 37192 2022-02 1806 37192 2022-03 1806 37192 2022-04 1806 37192 2022-05 1806 37192 2022-06 1806 37192 2022-07 1806 37192 2022-08 1806 37192 2022-09 1806 37192 2022-10 2110 37192 2022-11 2110 37192 2022-12 2110 37192 2023-01 2110
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR :yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc sort data=have out=temp;
by COD_SEG DT_OCORR;
run;
data want;
merge temp temp(firstobs=2 keep=COD_SEG DT_OCORR rename=(COD_SEG=_COD_SEG DT_OCORR=_DT_OCORR));
output;
if COD_SEG=_COD_SEG then do;
do i=1 to intck('month',DT_OCORR,_DT_OCORR)-1;
DT_OCORR=intnx('month',DT_OCORR,1);output;
end;
end;
drop i _COD_SEG _DT_OCORR;
run;
data have;
infile cards expandtabs truncover;
input COD_SEG DT_OCORR :anydtdte. CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
cards;
37192 2021-08 1806
37192 2022-02 1806
37192 2022-03 1806
37192 2022-04 1806
37192 2022-06 1806
37192 2022-07 1806
37192 2022-08 1806
37192 2022-09 1806
37192 2022-10 2110
37192 2022-12 2110
37192 2023-01 2110
;
proc timeseries data = have out = want;
by COD_SEG;
id DT_OCORR interval = month
setmiss = prev;
var CIA_ESCOLHIDA;
format DT_OCORR yymmd7.;
run;
Result:
COD_SEG DT_OCORR CIA_ESCOLHIDA 37192 2021-08 1806 37192 2021-09 1806 37192 2021-10 1806 37192 2021-11 1806 37192 2021-12 1806 37192 2022-01 1806 37192 2022-02 1806 37192 2022-03 1806 37192 2022-04 1806 37192 2022-05 1806 37192 2022-06 1806 37192 2022-07 1806 37192 2022-08 1806 37192 2022-09 1806 37192 2022-10 2110 37192 2022-11 2110 37192 2022-12 2110 37192 2023-01 2110
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.