@DID wrote: Thank you Jim. How do I know which day SAS is seeing to know which "sameday" SAS is using? When I used the nodupkey option to drop duplicate categorical month/days in order to get the dataset the sample comes from, I don't know which of the days it dropped from among the numerical month/yrs.
What I would do is extract the months and years and create a new column that is a combination of year and month. I'm going to put some code below, and you can see me do that. Then do your Sort with a Nodupkey. After you've done your Sort Nodupkey, you don't care which day it is (again, see code, below), because we're going to
Extract the year and month
Concatenate the year and month into a YYMM field
Compare the YYMM from the previous row with what we actually get with the current row. When done this way, the day of the month does not matter.
Results:
Caveats: My code is not all of what you need, but you should be able to learn from it and be able to tweak it so that is exactly what you need. For example, this code will only catch one missing month. You'd actually have to create a loop for the case that you need to create more than one missing month. For example, if you have Jan, Feb, Mar, and May, this code will fill in Apr for you. But if you have Jan, Feb, Mar, and Jun, this code would fill in Apr but miss May. You'll need to adjust accordingly.
Take a look at the code. Try some of the techniques I'm giving you examples of, and then let's discuss further when questions or problems arise.
Jim
DATA Raw_Months;
DROP _:;
RETAIN _First_Time 1;
/* FORMAT o_monyrc DATE9.*/
/* o_moyr DATE9.*/
/* _Desired_Date DATE9.*/
/* _prev_o_monyrc DATE9.*/
/* _prev_o_moyr DATE9.*/
/* _save_o_monyrc DATE9.*/
/* _save_o_moyr DATE9.*/
/* ;*/
FORMAT o_monyrc MONYY7.
o_moyr MONYY7.
_Desired_Date MONYY7.
_prev_o_monyrc MONYY7.
_prev_o_moyr MONYY7.
_save_o_monyrc MONYY7.
_save_o_moyr MONYY7.
;
LENGTH
_o_monyrc $7
_o_moyr $7
tract $3
trctpop $4
crimecount $2
shrblk $4
numhhs $4
fmcn $2
ffhn $3
ffhd $3
educ8 $3
educ11 $3
;
LENGTH
_prev_o_monyrc 8.
_prev_o_moyr 8.
_prev_tract $3
_prev_trctpop $4
_prev_crimecount $2
_prev_shrblk $4
_prev_numhhs $4
_prev_fmcn $2
_prev_ffhn $3
_prev_ffhd $3
_prev_educ8 $3
_prev_educ11 $3
;
LENGTH
_save_o_monyrc 8.
_save_o_moyr 8.
_save_tract $3
_save_trctpop $4
_save_crimecount $2
_save_shrblk $4
_save_numhhs $4
_save_fmcn $2
_save_ffhn $3
_save_ffhd $3
_save_educ8 $3
_save_educ11 $3
;
RETAIN
_prev_o_monyrc
_prev_o_moyr
_prev_tract
_prev_trctpop
_prev_crimecount
_prev_shrblk
_prev_numhhs
_prev_fmcn
_prev_ffhn
_prev_ffhd
_prev_educ8
_prev_educ11
;
INFILE DATALINES4 DSD DLM='09'X;
INPUT
_o_monyrc $
_o_moyr $
tract $
trctpop $
crimecount $
shrblk $
numhhs $
fmcn $
ffhn $
ffhd $
educ8 $
educ11 $
;
o_monyrc = INPUT(_o_monyrc, ANYDTDTE7.);
o_moyr = INPUT(_o_moyr, ANYDTDTE7.);
IF _First_Time THEN
DO;
PUTLOG "&Nte1 Setting _First_Time to 0";
_First_Time = 0;
END;
ELSE
DO;
_Desired_Date = INTNX('MONTH', _prev_o_monyrc, 1, 'SAMEDAY');
_Desired_Month = MONTH(_Desired_Date);
_Desired_Year = YEAR(_Desired_Date);
_Actual_Month = MONTH(o_monyrc);
_Actual_Year = YEAR(o_monyrc);
_Actual_YYMM = (_Actual_Year * 100) + _Actual_Month;
_Desired_YYMM = (_Desired_Year * 100) + _Desired_Month;
PUTLOG "&Nte1 Desired Year is " _Desired_Year;
PUTLOG "&Nte1 Desired Month is " _Desired_Month;
PUTLOG "&Nte1 Desired YYMM is " _Desired_YYMM;
PUTLOG "&Nte1 Actual Year is " _Actual_Year;
PUTLOG "&Nte1 Actual Month is " _Actual_Month;
PUTLOG "&Nte1 Actual YYMM is " _Actual_YYMM;
IF _Actual_YYMM > _Desired_YYMM THEN
DO;
PUTLOG "&Nte1 Adding a missing month " _Desired_Month=;
_save_o_monyrc = o_monyrc ;
_save_o_moyr = o_moyr ;
_save_tract = tract ;
_save_trctpop = trctpop ;
_save_crimecount = crimecount ;
_save_shrblk = shrblk ;
_save_numhhs = numhhs ;
_save_fmcn = fmcn ;
_save_ffhn = ffhn ;
_save_ffhd = ffhd ;
_save_educ8 = educ8 ;
_save_educ11 = educ11 ;
o_monyrc = _Desired_Date ;
o_moyr = _Desired_Date ;
tract = _prev_tract ;
trctpop = _prev_trctpop ;
crimecount = '0' ;
shrblk = _prev_shrblk ;
numhhs = _prev_numhhs ;
fmcn = _prev_fmcn ;
ffhn = _prev_ffhn ;
ffhd = _prev_ffhd ;
educ8 = _prev_educ8 ;
educ11 = _prev_educ11 ;
OUTPUT;
o_monyrc = _save_o_monyrc ;
o_moyr = _save_o_moyr ;
tract = _save_tract ;
trctpop = _save_trctpop ;
crimecount = _save_crimecount ;
shrblk = _save_shrblk ;
numhhs = _save_numhhs ;
fmcn = _save_fmcn ;
ffhn = _save_ffhn ;
ffhd = _save_ffhd ;
educ8 = _save_educ8 ;
educ11 = _save_educ11 ;
END;
END;
_prev_o_monyrc = o_monyrc ;
_prev_o_moyr = o_moyr ;
_prev_tract = tract ;
_prev_trctpop = trctpop ;
_prev_crimecount = crimecount ;
_prev_shrblk = shrblk ;
_prev_numhhs = numhhs ;
_prev_fmcn = fmcn ;
_prev_ffhn = ffhn ;
_prev_ffhd = ffhd ;
_prev_educ8 = educ8 ;
_prev_educ11 = educ11 ;
OUTPUT;
DATALINES4;
JAN1990 JAN1990 305 2583 8 2493 1133 44 191 257 295 541
FEB1990 FEB1990 305 2583 4 2493 1133 44 191 257 295 541
MAR1990 MAR1990 305 2583 1 2493 1133 44 191 257 295 541
MAY1990 MAY1990 305 2583 3 2493 1133 44 191 257 295 541
JUN1990 JUN1990 305 2583 11 2493 1133 44 191 257 295 541
JUL1990 JUL1990 305 2583 7 2493 1133 44 191 257 295 541
AUG1990 AUG1990 305 2583 7 2493 1133 44 191 257 295 541
;;;;
RUN;
... View more