Hi Everyone,
As illustration purposes I have a dataset with the following two SAS year variables:
id start_year end_year
1 1991 1992
1 1992 1993
1 1993 1993
1 1993 1993
1 1993 1993
1 1994 1998
1 1998 2015
2 1993 1994
3 1985 1986
3 1990 1991
3 1992 2000
id=1 starts in 1991 and ends in 2015 with no gaps; same for id=2. But id=3 has a gap in 1987, 1988, and 1989
First, how can I track and know if there is a gap or no by id?
Second, how can I add the years gap by id to my data? For example,
id start_year end_year
1 1991 1992
1 1992 1993
1 1993 1993
1 1993 1993
1 1993 1993
1 1994 1998
1 1998 2015
2 1993 1994
3 1985 1986
3 1987 1987
3 1988 1988
3 1989 1989
3 1990 1991
3 1992 2000
Thank you very much
Hi.
Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:
data want;
set have;
drop _:;
* auxiliary vars;
_end_year=lag1(end_year);
_id=lag1(id);
output; * output row;
* cycle and fill the gap, if needed;
if _id eq id and start_year-1 ne _end_year then
do end_year=_end_year+1 to start_year-1;
start_year=end_year;
output;
put 'created row ' id= start_year= end_year=;
end;
run;
* order back data;
proc sort;
by id start_year;
run;
The lag function will get you the previous value, which is what you need to work the gaps.
More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...
Hope it helps.
Daniel Santos @ www.cgd.pt
Messy, but just leaving so no time:
data have; input id start_year end_year; datalines; 1 1991 1992 1 1992 1993 1 1993 1993 1 1993 1993 1 1993 1993 1 1994 1998 1 1998 2015 2 1993 1994 3 1985 1986 3 1990 1991 3 1992 2000 ; run; data want; set have; by id; retain lst_year; if first.id then do; syear=start_year; eyear=end_year; lst_year=end_year; output; end; else do; if start_year ne lst_year and start_year-1 ne lst_year then do; do i=lst_year to start_year; syear=i; eyear=i; output; end; end; else do; syear=start_year; eyear=end_year; output; end; end; lst_year=end_year; run;
As noted, I have left for the day now. Try:
do i=lst_year to start_year;
Replace that with:
do i=(lst_year+1) to (start_year-1);
This solution nicely identifies under column i the year(s) in which there is a gap, but it does not add the row in the data for the gap as I indicated in red color in my original rpost.
Thanks a lot! What I can do to add it the row(s) attached to the year gap(s)? Thanks again
Techniques in this paper may help you getting your desired outcome
http://support.sas.com/resources/papers/proceedings10/103-2010.pdf
Ahmed
Hi.
Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:
data want;
set have;
drop _:;
* auxiliary vars;
_end_year=lag1(end_year);
_id=lag1(id);
output; * output row;
* cycle and fill the gap, if needed;
if _id eq id and start_year-1 ne _end_year then
do end_year=_end_year+1 to start_year-1;
start_year=end_year;
output;
put 'created row ' id= start_year= end_year=;
end;
run;
* order back data;
proc sort;
by id start_year;
run;
The lag function will get you the previous value, which is what you need to work the gaps.
More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...
Hope it helps.
Daniel Santos @ www.cgd.pt
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.