Hi Everyone,
As illustration purposes I have a dataset with the following two SAS year variables:
id start_year end_year
1 1991 1992
1 1992 1993
1 1993 1993
1 1993 1993
1 1993 1993
1 1994 1998
1 1998 2015
2 1993 1994
3 1985 1986
3 1990 1991
3 1992 2000
id=1 starts in 1991 and ends in 2015 with no gaps; same for id=2. But id=3 has a gap in 1987, 1988, and 1989
First, how can I track and know if there is a gap or no by id?
Second, how can I add the years gap by id to my data? For example,
id start_year end_year
1 1991 1992
1 1992 1993
1 1993 1993
1 1993 1993
1 1993 1993
1 1994 1998
1 1998 2015
2 1993 1994
3 1985 1986
3 1987 1987
3 1988 1988
3 1989 1989
3 1990 1991
3 1992 2000
Thank you very much
Hi.
Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:
data want;
set have;
drop _:;
* auxiliary vars;
_end_year=lag1(end_year);
_id=lag1(id);
output; * output row;
* cycle and fill the gap, if needed;
if _id eq id and start_year-1 ne _end_year then
do end_year=_end_year+1 to start_year-1;
start_year=end_year;
output;
put 'created row ' id= start_year= end_year=;
end;
run;
* order back data;
proc sort;
by id start_year;
run;
The lag function will get you the previous value, which is what you need to work the gaps.
More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...
Hope it helps.
Daniel Santos @ www.cgd.pt
Messy, but just leaving so no time:
data have; input id start_year end_year; datalines; 1 1991 1992 1 1992 1993 1 1993 1993 1 1993 1993 1 1993 1993 1 1994 1998 1 1998 2015 2 1993 1994 3 1985 1986 3 1990 1991 3 1992 2000 ; run; data want; set have; by id; retain lst_year; if first.id then do; syear=start_year; eyear=end_year; lst_year=end_year; output; end; else do; if start_year ne lst_year and start_year-1 ne lst_year then do; do i=lst_year to start_year; syear=i; eyear=i; output; end; end; else do; syear=start_year; eyear=end_year; output; end; end; lst_year=end_year; run;
As noted, I have left for the day now. Try:
do i=lst_year to start_year;
Replace that with:
do i=(lst_year+1) to (start_year-1);
This solution nicely identifies under column i the year(s) in which there is a gap, but it does not add the row in the data for the gap as I indicated in red color in my original rpost.
Thanks a lot! What I can do to add it the row(s) attached to the year gap(s)? Thanks again
Techniques in this paper may help you getting your desired outcome
http://support.sas.com/resources/papers/proceedings10/103-2010.pdf
Ahmed
Hi.
Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:
data want;
set have;
drop _:;
* auxiliary vars;
_end_year=lag1(end_year);
_id=lag1(id);
output; * output row;
* cycle and fill the gap, if needed;
if _id eq id and start_year-1 ne _end_year then
do end_year=_end_year+1 to start_year-1;
start_year=end_year;
output;
put 'created row ' id= start_year= end_year=;
end;
run;
* order back data;
proc sort;
by id start_year;
run;
The lag function will get you the previous value, which is what you need to work the gaps.
More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...
Hope it helps.
Daniel Santos @ www.cgd.pt
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.