## tracking years across two columns by id

Solved
Occasional Contributor
Posts: 9

# tracking years across two columns by id

[ Edited ]

Hi Everyone,

As illustration purposes I have a dataset with the following two SAS year variables:

id     start_year     end_year

1      1991            1992

1      1992            1993

1      1993            1993

1      1993            1993

1      1993            1993

1      1994            1998

1      1998            2015

2      1993            1994

3      1985            1986

3      1990            1991

3      1992            2000

id=1 starts in 1991 and ends in 2015 with no gaps; same for id=2. But id=3 has a gap in 1987, 1988, and 1989

First, how can I track and know if there is a gap or no by id?

Second, how can I add the years gap by id to my data? For example,

id     start_year     end_year

1      1991            1992

1      1992            1993

1      1993            1993

1      1993            1993

1      1993            1993

1      1994            1998

1      1998            2015

2      1993            1994

3      1985            1986

3      1987            1987

3      1988            1988

3      1989            1989

3      1990            1991

3      1992            2000

Thank you very much

Accepted Solutions
Solution
‎01-13-2017 01:16 PM
Super Contributor
Posts: 474

## Re: tracking years across two columns by id

[ Edited ]

Hi.

Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:

`data want;     set have;     drop _:;     * auxiliary vars;     _end_year=lag1(end_year);     _id=lag1(id);     output; * output row;     * cycle and fill the gap, if needed;     if _id eq id and start_year-1 ne _end_year then     do end_year=_end_year+1 to start_year-1;        start_year=end_year;        output;        put 'created row ' id= start_year= end_year=;     end;run;* order back data;proc sort;     by id start_year;run;`

The lag function will get you the previous value, which is what you need to work the gaps.

More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...

Hope it helps.

Daniel Santos @ www.cgd.pt

All Replies
Super User
Posts: 9,400

## Re: tracking years across two columns by id

Messy, but just leaving so no time:

```data have;
input id     start_year     end_year;
datalines;
1      1991            1992
1      1992            1993
1      1993            1993
1      1993            1993
1      1993            1993
1      1994            1998
1      1998            2015
2      1993            1994
3      1985            1986
3      1990            1991
3      1992            2000
;
run;

data want;
set have;
by id;
retain lst_year;
if first.id then do;
syear=start_year;
eyear=end_year;
lst_year=end_year;
output;
end;
else do;
if start_year ne lst_year and start_year-1 ne lst_year then do;
do i=lst_year to start_year;
syear=i;
eyear=i;
output;
end;
end;
else do;
syear=start_year;
eyear=end_year;
output;
end;
end;
lst_year=end_year;
run;```
Occasional Contributor
Posts: 9

## Re: tracking years across two columns by id

Hi RW9,

Your code kind of works but not precisely.

If you see your want table your code assigns a value of 1990 to obs. 14 for id = 3 when actually there is no a gap issue for that row. So, it should be a period value.

On the other hand, the original data for id=3 has a gap for years 1987,1988 and 1989. Hence, your code should add three new rows for these three years. However, it adds five rows. In fact, what I would expect based on my question below is that row 10 is 1987, row 11 is 1988, and row 12 is 1989, with the remaining two rows for what it was.

Do you any any solutions for these two problems? Thank you
Super User
Posts: 9,400

## Re: tracking years across two columns by id

As noted, I have left for the day now.  Try:

```      do i=lst_year to start_year;
```

Replace that with:

```      do i=(lst_year+1) to (start_year-1);
```

Occasional Contributor
Posts: 9

## Re: tracking years across two columns by id

[ Edited ]

This solution nicely identifies under column i the year(s) in which there is a gap, but it does not add the row in the data for the gap as I indicated in red color in my original rpost.

Thanks a lot! What I can do to add it the row(s) attached to the year gap(s)? Thanks again

Super Contributor
Posts: 274

## Re: tracking years across two columns by id

http://support.sas.com/resources/papers/proceedings10/103-2010.pdf

Ahmed

Solution
‎01-13-2017 01:16 PM
Super Contributor
Posts: 474

## Re: tracking years across two columns by id

[ Edited ]

Hi.

Assuming data is previously ordered by id and start_year and there are no overlaping periods, here's another way:

`data want;     set have;     drop _:;     * auxiliary vars;     _end_year=lag1(end_year);     _id=lag1(id);     output; * output row;     * cycle and fill the gap, if needed;     if _id eq id and start_year-1 ne _end_year then     do end_year=_end_year+1 to start_year-1;        start_year=end_year;        output;        put 'created row ' id= start_year= end_year=;     end;run;* order back data;proc sort;     by id start_year;run;`

The lag function will get you the previous value, which is what you need to work the gaps.

More on the lag function here: https://support.sas.com/documentation/cdl/en/lefunctionsref/69762/HTML/default/viewer.htm#n0l66p5oqe...

Hope it helps.

Daniel Santos @ www.cgd.pt

☑ This topic is solved.