SAS Programming

Abimal_Zippi · Posted 11-22-2017 02:08 AM

From the data mentioned below, I want to generate id’s that had a flag=1 for two or more consecutive months. As described below, some id's had a flag=1 for 12 consecutive months, only 6 consecutive months, 4 consecutive months, and finally only a month. Also some id's had a flag=1 but not consecutively, for instance at month 12 of 2013 and month 2 of 2014. The output I want is only those ID’s who had a flag=1 for two or more consecutive months.

I really appreciate your help.

Id year month flag

1 2013 7 1

1 2013 8 1

1 2013 9 1

1 2013 10 1

1 2013 11 1

1 2013 12 1

1 2014 1 1

1 2014 2 1

1 2014 3 1

1 2014 4 1

1 2014 5 1

1 2014 6 1

2 2013 8 1

2 2013 9 1

2 2013 10 1

2 2013 11 1

2 2013 12 1

2 2014 1 1

3 2014 2 1

3 2014 3 1

3 2014 4 1

3 2014 5 1

4 2013 12 1

4 2014 1 1

5 2013 9 1

5 2014 3 1

6 2013 12 1

6 2014 2 1

7 2014 2 1

8 2013 7 1

8 2013 8 1

Output

Id

1

2

3

4

8

Kurt_Bremser · Posted 11-22-2017 02:24 AM

Make use of the fine SAS date functions by converting your time data to SAS date values, and use the lag() function:

data have;
input id year month flag;
cards;
1   2013    7        1
1   2013    8        1
1   2013     9        1
1   2013     10       1
1 2013     11        1
1 2013       12        1
1 2014       1          1
1 2014       2          1
1 2014       3          1
1 2014         4        1
1 2014         5          1
1 2014         6          1
2   2013      8        1
2   2013      9        1
2   2013      10       1
2 2013     11        1
2 2013       12        1
2 2014       1          1
3 2014       2          1
3 2014       3          1
3 2014       4          1
3 2014         5          1
4 2013       12          1
4 2014       1          1
5 2013       9          1
5 2014       3        1
6 2013       12          1
6 2014     2         1
7 2014    2          1
8 2013    7          1
8 2013    8          1
;
run;

/* first, let's create SAS date values from year and month */
data int;
set have;
if flag = 1; * just to be sure;
period = mdy(month,1,year);
keep id period;
run;

/* now check if two observations within a group are only 1 month apart */
data want (keep=id);
set int;
by id period; * period to make sure that dataset is correctly ordered;
if not first.id and intck('month',lag(period),period) = 1; * this is why we created SAS date values;
run;

/* remove duplicates */
proc sort data=want nodupkey;
by id;
run;

Note how I presented your data in a data step; this enables everyone reading the post to recreate the dataset with a simple copy/paste and submit, no tedious typing from screen or reformatting needed. And it makes sure that the dataset hase the intended structure.

Also note how code looks when using the correct posting sub-window ("little running man" or {i} buttons).

Edit: fixed typo ("data functions" -> "date functions" in first line)

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

PeterClemmensen · Posted 11-22-2017 02:23 AM

Something like this

data have;
input Id$ year month flag;
datalines;
1 2013 7 1
1 2013 8 1
1 2013 9 1
1 2013 10 1
1 2013 11 1
1 2013 12 1
1 2014 1 1
1 2014 2 1
1 2014 3 1
1 2014 4 1
1 2014 5 1
1 2014 6 1
2 2013 8 1
2 2013 9 1
2 2013 10 1
2 2013 11 1
2 2013 12 1
2 2014 1 1
3 2014 2 1
3 2014 3 1
3 2014 4 1
3 2014 5 1
4 2013 12 1
4 2014 1 1
5 2013 9 1
5 2014 3 1
6 2013 12 1
6 2014 2 1
7 2014 2 1
8 2013 7 1
8 2013 8 1
;

proc sort data=have;
	by ID year month;
run;

data want(keep=ID);
	set have;
	by ID year month;

	dif=dif(month);
	if first.ID then outflag=0;
	else if flag=1 and dif in(1,-11) and outflag=0 then outflag=1;

	retain outflag 0;
	if last.ID and outflag=1;
run;

The DATA to DATA Step Macro
Blog: SASnrd

Kurt_Bremser · Posted 11-22-2017 02:28 AM

I think you also need to check for the difference in years when dif = -11, or you might stumble over a sequence

year month
2012    12
2014     1

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Abimal_Zippi · Posted 11-22-2017 02:38 AM

That is true, the main challenge to me was to make sure those who had the flag between month 12 and 1 are included. I will take care of this note. Thank you so much

Kurt_Bremser · Posted 11-22-2017 03:04 AM

@tesfuc wrote:

That is true, the main challenge to me was to make sure those who had the flag between month 12 and 1 are included. I will take care of this note. Thank you so much

That is why I recommend to use SAS date/datetime/time values wherever possible. As soon as you have them, usage of intck/intnx takes care of all the quirks of the date and time measuring systems (irregular days/month, non-decimal fractional notations like 60sec/min or 12months/year)

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

PeterClemmensen · Posted 11-22-2017 02:45 AM

@Kurt_Bremser ah yes, did not think of that 🙂 Right as usual.

@Abimal_Zippi thank you, but I recommend that you go with Kurts solution

The DATA to DATA Step Macro
Blog: SASnrd

Kurt_Bremser · Posted 11-22-2017 02:24 AM

Make use of the fine SAS date functions by converting your time data to SAS date values, and use the lag() function:

data have;
input id year month flag;
cards;
1   2013    7        1
1   2013    8        1
1   2013     9        1
1   2013     10       1
1 2013     11        1
1 2013       12        1
1 2014       1          1
1 2014       2          1
1 2014       3          1
1 2014         4        1
1 2014         5          1
1 2014         6          1
2   2013      8        1
2   2013      9        1
2   2013      10       1
2 2013     11        1
2 2013       12        1
2 2014       1          1
3 2014       2          1
3 2014       3          1
3 2014       4          1
3 2014         5          1
4 2013       12          1
4 2014       1          1
5 2013       9          1
5 2014       3        1
6 2013       12          1
6 2014     2         1
7 2014    2          1
8 2013    7          1
8 2013    8          1
;
run;

/* first, let's create SAS date values from year and month */
data int;
set have;
if flag = 1; * just to be sure;
period = mdy(month,1,year);
keep id period;
run;

/* now check if two observations within a group are only 1 month apart */
data want (keep=id);
set int;
by id period; * period to make sure that dataset is correctly ordered;
if not first.id and intck('month',lag(period),period) = 1; * this is why we created SAS date values;
run;

/* remove duplicates */
proc sort data=want nodupkey;
by id;
run;

Note how I presented your data in a data step; this enables everyone reading the post to recreate the dataset with a simple copy/paste and submit, no tedious typing from screen or reformatting needed. And it makes sure that the dataset hase the intended structure.

Also note how code looks when using the correct posting sub-window ("little running man" or {i} buttons).

Edit: fixed typo ("data functions" -> "date functions" in first line)

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Abimal_Zippi · Posted 11-23-2017 01:42 AM

Dear KurtBremser

I would like to try a different scenario, i.e. if there is a flag for three or more consecutive months, what changes should I made in the code.

From the above dataset, the desired output will be -

Id

1

2

3

Because all have flag=1 for three or more consecutive months.

I really appreciate your help again.

Kurt_Bremser · Posted 11-23-2017 02:45 AM

While the lag() functions keeps a 1-stage FIFO chain, the lag2() function keeps 2 stages. With it you can "look back" for 2 observations.

When you now retain a counter variable that is set to 1 at first.id and incremented otherwise, you can expand my condition to check for 2 observations back when counter > 2.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Abimal_Zippi · Posted 11-23-2017 03:17 AM

Thank you so much,

I just used this code but couldn't get with lag2() function...

data want_2 (keep=id);

set int;

by id period;

if first.id then cnt=0;

if not first.id and intck('month',lag(period),period)= 1 then cnt+1;

retain cnt 0;

if cnt ge 2;

run;

Kurt_Bremser · Posted 11-23-2017 03:23 AM

Don't think so complicated:

data want (keep=id);
set int;
by id period;
retain counter;
if first.id
then counter = 1;
else counter + 1;
if counter > 2 and intck('month',lag(period),period) = 1 and intck('month',lag2(period),period) = 2;
run;

Call this

retain counter;
if first.id
then counter = 1;
else counter + 1;

a "counter block"; you will use if often when doing by-group processing in a data step.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Abimal_Zippi · Posted 11-23-2017 06:20 PM

I really appreciate your help. That is perfect.

SAS Programming

Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

Re: Consecutive months

count consecutive months

counting consecutive months in sas

Consecutive Data

Get consecutive months using Date Prompt

Getting the month name

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...