DATA Step, Macro, Functions and more

Identifymultiple variables having same value for consecutive months

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 16
Accepted Solution

Identifymultiple variables having same value for consecutive months

[ Edited ]

Hi Team,

I want to identify id's having a flag=1 for two or more of the following variables in two consecutive months. The thing here is there should at least be two or more variables with flag=1, in any combination (could be var 1 and 2 or var 1 and 6 or var 1, 5, 7...), and should be present in two consecutive months. Say in the following dataset id 1 have flag=1 for var5, var6 and var7 in the 7th month, and var6 and var7 in 8th month ..hence id 1 should be present in the output.  Just one episode of such scenario is enough to be in the output dataset, disregard of the value in the remaining months.

dataset;

id   year month    var1 var2 var3 var 4 var5 var6 var7  var..........multiple

1      2012   7        0      0      0      0     1         1    1 

1      2012   8       0      0       0     0      0        1    1  

1      2012   9       1     0        0     0     0       0      0  

1      2012   10     0     0        1     0    0       0       0  

1      2012     11   0     1        0     1    0       1       0  

1      2012     12   0     0        0     0    0      0       1 

1      2013      1    0     0        1     1    1      0       0

2      2012    7        0     0       0     0    0      0       0

2      2012    8       0      0        0    0     0     0       0

2      2012    9       0     0        0     0     0     0      0

2      2012   10     0     0        0    0    0       0       0

2      2012     11   0     0        0    0    0       0       0

2      2012     12   0     0        0     1    0      1      0

2     2013      1    0     0        1      1    1      0       0

2      2013     3    1     1       0       0    0     0      0

3      2012    7        0     0       0     0    1      1       1

3      2012    8       0      0        1    1     0     0       0

3      2012    9       0     0        0     0     0     0      0

3      2012   10     0     0        1    0    0       0       0

4      2012     11   0     1        0    0    0       1       0

4      2012     12   0     0        0     0    0      0       0

4      2013      1    1    0        0      0    0     1       0

4    2013        4    0    0         1      1    0     0      0

 

Output;

id

1

2

3


Accepted Solutions
Solution
‎11-28-2017 05:49 PM
PROC Star
Posts: 1,190

Re: Identifymultiple variables having same value for consecutive months

Posted in reply to eshty_tes

ok. Makes sense then.

 

Do something like this

 

data have;
input id$ year month var1-var7;
datalines;
1 2012 7  0 0 0 0 1 1 1
1 2012 8  0 0 0 0 0 1 1
1 2012 9  1 0 0 0 0 0 0
1 2012 10 0 0 1 0 0 0 0
1 2012 11 0 1 0 1 0 1 0
1 2012 12 0 0 0 0 0 0 1
1 2013 1  0 0 1 1 1 0 0
2 2012 7  0 0 0 0 0 0 0
2 2012 8  0 0 0 0 0 0 0
2 2012 9  0 0 0 0 0 0 0
2 2012 10 0 0 0 0 0 0 0
2 2012 11 0 0 0 0 0 0 0
2 2012 12 0 0 0 1 0 1 0
2 2013 1  0 0 1 1 1 0 0
2 2013 3  1 1 0 0 0 0 0
3 2012 7  0 0 0 0 1 1 1
3 2012 8  0 0 0 0 0 0 0
3 2012 9  0 0 0 0 0 0 0
3 2012 10 0 0 1 0 0 0 0
4 2012 11 0 1 0 0 0 1 0
4 2012 12 0 0 0 0 0 0 0
4 2013 1  1 0 0 0 0 1 0
4 2013 4  0 0 1 1 0 0 0
;

proc sort data=have;
	by id year month;
run;

data want;
	set have;
	by id;

	array vars{*} var1-var7;

	monthsbetween = intck('month', lag(mdy(month, 1, year)), mdy(month, 1, year));
	if first.id then do;
		monthsbetween=.;
		outflag=0;
	end;

	if sum(of vars[*])>1 then twoormore=1;
	else twoormore=0;

	if monthsbetween=1 and twoormore=1 and lag(twoormore)=1 then outflag=1;

	retain outflag;
	keep id;
	if last.id and outflag=1;
run;

View solution in original post


All Replies
PROC Star
Posts: 1,190

Re: Identifymultiple variables having same value for consecutive months

Posted in reply to eshty_tes

I don't see how id=3 meets these criteria from your sample data? it has flag=1 for more than one variable in year 2012 month 7 only?

Occasional Contributor
Posts: 16

Re: Identifymultiple variables having same value for consecutive months

[ Edited ]

Thanks,

That is true, I missed the flag, will correct it now

Solution
‎11-28-2017 05:49 PM
PROC Star
Posts: 1,190

Re: Identifymultiple variables having same value for consecutive months

Posted in reply to eshty_tes

ok. Makes sense then.

 

Do something like this

 

data have;
input id$ year month var1-var7;
datalines;
1 2012 7  0 0 0 0 1 1 1
1 2012 8  0 0 0 0 0 1 1
1 2012 9  1 0 0 0 0 0 0
1 2012 10 0 0 1 0 0 0 0
1 2012 11 0 1 0 1 0 1 0
1 2012 12 0 0 0 0 0 0 1
1 2013 1  0 0 1 1 1 0 0
2 2012 7  0 0 0 0 0 0 0
2 2012 8  0 0 0 0 0 0 0
2 2012 9  0 0 0 0 0 0 0
2 2012 10 0 0 0 0 0 0 0
2 2012 11 0 0 0 0 0 0 0
2 2012 12 0 0 0 1 0 1 0
2 2013 1  0 0 1 1 1 0 0
2 2013 3  1 1 0 0 0 0 0
3 2012 7  0 0 0 0 1 1 1
3 2012 8  0 0 0 0 0 0 0
3 2012 9  0 0 0 0 0 0 0
3 2012 10 0 0 1 0 0 0 0
4 2012 11 0 1 0 0 0 1 0
4 2012 12 0 0 0 0 0 0 0
4 2013 1  1 0 0 0 0 1 0
4 2013 4  0 0 1 1 0 0 0
;

proc sort data=have;
	by id year month;
run;

data want;
	set have;
	by id;

	array vars{*} var1-var7;

	monthsbetween = intck('month', lag(mdy(month, 1, year)), mdy(month, 1, year));
	if first.id then do;
		monthsbetween=.;
		outflag=0;
	end;

	if sum(of vars[*])>1 then twoormore=1;
	else twoormore=0;

	if monthsbetween=1 and twoormore=1 and lag(twoormore)=1 then outflag=1;

	retain outflag;
	keep id;
	if last.id and outflag=1;
run;
Occasional Contributor
Posts: 16

Re: Identifymultiple variables having same value for consecutive months

[ Edited ]

 

That is perfect. I really appreciate your help.

What if I want to include the vars which contributed for the output dataset.

PROC Star
Posts: 1,190

Re: Identifymultiple variables having same value for consecutive months

Posted in reply to eshty_tes

Then drop the

 

keep id;

statement.

 

 

Anytime, glad to help Smiley Happy

Occasional Contributor
Posts: 16

Re: Identifymultiple variables having same value for consecutive months

Thanks draycut,

Now, I   want to keep only those id's having a flag=1 for two or more consecutive months for the same variables for the first as well as the second month. You see some id have a flag=1 for different variables in the first  and second month.

Here,

if sum(of vars[*])>1 then twoormore=1;
else twoormore=0; this counts the sum of any flag=1 month by month, but i want the sum to be of the same variables in both the first and second month.
When selecting those having  twoormore=1 for two or more consecutive months, i want only the same variables in the first month to contribute to the sum not a different variable in the second month.

Based on the above data, for instance id 1 have flag=1 for var 6 and 7  at month 7 and  8.... this perfectly applies to my case.

if you see id 2, there is a flag=1 for var 4 and 6 at month 12 2012, but  in the next month, i.e. month 1 of 2013, the value for var 6 is 0 not 1 that leads id 2 to be excluded in the output.

The output desired in the @draycut dataset have is only id '1'

 

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 172 views
  • 0 likes
  • 2 in conversation