BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Abimal_Zippi
Fluorite | Level 6

Hi Team,

I want to identify id's having a flag=1 for two or more of the following variables in two consecutive months. The thing here is there should at least be two or more variables with flag=1, in any combination (could be var 1 and 2 or var 1 and 6 or var 1, 5, 7...), and should be present in two consecutive months. Say in the following dataset id 1 have flag=1 for var5, var6 and var7 in the 7th month, and var6 and var7 in 8th month ..hence id 1 should be present in the output.  Just one episode of such scenario is enough to be in the output dataset, disregard of the value in the remaining months.

dataset;

id   year month    var1 var2 var3 var 4 var5 var6 var7  var..........multiple

1      2012   7        0      0      0      0     1         1    1 

1      2012   8       0      0       0     0      0        1    1  

1      2012   9       1     0        0     0     0       0      0  

1      2012   10     0     0        1     0    0       0       0  

1      2012     11   0     1        0     1    0       1       0  

1      2012     12   0     0        0     0    0      0       1 

1      2013      1    0     0        1     1    1      0       0

2      2012    7        0     0       0     0    0      0       0

2      2012    8       0      0        0    0     0     0       0

2      2012    9       0     0        0     0     0     0      0

2      2012   10     0     0        0    0    0       0       0

2      2012     11   0     0        0    0    0       0       0

2      2012     12   0     0        0     1    0      1      0

2     2013      1    0     0        1      1    1      0       0

2      2013     3    1     1       0       0    0     0      0

3      2012    7        0     0       0     0    1      1       1

3      2012    8       0      0        1    1     0     0       0

3      2012    9       0     0        0     0     0     0      0

3      2012   10     0     0        1    0    0       0       0

4      2012     11   0     1        0    0    0       1       0

4      2012     12   0     0        0     0    0      0       0

4      2013      1    1    0        0      0    0     1       0

4    2013        4    0    0         1      1    0     0      0

 

Output;

id

1

2

3

1 ACCEPTED SOLUTION

Accepted Solutions
PeterClemmensen
Tourmaline | Level 20

ok. Makes sense then.

 

Do something like this

 

data have;
input id$ year month var1-var7;
datalines;
1 2012 7  0 0 0 0 1 1 1
1 2012 8  0 0 0 0 0 1 1
1 2012 9  1 0 0 0 0 0 0
1 2012 10 0 0 1 0 0 0 0
1 2012 11 0 1 0 1 0 1 0
1 2012 12 0 0 0 0 0 0 1
1 2013 1  0 0 1 1 1 0 0
2 2012 7  0 0 0 0 0 0 0
2 2012 8  0 0 0 0 0 0 0
2 2012 9  0 0 0 0 0 0 0
2 2012 10 0 0 0 0 0 0 0
2 2012 11 0 0 0 0 0 0 0
2 2012 12 0 0 0 1 0 1 0
2 2013 1  0 0 1 1 1 0 0
2 2013 3  1 1 0 0 0 0 0
3 2012 7  0 0 0 0 1 1 1
3 2012 8  0 0 0 0 0 0 0
3 2012 9  0 0 0 0 0 0 0
3 2012 10 0 0 1 0 0 0 0
4 2012 11 0 1 0 0 0 1 0
4 2012 12 0 0 0 0 0 0 0
4 2013 1  1 0 0 0 0 1 0
4 2013 4  0 0 1 1 0 0 0
;

proc sort data=have;
	by id year month;
run;

data want;
	set have;
	by id;

	array vars{*} var1-var7;

	monthsbetween = intck('month', lag(mdy(month, 1, year)), mdy(month, 1, year));
	if first.id then do;
		monthsbetween=.;
		outflag=0;
	end;

	if sum(of vars[*])>1 then twoormore=1;
	else twoormore=0;

	if monthsbetween=1 and twoormore=1 and lag(twoormore)=1 then outflag=1;

	retain outflag;
	keep id;
	if last.id and outflag=1;
run;

View solution in original post

6 REPLIES 6
PeterClemmensen
Tourmaline | Level 20

I don't see how id=3 meets these criteria from your sample data? it has flag=1 for more than one variable in year 2012 month 7 only?

Abimal_Zippi
Fluorite | Level 6

Thanks,

That is true, I missed the flag, will correct it now

PeterClemmensen
Tourmaline | Level 20

ok. Makes sense then.

 

Do something like this

 

data have;
input id$ year month var1-var7;
datalines;
1 2012 7  0 0 0 0 1 1 1
1 2012 8  0 0 0 0 0 1 1
1 2012 9  1 0 0 0 0 0 0
1 2012 10 0 0 1 0 0 0 0
1 2012 11 0 1 0 1 0 1 0
1 2012 12 0 0 0 0 0 0 1
1 2013 1  0 0 1 1 1 0 0
2 2012 7  0 0 0 0 0 0 0
2 2012 8  0 0 0 0 0 0 0
2 2012 9  0 0 0 0 0 0 0
2 2012 10 0 0 0 0 0 0 0
2 2012 11 0 0 0 0 0 0 0
2 2012 12 0 0 0 1 0 1 0
2 2013 1  0 0 1 1 1 0 0
2 2013 3  1 1 0 0 0 0 0
3 2012 7  0 0 0 0 1 1 1
3 2012 8  0 0 0 0 0 0 0
3 2012 9  0 0 0 0 0 0 0
3 2012 10 0 0 1 0 0 0 0
4 2012 11 0 1 0 0 0 1 0
4 2012 12 0 0 0 0 0 0 0
4 2013 1  1 0 0 0 0 1 0
4 2013 4  0 0 1 1 0 0 0
;

proc sort data=have;
	by id year month;
run;

data want;
	set have;
	by id;

	array vars{*} var1-var7;

	monthsbetween = intck('month', lag(mdy(month, 1, year)), mdy(month, 1, year));
	if first.id then do;
		monthsbetween=.;
		outflag=0;
	end;

	if sum(of vars[*])>1 then twoormore=1;
	else twoormore=0;

	if monthsbetween=1 and twoormore=1 and lag(twoormore)=1 then outflag=1;

	retain outflag;
	keep id;
	if last.id and outflag=1;
run;
Abimal_Zippi
Fluorite | Level 6

 

That is perfect. I really appreciate your help.

What if I want to include the vars which contributed for the output dataset.

PeterClemmensen
Tourmaline | Level 20

Then drop the

 

keep id;

statement.

 

 

Anytime, glad to help 🙂

Abimal_Zippi
Fluorite | Level 6

Thanks draycut,

Now, I   want to keep only those id's having a flag=1 for two or more consecutive months for the same variables for the first as well as the second month. You see some id have a flag=1 for different variables in the first  and second month.

Here,

if sum(of vars[*])>1 then twoormore=1;
else twoormore=0; this counts the sum of any flag=1 month by month, but i want the sum to be of the same variables in both the first and second month.
When selecting those having  twoormore=1 for two or more consecutive months, i want only the same variables in the first month to contribute to the sum not a different variable in the second month.

Based on the above data, for instance id 1 have flag=1 for var 6 and 7  at month 7 and  8.... this perfectly applies to my case.

if you see id 2, there is a flag=1 for var 4 and 6 at month 12 2012, but  in the next month, i.e. month 1 of 2013, the value for var 6 is 0 not 1 that leads id 2 to be excluded in the output.

The output desired in the @PeterClemmensen dataset have is only id '1'

 

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 820 views
  • 0 likes
  • 2 in conversation