Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- Programming
- /
- Re: How to get the observations for subsequent values based on a condi...

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

🔒 This topic is **solved** and **locked**.
Need further help from the community? Please
sign in and ask a **new** question.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 11-08-2019 09:28 AM
(522 views)

Can anyone help how to get the observations based on the below condition:

If 3 consecutive "N" is found in column C for a same value in A column

Eg: A1B4 A1B5 and A1B6 has 3 consecutive N's and I need all three in the output

```
A B C
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 1 Y
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
2 8 Y
```

Expected output:

```
A B C
1 4 N
1 5 N
1 6 N
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
```

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
Data have;
input A B C $;
cards;
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 1 Y
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
2 8 Y
;
run;
data temp;
set have;
by a c notsorted;
group+first.c;
run;
proc sql;
create table want as
select * from temp group by group having count(*)>2 and sum(c='Y')=0;
quit;
```

5 REPLIES 5

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

One approach:

data want;

k=0;

do until (last.c) ;

set have;

by a c notsorted;

k + 1;

end;

do until (last.c) ;

set have;

by a c notsorted;

if k >= 3 and c='N' then output;

end;

drop k;

run;

data want;

k=0;

do until (last.c) ;

set have;

by a c notsorted;

k + 1;

end;

do until (last.c) ;

set have;

by a c notsorted;

if k >= 3 and c='N' then output;

end;

drop k;

run;

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Hi @jksthomas

```
data have;
input A B C $1.;
cards;
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 1 Y
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
2 8 Y
;
data want;
if _n_=1 then do;
dcl hash H ();
h.definekey ("_n_");
h.definedata ("_s",'_e');
h.definedone ();
dcl hiter hi('h');
end;
do _n_=1 by 1 until(last.a);
set have;
by a;
if c='N' then do;
_count+1;
if _count=1 then _s=B;
_e=b;
end;
else do;
if _count>=3 then do;
h.add();
end;
_count=0;
end;
end;
c='N';
do while(hi.next()=0);
do b=_s to _e;
output;
end;
end;
h.clear();
drop _:;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
Data have;
input A B C $;
cards;
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 1 Y
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
2 8 Y
;
run;
data want;
cnt=0;
do until(last.c);
set have;
by c notsorted;
cnt+1;
end;
do until(last.c);
set have;
by c notsorted;
if cnt >=3 then output;
end;
drop cnt;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
Data have;
input A B C $;
cards;
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 1 Y
2 2 N
2 3 N
2 4 N
2 5 N
2 6 N
2 7 N
2 8 Y
;
run;
data temp;
set have;
by a c notsorted;
group+first.c;
run;
proc sql;
create table want as
select * from temp group by group having count(*)>2 and sum(c='Y')=0;
quit;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

You've got plenty of good offers, of which I'd prefer those based on the double DoW-loop and grouping by A C.

Another approach can be:

While passing through the file, memorize the observation number endpoints of each consecutive "N" sequence

At the break of the sequence (i.e. when C="N" or LAST.A), if the distance between the endpoints GE 2, use them to output the needed records.

Note that below, I've included an extra BY-group (A=2) to test the condition when a qualifying "N" sequence ends at the end of a BY A group.

```
data have ;
input (A B C) (:$1.) ;
cards ;
1 1 N
1 2 N
1 3 Y
1 4 N
1 5 N
1 6 N
1 7 Y
1 8 Y
1 9 N
2 0 N
2 1 N
2 2 Y
2 3 N
2 4 N
2 5 N
2 6 Y
2 7 N
2 8 N
2 9 N
3 1 Y
3 2 N
3 3 N
3 4 N
3 5 N
3 6 N
3 7 N
3 8 Y
3 8 Y
;
run ;
data want ;
set have ;
by a ;
if c eq "N" and not _iorc_ then _iorc_ = _n_ ;
if (c ne "N" or last.a) and _iorc_ ;
if _n_ - not last.a - _iorc_ => 2 then do p = _iorc_ to _n_ - not last.a ;
set have point = p ;
output ;
end ;
_iorc_ = 0 ;
run ;
```

_IORC_ (auto-retained, auto-dropped, and set to 0 at compile) tracks the record numbers where the "N" sequences begin; _N_ tracks the record numbers where they terminate. This approach can be advantageous performance-wise if the number and size of the qualifying "N" sequences are small compared to the size of the file, as it thus avoids a full second read.

Kind regards

Paul D.

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. **Registration is now open through August 30th**. Visit the SAS Hackathon homepage.

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.