## Keep only first N observations for certain by variables

Solved
Frequent Contributor
Posts: 101


Hello All,

I have a data set with many variables and many observations, and I want to know how to keep only the first n observations within each group defined by whichever BY variables I choose.

If I sort the data by (name) then I want the first n observations by name, but if I sort the data by (name date) then I want the first n observations for each name on a given date.  I am hoping that this makes sense.

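| Obs | name | date |
|-----|------|------|
| 1   | a    | x    |
| 2   | a    | x    |
| 3   | a    | x    |
| 4   | a    | x    |
| 5   | a    | y    |
| 6   | a    | y    |
| 7   | b    | x    |
| 8   | b    | x    |
| 9   | b    | y    |
| 10  | b    | y    |
| 11  | b    | y    |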

So if I chose my by variable to be just (name) and my n to be 2, then I would retain observations 1, 2, 7 and 8.  But if I chose the by variables to be (name, date) and my n to be two, then I would retain observations 1, 2, 5, 6, 7, 8, 9 and 10.

Any help is greatly appreciated!!!

Thanks,

John


All Replies
Super User
Posts: 5,878

## Re: Keep only first N observations for certain by variables

Just use a DATA step with a BY statement, and restart a counter each time you encounter a new BY group.

Pair this with a conditional, explicit OUTPUT statement.

You could probably quite easily embed this in a macro (if you, or someone you can get help from, knows macro programming) so that it is easy to change the rules for each run.
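
The approach described above can be sketched roughly as follows. This is only an illustration, not the accepted solution: the data set names `have`, `sorted`, and `want` and the counter `count` are hypothetical, and the sample rows are the ones from the question.

```sas
/* Hypothetical sample data, copied from the question */
data have;
  input name $ date $;
  datalines;
a x
a x
a x
a x
a y
a y
b x
b x
b y
b y
b y
;
run;

%let n = 2;   /* number of observations to keep per BY group */

proc sort data=have out=sorted;
  by name date;
run;

data want;
  set sorted;
  by name date;
  /* FIRST.DATE is 1 on the first row of each name/date combination */
  if first.date then count = 0;
  count + 1;                       /* sum statement, so COUNT is retained */
  if count <= &n then output;      /* explicit, conditional output */
  drop count;
run;
```

With `by name date` this should keep observations 1, 2, 5, 6, 7, 8, 9 and 10 from the question. To keep the first n rows per name only, change both BY statements to `by name;` and test `first.name` instead: the counter has to be reset on the FIRST. flag of the last variable in the BY statement.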

Data never sleeps
Frequent Contributor
Posts: 101

## Re: Keep only first N observations for certain by variables

Hey,

How would I get the counter to restart at the beginning of a by variable?

This is what I originally tried to do but couldn't figure it out.

Thanks!

Solution
‎12-17-2014 03:29 PM
Super User
Posts: 23,724

## Re: Keep only first N observations for certain by variables

Frequent Contributor
Posts: 101

Thank you!!!

Posts: 3,852

## Re: Keep only first N observations for certain by variables

This is somewhat generic and uses an array of FIRST. variables. Poor choice of example data, but you get the idea.

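```
/* The 'first.'n name literal below requires VALIDVARNAME=ANY */
options validvarname=any;

%let obs=2;
%let data=class;
%let by=sex age;

proc sort data=sashelp.&data out=&data;
  by &by;
run;
proc print;
run;

data keepobs;
  set &data;
  by &by;
  /* Array of every automatic variable whose name starts with "first." */
  array _by[*] 'first.'n:;
  /* The last element is the FIRST. flag of the last BY variable */
  if _by[dim(_by)] then c = 0;
  c + 1;
  if c le &obs then output;
run;
proc print;
run;
```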

Posts: 3,167

## Re: Keep only first N observations for certain by variables

Nice! I didn't know you could refer to those automatic variables using name literals, and this is the first time I have seen them used as array elements. Happy to learn! Thanks for sharing, John.

Haikuo

Posts: 3,852

## Re: Keep only first N observations for certain by variables

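You have to use VALIDVARNAME=ANY to refer to them as in the example. If not, you can use FIRST: instead, but that is not so safe, because the prefix list also matches any ordinary variable whose name begins with FIRST. Sometimes we forget that arrays are just variable lists, and that FIRST.this or FIRST.that are just variables.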

It is not my original idea.  I have no original ideas.

Super User
Posts: 23,724

## Re: Keep only first N observations for certain by variables

But do you ever forget anything?

Super User
Posts: 10,778

## Re: Keep only first N observations for certain by variables

```
%let obs=2;
%let data=class;
%let by=sex age;

proc sort data=sashelp.&data out=&data;
by &by;
run;
proc print;
run;
data keepobs;
set &data;
by &by;
if first.%scan(&by,-1) then c = 0;
c + 1;
if c le &obs then output;
run;
proc print;
run;

```
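
Following the earlier suggestion in this thread to embed the logic in a macro, the same `%scan(&by,-1)` idea could be wrapped up roughly like this. It is a sketch only: the macro name `keep_first_n`, its parameters, and the counter `_c` are hypothetical, and it assumes the BY list contains plain variable names with the innermost variable last and no trailing options such as NOTSORTED.

```sas
/* Hypothetical wrapper macro; names and parameters are illustrative only */
%macro keep_first_n(data=, out=, by=, n=);

  proc sort data=&data out=&out;
    by &by;
  run;

  data &out;
    set &out;
    by &by;
    /* %scan(&by, -1) resolves to the last word of the BY list,
       so this tests the FIRST. flag of the innermost BY variable */
    if first.%scan(&by, -1) then _c = 0;
    _c + 1;                        /* sum statement, so _C is retained */
    if _c le &n then output;
    drop _c;
  run;

%mend keep_first_n;

/* Example call: first 2 rows for each sex/age combination */
%keep_first_n(data=sashelp.class, out=keepobs, by=sex age, n=2)
```

Changing the grouping then only means changing the `by=` argument, for example `by=sex` to keep the first two rows per sex.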

Xia Keshan

Posts: 3,852

## Re: Keep only first N observations for certain by variables

What's the fun in that?

Super User
Posts: 10,778