## Number of observations per ID and indexing.


Hello all.  I have data (dairy cattle) that look as follows:

| animal | days_in_milk | milk_yield | butterfat_% | protein_% |
|---|---|---|---|---|
| 1234567 | 35 | 30.8 | 3.51 | 3.11 |
| 1234567 | 65 | 39.2 | 3.32 | 3.09 |
| 1234567 | 95 | 38.5 | 3.21 | 3.02 |
| 1234568 | 15 | 32.7 | 3.15 | 3.06 |
| 1234568 | 45 | 36.7 | 3.13 | 3.06 |
| 1234568 | 75 | 34.4 | 3.20 | 3.01 |
| 1234568 | 106 | 28.3 | 3.07 | 2.56 |
| 1234569 | 6 | 30.2 | 3.05 | 3.10 |
| 1234569 | 40 | 41.2 | 3.08 | 3.12 |
| 1234569 | 70 | 37.5 | 3.51 | 2.98 |
| 1234569 | 99 | 32.4 | 3.01 | 2.99 |
| 1234569 | 131 | 26.3 | 3.21 | 2.98 |

Each cow has more than one observation; take milk yield as an example.  Each observation is recorded at a given number of days in milk (variable days_in_milk).  Cows can have 3, 4, 5 or more records.

I have two questions:

How do I index the observations of each animal from 1 to n in Enterprise Guide?

How do I create a column that shows the number of observations (N_obs) for each animal, as shown below:

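| animal | days_in_milk | milk_yield | butterfat_% | protein_% | Index | N_obs |
|---|---|---|---|---|---|---|
| 1234567 | 35 | 30.8 | 3.51 | 3.11 | 1 | 3 |
| 1234567 | 65 | 39.2 | 3.32 | 3.09 | 2 | 3 |
| 1234567 | 95 | 38.5 | 3.21 | 3.02 | 3 | 3 |
| 1234568 | 15 | 32.7 | 3.15 | 3.06 | 1 | 4 |
| 1234568 | 45 | 36.7 | 3.13 | 3.06 | 2 | 4 |
| 1234568 | 75 | 34.4 | 3.20 | 3.01 | 3 | 4 |
| 1234568 | 106 | 28.3 | 3.07 | 2.56 | 4 | 4 |
| 1234569 | 6 | 30.2 | 3.05 | 3.10 | 1 | 5 |
| 1234569 | 40 | 41.2 | 3.08 | 3.12 | 2 | 5 |
| 1234569 | 70 | 37.5 | 3.51 | 2.98 | 3 | 5 |
| 1234569 | 99 | 32.4 | 3.01 | 2.99 | 4 | 5 |
| 1234569 | 131 | 26.3 | 3.21 | 2.98 | 5 | 5 |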

‎04-20-2017 01:18 PM
## Re: Number of observations per ID and indexing.


Let me post it again: Post test data in the form of a datastep!!!
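For reference, the rows from the question could be written as a DATA step roughly like this. It is only a sketch: the dataset name `have` is chosen to match the code in this thread, and `butterfat_pct` and `protein_pct` are assumed names, because `%` is not allowed in a standard SAS variable name.

```sas
/* Sample data from the question, entered as list input.        */
/* butterfat_pct and protein_pct are assumed variable names.    */
data have;
  input animal days_in_milk milk_yield butterfat_pct protein_pct;
  datalines;
1234567 35 30.8 3.51 3.11
1234567 65 39.2 3.32 3.09
1234567 95 38.5 3.21 3.02
1234568 15 32.7 3.15 3.06
1234568 45 36.7 3.13 3.06
1234568 75 34.4 3.20 3.01
1234568 106 28.3 3.07 2.56
1234569 6 30.2 3.05 3.10
1234569 40 41.2 3.08 3.12
1234569 70 37.5 3.51 2.98
1234569 99 32.4 3.01 2.99
1234569 131 26.3 3.21 2.98
;
run;
```

The rows are already ordered by animal and days_in_milk, which is what BY-group processing on `animal` expects.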

At a guess:

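```
data want;
  set have;
  by animal;                        /* have must be sorted by animal */
  if first.animal then index = 0;   /* restart the counter for each animal */
  index + 1;                        /* sum statement: index is retained across rows */
run;
```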

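Then, if you want the maximum index per animal (the number of observations), you have various methods of getting that, for example proc sort:

```
proc sort data=want;
  by animal descending index;          /* largest index first within each animal */
run;

data want;
  set want;
  by animal descending index;
  retain n_obs;
  if first.animal then n_obs = index;  /* first row now holds the record count */
run;

proc sort data=want;                   /* restore the original order */
  by animal index;
run;
```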

You could also do it in SQL, you could do a hash, etc. Loads of ways.  What is your actual question?
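As a rough illustration of the SQL route, the count per animal can be attached to every row by letting PROC SQL remerge a summary statistic with the detail rows. This is a sketch that assumes the input dataset is called `have`; the output name `want_sql` is made up for the example.

```sas
/* Sketch: add n_obs to every row with PROC SQL.                   */
/* Selecting detail columns alongside count(*) makes SAS remerge   */
/* the per-animal count onto each row (the log notes the remerge). */
proc sql;
  create table want_sql as
  select *, count(*) as n_obs
  from have
  group by animal
  order by animal, days_in_milk;
quit;
```

This gives N_obs only; the within-animal index is more naturally built in a DATA step with BY-group processing.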

## Re: Number of observations per ID and indexing.

Thank you very much.  I will try that.

What I actually want to achieve is to discard animals that have fewer than a minimum number of records, say for instance 8 records.

Thank you very much in advance.  Very much appreciated!

## Re: Number of observations per ID and indexing.

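    data want;
      /* First pass over the BY group: count the records for this animal */
      n_obs = 0;
      do until (last.animal);
        set have;
        by animal;
        n_obs + 1;
      end;
      /* Second pass over the same BY group: number the records and write them out */
      index = 0;
      do until (last.animal);
        set have;
        by animal;
        index + 1;
        output;
      end;
    run;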

Regarding your added comment above, the OUTPUT statement is flexible.  It could, for example, be changed to:

if n_obs > 8 then output;
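Put together, the filtering version of the step might look like the sketch below. The output name `want_filtered` is made up for the example, and the comparison is an assumption about the cut-off: `> 8` keeps only animals with 9 or more records, so use `>= 8` if animals with exactly 8 records should stay.

```sas
/* Sketch: keep only animals with enough records.                 */
/* want_filtered is a hypothetical output name; adjust the        */
/* comparison (> 8 vs >= 8) to the cut-off you actually want.     */
data want_filtered;
  n_obs = 0;
  do until (last.animal);      /* pass 1: count records per animal */
    set have;
    by animal;
    n_obs + 1;
  end;
  index = 0;
  do until (last.animal);      /* pass 2: number and filter the records */
    set have;
    by animal;
    index + 1;
    if n_obs > 8 then output;  /* animals with 8 or fewer records are dropped */
  end;
run;
```

Because n_obs is known before the second pass starts, every record of a discarded animal is skipped, not just the later ones.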

## Re: Number of observations per ID and indexing.

Thank you very much! This worked great! I really appreciate your input!
