Number of observations per ID and indexing.

Occasional Contributor
Posts: 8

[ Edited ]

Hello all.  I have data (dairy cattle) that look as follows:

 

animal    days_in_milk   milk_yield   butterfat_%   protein_%
1234567   35             30.8         3.51          3.11
1234567   65             39.2         3.32          3.09
1234567   95             38.5         3.21          3.02
1234568   15             32.7         3.15          3.06
1234568   45             36.7         3.13          3.06
1234568   75             34.4         3.20          3.01
1234568   106            28.3         3.07          2.56
1234569   6              30.2         3.05          3.10
1234569   40             41.2         3.08          3.12
1234569   70             37.5         3.51          2.98
1234569   99             32.4         3.01          2.99
1234569   131            26.3         3.21          2.98

 

Each cow (identified by the animal ID) has more than one observation; take milk yield as an example.  The variable days_in_milk gives how many days she had been in milk when each observation was recorded.  Cows can have 3, 4, 5 or more records.

 

I have two questions:

 

How do I index each cow's observations from 1 to n in Enterprise Guide?

 

How do I create a column that shows the number of observations (N_obs) for each animal, as shown below:

 

animal    days_in_milk   milk_yield   butterfat_%   protein_%   Index   N_obs
1234567   35             30.8         3.51          3.11        1       3
1234567   65             39.2         3.32          3.09        2       3
1234567   95             38.5         3.21          3.02        3       3
1234568   15             32.7         3.15          3.06        1       4
1234568   45             36.7         3.13          3.06        2       4
1234568   75             34.4         3.20          3.01        3       4
1234568   106            28.3         3.07          2.56        4       4
1234569   6              30.2         3.05          3.10        1       5
1234569   40             41.2         3.08          3.12        2       5
1234569   70             37.5         3.51          2.98        3       5
1234569   99             32.4         3.01          2.99        4       5
1234569   131            26.3         3.21          2.98        5       5

 

Please help. :-)


Accepted Solutions
Solution
‎04-20-2017 01:18 PM
Respected Advisor
Posts: 4,998

Re: Number of observations per ID and indexing.

[ Edited ]

Assuming your data set is already in sorted order:

 

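data want;
   /* First pass over one animal's rows: count them */
   N_obs=0;
   do until (last.animal);
      set have;
      by animal;
      n_obs + 1;
   end;
   /* Second pass over the same rows: number them and write them out */
   index=0;
   do until (last.animal);
      set have;
      by animal;
      index + 1;
      output;
   end;
run;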

 

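Regarding your added comment about discarding animals with too few records, the OUTPUT statement is flexible.  It could, for example, be changed to the following, which keeps only animals with more than 8 records: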

 

if n_obs > 8 then output;



All Replies
Esteemed Advisor
Posts: 7,249

Re: Number of observations per ID and indexing.

Let me post it again: Post test data in the form of a datastep!!!
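
For reference, the sample rows from the question could be supplied as a data step roughly like this. It is only a sketch: the data set name have matches the replies in this thread, but butterfat_pct and protein_pct are assumed names, chosen because % is not allowed in a standard SAS variable name.

```sas
/* Sample data from the question, typed in as list input. */
/* butterfat_pct and protein_pct are assumed variable names. */
data have;
  input animal days_in_milk milk_yield butterfat_pct protein_pct;
  datalines;
1234567 35 30.8 3.51 3.11
1234567 65 39.2 3.32 3.09
1234567 95 38.5 3.21 3.02
1234568 15 32.7 3.15 3.06
1234568 45 36.7 3.13 3.06
1234568 75 34.4 3.20 3.01
1234568 106 28.3 3.07 2.56
1234569 6 30.2 3.05 3.10
1234569 40 41.2 3.08 3.12
1234569 70 37.5 3.51 2.98
1234569 99 32.4 3.01 2.99
1234569 131 26.3 3.21 2.98
;
run;
```

The rows are already in animal order, which is what the BY-group code in the replies relies on.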

 

At a guess:

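data want;
  set have;
  by animal;
  /* The sum statement (index + 1) retains index from row to row; */
  /* a plain index=index+1 would reset to missing on every row.   */
  if first.animal then index=1;
  else index + 1;
run;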

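Then, if you want the number of observations per animal, there are various ways to get it. One is to sort so that the highest index comes first within each animal and carry that value down:

proc sort data=want;
  by animal descending index;
run;
data want;
  set want;
  by animal descending index;
  retain n_obs;
  /* The first row per animal now holds the highest index, i.e. the record count */
  if first.animal then n_obs=index;
run;
/* Put the rows back in their original order */
proc sort data=want;
  by animal index;
run;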

You could also do it in SQL, you could use a hash object, etc. Loads of ways.  What is your actual question?
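
As a rough illustration of the SQL route mentioned here, the per-animal count could be computed in a subquery and joined back onto the detail rows. This is a sketch under assumptions, not the poster's code: want_sql is a made-up output name, and it covers only N_obs, since a running index within each animal is more naturally done in a data step.

```sas
proc sql;
  /* Count the rows per animal in a subquery, then attach that count to every row */
  create table want_sql as
  select a.*, b.n_obs
  from have as a
       inner join
       (select animal, count(*) as n_obs
        from have
        group by animal) as b
       on a.animal = b.animal
  order by a.animal, a.days_in_milk;
quit;
```

Adding a condition on b.n_obs in a WHERE clause would restrict the output to animals with enough records.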

Occasional Contributor
Posts: 8

Re: Number of observations per ID and indexing.

Thank you very much.  I will try that. 

 

What I actually want to achieve is to discard animals that do not have a minimum number of records, for instance 8 records.

 

Thank you very much in advance.  Very much appreciated! :-)

Occasional Contributor
Posts: 8

Re: Number of observations per ID and indexing.

Thank you very much! This worked great! I really appreciate your input!
☑ This topic is SOLVED.
