> I am trying to select the observations in a data set
> that meet the following criteria:
> Variables: SA, AD, RS, SP (there are other variables
> but these are what is needed
> to do the selection).
> When SA(i) equals SA(j) and AD(i) is not equal
> to AD(j)
Not sure what this means ... normally, you don't select an observation based upon the value of the same variable in another observation. Do you mean select observation (i) when the value of SA equals the value of SA in some other observation (j), along with AD conditions?
If you really want to select an observation based upon a general condition where SA equals the value of SA in some other observation (plus AD conditions), where the other observation could be anywhere in the data set, I think you'd need PROC IML.
Am I understanding you properly?
Message was edited by: Paige
What I mean is that if two observations have the same SA value, then select the one with the higher AD value. If it turns out that their AD values are also the same, then select the one with the higher RS value.
Any observation whose SP value is equal to Yd or Tl or Ng should not be selected.
All observations which have unique SA values will be selected.
(I used i and j to identify any two given observations. I hope the situation is clearer now).
I still don't think I know enough to actually write the program to do this. Are you comparing all observations to each other? Or selected pairs (triples/quads/etc.) only? If so, how are you selecting the pairs?
But as a guess, does this meet your needs? Sort by SA AD RS. Now things are ordered such that choosing the one you want should be easy.
Given the OP's input, what you need to look at are the FIRST. and LAST. variables with BY-group processing, and possibly the LAG function or a RETAIN statement to carry the prior observation's value forward for comparison with the current one. Break the challenge into smaller components, deal with each comparison task individually, validate your own code (using the available documentation and forum input), and build on the process with each additional comparison requirement.
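As a rough sketch of how the sort-then-select idea and the FIRST./LAST. variables fit together, something like the following might work. The data set names HAVE, SORTED and WANT, the flag IS_UNIQUE, and the few sample rows are made up for illustration. It also assumes that observations with an excluded SP value are dropped before duplicates are compared, which the thread does not spell out.

```sas
/* Illustrative sample rows; HAVE, SORTED and WANT are hypothetical names. */
data have;
    input SA AD RS SP $;
    datalines;
15 20 19 Bt
18 21 22 Ab
15 19 17 Bt
18 21 23 Ef
10 12 15 Tl
16 18 16 Ab
;
run;

/* Assumption: excluded SP values are removed before duplicates are compared. */
proc sort data=have out=sorted;
    where SP not in ('Yd', 'Tl', 'Ng');
    by SA AD RS;
run;

data want;
    set sorted;
    by SA;
    /* FIRST.SA and LAST.SA are both 1 when an SA value occurs only once. */
    is_unique = (first.SA and last.SA);
    /* After the ascending sort, the last row of each SA group has the
       highest AD and, among equal AD values, the highest RS. */
    if last.SA;
run;
```

If two observations tie on SA, AD and RS, this keeps whichever one the sort happens to place last, so a further tie-breaking variable would have to be added to the BY statement of PROC SORT if that matters.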
Sorry Paige, if I was not clear enough. Here we go:
In the table below, Observation 1 and Observation 5 have the same SA value, but the AD value of Observation 1 is greater than that of Observation 5, so in reading the data, Observation 1 will be selected but Observation 5 will be dropped.
Again Observation 2 and Observation 13 have the same SA values, and incidentally their AD values are also the same, so here we resort to their RS values (which are not the same). The RS value of Observation 13 is higher, so it will be selected and Observation 2 dropped.
Observations with unique SA values will be selected.
Observations 4, 7, 8 and 12 will not be selected because they have the SP values Tl, Ng, Yd and Tl respectively.
Observation  SA  AD  RS  SP
1            15  20  19  Bt
2            18  21  22  Ab
3            16  18  16  Ab
4            10  12  15  Tl
5            15  19  17  Bt
6            20  22  24  Ef
7            13  16  17  Ng
8            26  27  29  Yd
9            22  23  26  Ef
10           24  26  28  Dd
11           23  25  27  Bt
12           27  29  30  Tl
13           18  21  23  Ef
14           21  22  25  Ab
15           12  14  16  Dd
Paige is right. And if, for some reason you didn't specify, the original observation order needs to be preserved, then add some kind of observation number up front, proceed as Paige suggested, and finally sort the remaining observations back into that order.
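Putting the suggestions together for the table posted earlier in the thread, a minimal sketch could look like this. The names HAVE, SORTED, WANT and ORIG_ORDER are hypothetical, and the sketch again assumes that the SP exclusion is applied before the duplicate check (in the posted data the excluded rows all have unique SA values, so the order of the two steps makes no difference there).

```sas
/* Read the posted table and record the original row order. */
data have;
    input Observation SA AD RS SP $;
    orig_order = _n_;   /* hypothetical helper variable for re-sorting later */
    datalines;
1 15 20 19 Bt
2 18 21 22 Ab
3 16 18 16 Ab
4 10 12 15 Tl
5 15 19 17 Bt
6 20 22 24 Ef
7 13 16 17 Ng
8 26 27 29 Yd
9 22 23 26 Ef
10 24 26 28 Dd
11 23 25 27 Bt
12 27 29 30 Tl
13 18 21 23 Ef
14 21 22 25 Ab
15 12 14 16 Dd
;
run;

/* Drop the excluded SP values and order each SA group by AD, then RS. */
proc sort data=have out=sorted;
    where SP not in ('Yd', 'Tl', 'Ng');
    by SA AD RS;
run;

/* Keep the last row per SA: highest AD, with ties broken by highest RS. */
data want;
    set sorted;
    by SA;
    if last.SA;
run;

/* Restore the original observation order. */
proc sort data=want;
    by orig_order;
run;

proc print data=want noobs;
    var Observation SA AD RS SP;
run;
```

With the posted data this should keep Observations 1, 3, 6, 9, 10, 11, 13, 14 and 15, which matches the selection described in the example: 5 loses to 1 on AD, 2 loses to 13 on RS, and 4, 7, 8 and 12 are excluded by their SP values.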