BookmarkSubscribeRSS Feed
EMKBB
Calcite | Level 5
Hello,

I am new to SAS, and I am looking for material that will enable me to code and do the following:

I am trying to select the observations in a data set that meet the following criteria:

Variables: SA, AD, RS, SP (there are other variables but these are what is needed
to do the selection).


When SA(i) equals SA(j) and AD(i) is not equal to AD(j)
Select Observation i if AD(i) is greater than AD(j)
Select Observation j if AD(i) is less than AD(j),

When SA(i) equals SA(j) and AD(i) equals AD(j)
Select Observation i if RS(i) is greater than RS(j)
Select Observation j if RS(i) is less than RS(j),

Ignore Observation i if SP(i) equal to Yd or Tl or Ng.


Thank you.
13 REPLIES 13
Paige
Quartz | Level 8
> I am trying to select the observations in a data set
> that meet the following criteria:
>
> Variables: SA, AD, RS, SP (there are other variables
> but these are what is needed
> to do the selection).
>
>
> When SA(i) equals SA(j) and AD(i) is not equal
> to AD(j)

Not sure what this means ... normally, you don't select an observation based upon the value of the same variable in another observation. Do you mean select observation (i) when the value of SA equals the value of SA in some other observation (j), along with AD conditions?

If you really want to select an observation based upon a general condition where SA equals the value of SA in some other observation (plus AD conditions), where the other observation could be anywhere in the data set, I think you'd need PROC IML.

Am I understanding you properly? Message was edited by: Paige
EMKBB
Calcite | Level 5
Thanks Paige,

What I mean is that if two observations have the same SA value, then select the one with the higher AD value. If it turns out that their AD values are also the same, then select the one with the higher RS value.

Any observation whose SP value is equal to Yd or Tl or Ng should not be selected.

All observations which have unique SA values will be selected.

(I used i and j to identify any two given observations. I hope the situation is clearer now).

EMKBB
Paige
Quartz | Level 8
I still don't think I know enough to actually a the program to do this. Are you comparing all observations to each other? Or selected pairs (triples/quads/etc.) only? If so, how are you selecting the pairs?

But as a guess, does this meet your needs? Sort by SA AD RS. Now things are ordered such that choosing the one you want should be easy.
EMKBB
Calcite | Level 5
Paige,

Not quite, you compare the observations with each other and select the ones that meet the above criteria. Those with unique SA values will definitely be selected.
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
Given the OP input, what needs to be looked-at are FIRST. and LAST. with "BY GROUP PROCESSING", and possibly using the LAG function, or using a RETAIN statement to capture/retain the prior observation variable/value for comparison to the current variable/value. Take the challenge in smaller components, dealing with each comparison task individually, validate your own developed code (based on available DOC and forum input), and build on the process with each additional comparison rqmt.

Scott Barry
SBBWorks, Inc.
Paige
Quartz | Level 8
> Not quite, you compare the observations with each
> other and select the ones that meet the above
> criteria. Those with unique SA values will definitely
> be selected.

So please give an example, or explain more thoroughly, exactly what you are looking for. I specifically asked (and you didn't reply):

Are you comparing all observations to each other? Or selected pairs (triples/quads/etc.) only? If so, how are you selecting the pairs?
EMKBB
Calcite | Level 5
Thanks sbb, I will look it up.
Patrick
Opal | Level 21
If you want to use SQL the look up "group by" with a "having" clause
EMKBB
Calcite | Level 5
Thanks Patrick.
EMKBB
Calcite | Level 5
Sorry Paige, if I was not explanatory enough, here we go:

In the table below, Observation 1 and Observation 5 have the same SA value, but the DA value of Observation 1 is greater than that of Observation 5 so in reading the data, Observation 1 will be selected but Observation 5 will be dropped.

Again Observation 2 and Observation 13 have the same SA values, and incidentally their AD values are also the same, so here we resort to their RS values (which are not the same). The RS value of Observation 13 is higher, so it will be selected and Observation 2 dropped.

Observations with unique SA values will be selected.

Observations 4, 7, 8, 12 will not be selected because they have the values Tl, Ng, Yd and Tl respectively.

Observation SA AD RS SP
1 15 20 19 Bt
2 18 21 22 Ab
3 16 18 16 Ab
4 10 12 15 Tl
5 15 19 17 Bt
6 20 22 24 Ef
7 13 16 17 Ng
8 26 27 29 Yd
9 22 23 26 Ef
10 24 26 28 Dd
11 23 25 27 Bt
12 27 29 30 Tl
13 18 21 23 Ef
14 21 22 25 Ab
15 12 14 16 Dd
Paige
Quartz | Level 8
Okay, I understand now. What is it about my PROC SORT suggestion that doesn't work? Seems to me it puts your data in the proper order and makes it easy for you to select the ones you want.
Robert_Bardos
Fluorite | Level 6
Paige is right. And if - for some reason you didn't specify - the original observation order needs to be preserved then add some kind of observation number up front, then proceed as Paige suggested and finally sort the remaining observations back.
Ksharp
Super User
I do it roughly, And i think there is some way can optimize it.
But I have to leave for the sake of working.


Yes.Paige has the right idea here.
try it.'nodupkey' can keep the first duplicate observation.

[pre]
data temp(where=( sp not in('Yd' 'TI' 'Ng')));
input Obs SA AD RS SP $;
datalines;
1 15 20 19 Bt
2 18 21 22 Ab
3 16 18 16 Ab
4 10 12 15 Tl
5 15 19 17 Bt
6 20 22 24 Ef
7 13 16 17 Ng
8 26 27 29 Yd
9 22 23 26 Ef
10 24 26 28 Dd
11 23 25 27 Bt
12 27 29 30 Tl
13 18 21 23 Ef
14 21 22 25 Ab
15 12 14 16 Dd
;
proc sort data=temp ;
by sa descending ad descending rs;
run;
proc sort data=temp nodupkey;
by sa;
run;
proc print noobs;
run;
[/pre]



Ksharp Message was edited by: Ksharp

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 3305 views
  • 0 likes
  • 6 in conversation