DATA Step, Macro, Functions and more

flagging for max and min values within a variable

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

flagging for max and min values within a variable

data mx_mn_obs;
input x  flg;
cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;

q) the requirement is need to compare every 2 and 3 record with first record 
where flg = 0, if the value is max than every first record then flg should be 1
if it is min then flg = 0. could anyone help me

means output should be:
1 0
2 1
0 0
4 0
3 0
5 1
6 0
4 0
7 1

Accepted Solutions
Solution
‎06-29-2017 09:32 PM
Super User
Posts: 5,093

Re: flagging for max and min values within a variable

Here's an interpretation of what you are looking for:

 

data want;

do _n_=1 to 3;

   set have;

   if flg=0 then baseline_x = x;

   else flg = (x > baseline_x);

   output;

end;

drop baseline_x;

run;

 

Note that this is very much dependent on the data being in the right order.  Also note, it is possible that two observations within the same group of 3 could both end up with flg=1.  Or all 3 could end up with flg=0.  It all depends on the values for X:  Is X greater than or less than the X value on the first observation in the group.

View solution in original post


All Replies
Super User
Posts: 17,912

Re: flagging for max and min values within a variable

I suspect this may not scale well, depending on how complex your actual data is.

A DoW loop is probably more efficient, but I'm not very good at those. If efficiency becomes an issue, it's something you may want to look into. This is a more 'long' way to calculate the results.

 

1. Create groups to identify the groups you want to find the min/max for.

2. Find min/max for each group using proc means.

3. Merge data from #2 with #1

4. Check if data matches max and flag accordingly. 

 

Your logic appears to simplify to if x=Max(x) for a specified group the flag =1, 0 otherwise.

 

data mx_mn_obs;
	input x flg;
	cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;

data groups;
	set mx_mn_obs;
	retain group;

	if flg=0 then
		group+1;
run;

proc means data=groups noprint;
	;
	by group;
	var x;
	output out=max_min_group min=min_x max=max_x;
run;

data merged;
	merge groups max_min_group;
	by group;

	if x=max_x then
		flg=1;
	else
		flg=0;
	*keep x flg;
run;
Occasional Contributor
Posts: 12

Re: flagging for max and min values within a variable

Thanks Reeza for taking out your time to answer but it is not giving expected output... yes do loop along with lag functions actually works but am end up getting errors.

But this is the requirement consider any values for a variable need to compare  every 2nd and 3rd  with 1st record and 5th and 6th with 4th record ... like that it goes on ... if the value is max then flag it with 1 else flag it with 0.

 

10

5     0

15   1

25

20   0

30   1

Super User
Posts: 17,912

Re: flagging for max and min values within a variable


GeethaMN wrote:

Thanks Reeza for taking out your time to answer but it is not giving expected output... yes do loop along with lag functions actually works but am end up getting errors.

But this is the requirement consider any values for a variable need to compare  every 2nd and 3rd  with 1st record and 5th and 6th with 4th record ... like that it goes on ... if the value is max then flag it with 1 else flag it with 0.

 

10

5     0

15   1

25

20   0

30   1


Since SAS loops automatically in a data step there is no DO loop requirement for this data. 

I'm not sure what you mean about it errors out. As far as I can see it's a long method but still gives the correct results. I'm not sure what I'm supposed to take from the data above either.

Solution
‎06-29-2017 09:32 PM
Super User
Posts: 5,093

Re: flagging for max and min values within a variable

Here's an interpretation of what you are looking for:

 

data want;

do _n_=1 to 3;

   set have;

   if flg=0 then baseline_x = x;

   else flg = (x > baseline_x);

   output;

end;

drop baseline_x;

run;

 

Note that this is very much dependent on the data being in the right order.  Also note, it is possible that two observations within the same group of 3 could both end up with flg=1.  Or all 3 could end up with flg=0.  It all depends on the values for X:  Is X greater than or less than the X value on the first observation in the group.

Occasional Contributor
Posts: 12

Re: flagging for max and min values within a variable

Thanks Astounding.. could you please explain the logic at the step else flag = (x>base); without mentioning flag = 1 how it is giving values as 1 and 0.. so does (x>base) behaving like a Boolean operator? You solved it in very simple way.
Super User
Posts: 5,093

Re: flagging for max and min values within a variable

To answer your follow-up question:

 

flag = (x > base);

 

SAS evaluates x > base.  If true, it returns 1.  If false, it returns 0.  So FLAG receives a value of 1 or 0, depending on whether the condition being evaluated is true or false.

Super User
Posts: 9,687

Re: flagging for max and min values within a variable

data mx_mn_obs;
input x  flg;
cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;
run;
data temp;
 set mx_mn_obs;
 if not missing(flg) then group+1;
run;
data want;
 set temp;
 by group;
 retain temp;
 if first.group then temp=x;
  else flg=ifn(x>temp,1,0);
drop temp group;
run;
Occasional Contributor
Posts: 12

Re: flagging for max and min values within a variable

Thanks Ksharp..
Valued Guide
Posts: 797

Re: flagging for max and min values within a variable

 

Your topic title says "max", but your description suggests you want to flag every observation with X greater than the group starting value.

 

This first finds only the max.  It also generalizes a little bit from yours, in that it allows for any number of incoming "flg=." records after a group starts with "flg=0":

 

data mx_mn_obs;
input x  flg @@;
cards;
1  0      2   .      0   .      4  0      3   .      5   .      6  0      4  .      7  .
;
run;


data want (drop=_:);
  do _I=1 by 1 until (last.flg=1 and flg=.);
    set mx_mn_obs;
    by flg notsorted;
    _xmax=max(x,_xmax);
  end;
  do _J=1 to _I ;
    set mx_mn_obs;
    if flg=. and x=_xmax then flg=1;
    else flg=0;
    output;
  end;
run;

 

 

Notes:

  • The first loop reads 1 group-of-interest, but in this case it is two BY-groups (or actually any number of FLG values until a group with flg=. is completed).  That provides a simple way to generate the max value.
  • The second loop rereads the data and flags the record with maximum value of X.

 

But if you don't actually need the max, but need to flag only records that have X greater than the starting value, then this simpler program works:

 

data want2;
  set mx_mn_obs;
  retain reference_val;
  if flg=0 then reference_val=x;
  if x>reference_val then flg=1;
  else flg=0;
  drop reference_val;
run;
Occasional Contributor
Posts: 12

Re: flagging for max and min values within a variable

Thanks mkeintz... yes i really don't need max values i just wanted to compare if it is greater than the first value. the second program is very simple.
PROC Star
Posts: 1,565

Re: flagging for max and min values within a variable

[ Edited ]

Like this?


data WANT;
  call missing(MAX);
  do I=1 to 3;
    set HAVE ;
    MAX=MAX(MAX,X);
  end;                        
  do I=1 to 3;  
    set HAVE ;
    FLG=(X=MAX);
    output;
  end;
run;

X  FLG

1    0
2    1
0    0
4    0
3    0
5    1
6    0
4    0
7    1

Occasional Contributor
Posts: 12

Re: flagging for max and min values within a variable

Thanks a lot chris.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 187 views
  • 6 likes
  • 6 in conversation