BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GeethaMN
Obsidian | Level 7
data mx_mn_obs;
input x  flg;
cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;

q) the requirement is need to compare every 2 and 3 record with first record 
where flg = 0, if the value is max than every first record then flg should be 1
if it is min then flg = 0. could anyone help me

means output should be:
1 0
2 1
0 0
4 0
3 0
5 1
6 0
4 0
7 1
SAAAS
1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Here's an interpretation of what you are looking for:

 

data want;

do _n_=1 to 3;

   set have;

   if flg=0 then baseline_x = x;

   else flg = (x > baseline_x);

   output;

end;

drop baseline_x;

run;

 

Note that this is very much dependent on the data being in the right order.  Also note, it is possible that two observations within the same group of 3 could both end up with flg=1.  Or all 3 could end up with flg=0.  It all depends on the values for X:  Is X greater than or less than the X value on the first observation in the group.

View solution in original post

12 REPLIES 12
Reeza
Super User

I suspect this may not scale well, depending on how complex your actual data is.

A DoW loop is probably more efficient, but I'm not very good at those. If efficiency becomes an issue, it's something you may want to look into. This is a more 'long' way to calculate the results.

 

1. Create groups to identify the groups you want to find the min/max for.

2. Find min/max for each group using proc means.

3. Merge data from #2 with #1

4. Check if data matches max and flag accordingly. 

 

Your logic appears to simplify to if x=Max(x) for a specified group the flag =1, 0 otherwise.

 

data mx_mn_obs;
	input x flg;
	cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;

data groups;
	set mx_mn_obs;
	retain group;

	if flg=0 then
		group+1;
run;

proc means data=groups noprint;
	;
	by group;
	var x;
	output out=max_min_group min=min_x max=max_x;
run;

data merged;
	merge groups max_min_group;
	by group;

	if x=max_x then
		flg=1;
	else
		flg=0;
	*keep x flg;
run;
GeethaMN
Obsidian | Level 7

Thanks Reeza for taking out your time to answer but it is not giving expected output... yes do loop along with lag functions actually works but am end up getting errors.

But this is the requirement consider any values for a variable need to compare  every 2nd and 3rd  with 1st record and 5th and 6th with 4th record ... like that it goes on ... if the value is max then flag it with 1 else flag it with 0.

 

10

5     0

15   1

25

20   0

30   1

SAAAS
Reeza
Super User

@GeethaMN wrote:

Thanks Reeza for taking out your time to answer but it is not giving expected output... yes do loop along with lag functions actually works but am end up getting errors.

But this is the requirement consider any values for a variable need to compare  every 2nd and 3rd  with 1st record and 5th and 6th with 4th record ... like that it goes on ... if the value is max then flag it with 1 else flag it with 0.

 

10

5     0

15   1

25

20   0

30   1


Since SAS loops automatically in a data step there is no DO loop requirement for this data. 

I'm not sure what you mean about it errors out. As far as I can see it's a long method but still gives the correct results. I'm not sure what I'm supposed to take from the data above either.

Astounding
PROC Star

Here's an interpretation of what you are looking for:

 

data want;

do _n_=1 to 3;

   set have;

   if flg=0 then baseline_x = x;

   else flg = (x > baseline_x);

   output;

end;

drop baseline_x;

run;

 

Note that this is very much dependent on the data being in the right order.  Also note, it is possible that two observations within the same group of 3 could both end up with flg=1.  Or all 3 could end up with flg=0.  It all depends on the values for X:  Is X greater than or less than the X value on the first observation in the group.

GeethaMN
Obsidian | Level 7
Thanks Astounding.. could you please explain the logic at the step else flag = (x>base); without mentioning flag = 1 how it is giving values as 1 and 0.. so does (x>base) behaving like a Boolean operator? You solved it in very simple way.
SAAAS
Astounding
PROC Star

To answer your follow-up question:

 

flag = (x > base);

 

SAS evaluates x > base.  If true, it returns 1.  If false, it returns 0.  So FLAG receives a value of 1 or 0, depending on whether the condition being evaluated is true or false.

Ksharp
Super User
data mx_mn_obs;
input x  flg;
cards;
1  0
2   .
0   .
4  0
3   .
5   .
6  0
4  .
7  .
;
run;
data temp;
 set mx_mn_obs;
 if not missing(flg) then group+1;
run;
data want;
 set temp;
 by group;
 retain temp;
 if first.group then temp=x;
  else flg=ifn(x>temp,1,0);
drop temp group;
run;
GeethaMN
Obsidian | Level 7
Thanks Ksharp..
SAAAS
mkeintz
PROC Star

 

Your topic title says "max", but your description suggests you want to flag every observation with X greater than the group starting value.

 

This first finds only the max.  It also generalizes a little bit from yours, in that it allows for any number of incoming "flg=." records after a group starts with "flg=0":

 

data mx_mn_obs;
input x  flg @@;
cards;
1  0      2   .      0   .      4  0      3   .      5   .      6  0      4  .      7  .
;
run;


data want (drop=_:);
  do _I=1 by 1 until (last.flg=1 and flg=.);
    set mx_mn_obs;
    by flg notsorted;
    _xmax=max(x,_xmax);
  end;
  do _J=1 to _I ;
    set mx_mn_obs;
    if flg=. and x=_xmax then flg=1;
    else flg=0;
    output;
  end;
run;

 

 

Notes:

  • The first loop reads 1 group-of-interest, but in this case it is two BY-groups (or actually any number of FLG values until a group with flg=. is completed).  That provides a simple way to generate the max value.
  • The second loop rereads the data and flags the record with maximum value of X.

 

But if you don't actually need the max, but need to flag only records that have X greater than the starting value, then this simpler program works:

 

data want2;
  set mx_mn_obs;
  retain reference_val;
  if flg=0 then reference_val=x;
  if x>reference_val then flg=1;
  else flg=0;
  drop reference_val;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
GeethaMN
Obsidian | Level 7
Thanks mkeintz... yes i really don't need max values i just wanted to compare if it is greater than the first value. the second program is very simple.
SAAAS
ChrisNZ
Tourmaline | Level 20

Like this?


data WANT;
  call missing(MAX);
  do I=1 to 3;
    set HAVE ;
    MAX=MAX(MAX,X);
  end;                        
  do I=1 to 3;  
    set HAVE ;
    FLG=(X=MAX);
    output;
  end;
run;

X  FLG

1    0
2    1
0    0
4    0
3    0
5    1
6    0
4    0
7    1

GeethaMN
Obsidian | Level 7
Thanks a lot chris.
SAAAS

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 3021 views
  • 6 likes
  • 6 in conversation