Solved: Re: Find max/min grouped by two data fields and replace the missing da...

fwu811 · Posted 12-25-2023 01:13 PM

Below is my input data

data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;

Now I would like to fill the Col3 data for B, based on the matched Col2 for A. If any Col2 data for B are missing from A, either the max or min of Col2 values from entire records for A will be used.

Hence, the output of B should be:

B 1 2

B 2 8 for max or 2 for min

B 4 8

B 5 8 for max or 2 for min

Many thanks for your help.

Ksharp · Posted 12-25-2023 08:32 PM

data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;
proc sql;
create table want as
select Col1,Col2,coalesce(
(select Col3 from have where col1='A' and col2=a.col2),
(select max(Col3) from have  where col1='A')
) as Col3
 from have as a;
quit;

View solution in original post

PaigeMiller · Posted 12-25-2023 01:59 PM

Find max/min grouped by data fields (I will call these fields FIELD1 and FIELD2)

proc summary data=have nway;
    class field1 field2;
    var yourvarname;
    output out=want min= max=/autoname;
run;

Replace missing data with them ... more explanation is needed. Missing data can be replaced by one value, but missing data cannot be replaced by "them" (which indicates to me replace the missing with both the max and the min).

In the future, please give a more complete explanation of the problem, rather than just the bare minimum that you have typed in this case.

--
Paige Miller

Tom · Posted 12-25-2023 02:02 PM

Is there a question here?

What is your input data?

What output do you want for that particular input?
What missing data do you want to replace? What are the rules for determining what to replace it with?

fwu811 · Posted 12-25-2023 02:50 PM

Many thanks for your quick reply.

I just added an example to my question.

Thank you again.

mkeintz · Posted 12-25-2023 03:40 PM

Thank you for providing a working DATA step with sample data. You haven't provided a rule for determining when to replace missing values of col3 with the max value vs the min, so this code always assigns the max.

data want (drop=_:);
  set have;
  if _n_=1 then do;
    declare hash h ();
      h.definekey('col2');
      h.definedata('col3');
      h.definedone();
  end;
  if col1='A' then do;
    h.add();
    _min=min(_min,col3);
    _max=max(_max,col3);
    retain _min _max;
  end;
  else if h.find() ^=0 then col3=_max;
run;

It assumes that all the COL='A' observations are at the beginning of the data set, and that all other obs need to use the data from 'A' to determine col3.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

fwu811 · Posted 12-26-2023 02:25 PM

Yes, and the code works.

Ksharp · Posted 12-25-2023 08:32 PM

data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;
proc sql;
create table want as
select Col1,Col2,coalesce(
(select Col3 from have where col1='A' and col2=a.col2),
(select max(Col3) from have  where col1='A')
) as Col3
 from have as a;
quit;

fwu811 · Posted 12-26-2023 02:24 PM

Thanks both, which all work for me.

Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Re: Find max/min grouped by two data fields and replace the missing data with them

Registration is open

SAS Training: Just a Click Away