BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
fwu811
Fluorite | Level 6

Below is my input data

data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;

Now I would like to fill the Col3 data for B, based on the matched Col2 for A. If any Col2 data for B are missing from A, either the max or min of Col2 values from entire records for A will be used.

Hence, the output of B should be:

  B  1  2

  B 2  8 for max or 2 for min

  B 4  8

  B 5  8 for max or 2 for min

 

Many thanks for your help.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;
proc sql;
create table want as
select Col1,Col2,coalesce(
(select Col3 from have where col1='A' and col2=a.col2),
(select max(Col3) from have  where col1='A')
) as Col3
 from have as a;
quit;

View solution in original post

7 REPLIES 7
PaigeMiller
Diamond | Level 26

Find max/min grouped by data fields (I will call these fields FIELD1 and FIELD2)

 

proc summary data=have nway;
    class field1 field2;
    var yourvarname;
    output out=want min= max=/autoname;
run;

 

Replace missing data with them ... more explanation is needed. Missing data can be replaced by one value, but missing data cannot be replaced by "them" (which indicates to me replace the missing with both the max and the min).

 

In the future, please give a more complete explanation of the problem, rather than just the bare minimum that you have typed in this case.

--
Paige Miller
Tom
Super User Tom
Super User

Is there a question here?

What is your input data?

What output do you want for that particular input?
What missing data do you want to replace? What are the rules for determining what to replace it with?

fwu811
Fluorite | Level 6
Many thanks for your quick reply.

I just added an example to my question.

Thank you again.
mkeintz
PROC Star

Thank you for providing a working DATA step with sample data.  You haven't provided a rule for determining when to replace missing values of col3 with the max value vs the min, so this code always assigns the max.

 

data want (drop=_:);
  set have;
  if _n_=1 then do;
    declare hash h ();
      h.definekey('col2');
      h.definedata('col3');
      h.definedone();
  end;
  if col1='A' then do;
    h.add();
    _min=min(_min,col3);
    _max=max(_max,col3);
    retain _min _max;
  end;
  else if h.find() ^=0 then col3=_max;
run;

It assumes that all the COL='A' observations are at the beginning of the data set, and that all other obs need to use the data from 'A' to determine col3.  

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Ksharp
Super User
data have;
   input Col1 $ Col2 Col3;
datalines;
A 1 2
A 3 4
A 4 8
A 6 5
B 1 .
B 2 .
B 4 .
B 5 .
;
proc sql;
create table want as
select Col1,Col2,coalesce(
(select Col3 from have where col1='A' and col2=a.col2),
(select max(Col3) from have  where col1='A')
) as Col3
 from have as a;
quit;
fwu811
Fluorite | Level 6

Thanks both, which all work for me.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1020 views
  • 2 likes
  • 5 in conversation