Fluorite | Level 6

## How to get minimum value for particular column group by with other column using data step

How to get minimum value for particular column by another column only using data step. I know how to get this using proc sql, proc means but i would like to use data step. How to get it. I have researched about it but didn't get clarity.
7 REPLIES 7
Diamond | Level 26

## Re: How to get minimum value for particular column group by with other column using data step

We would need some sample data, presented as working SAS data step code.

I am also curious why you need to do this in a DATA step, when the other solutions (PROC MEANS, PROC SQL) give correct answers with minimum effort.

--
Paige Miller
Fluorite | Level 6

## Re: How to get minimum value for particular column group by with other column using data step

We can use any of the data present in sas help library. Yes, it is easy with proc means and sql but would like try with data step also.
Diamond | Level 26

## Re: How to get minimum value for particular column group by with other column using data step

@SASCODERS wrote:
... but would like try with data step also.

Why? What is the benefit?

What have you tried so far?

--
Paige Miller
Fluorite | Level 6

## Re: How to get minimum value for particular column group by with other column using data step

I have tried
data out;
set data;
by colname
min_1 = min(colname);
run;

but it is not giving the min value of column.
Super User

## Re: How to get minimum value for particular column group by with other column using data step

MIN() works on a row, and takes the minimum for a series of variables. It does not answer the question you've asked.

I've illustrated how it would work in a data step, which also has problems as @PaigeMiller has indicated due to not accounting for missing values.

If you want the value added back into a data step to use in other operations, here are different ways to do that.

``````******************************************************;
*Add average value to a dataset;
*Solution 1 - PROC MEANS + Data step;
******************************************************;

proc means data=sashelp.class noprint;
output out=avg_values mean(height)=avg_height;
run;

data class_data;
set sashelp.class;

if _n_=1 then
set avg_values;
run;

proc print data=class;
run;

*Solution 2 - PROC SQL - note the warning in the log;
PROC SQL;
Create table class_sql as
select *, mean(height) as avg_height
from sashelp.class;
quit;

******************************************************;
*Add average value to a dataset - with grouping variables;
*Solution 1 - PROC MEANS + Data step;
******************************************************;
proc means data=sashelp.class noprint nway;
class sex;
output out=avg_values mean(height)=avg_height;
run;

*sort data before merge;
proc sort data=sashelp.class out=class;
by sex;
run;

data class_data;
merge class avg_values;
by sex;

run;

proc print data=class_data;
run;

*Solution 2 - PROC SQL - note the warning in the log;
PROC SQL;
Create table class_sql as
select *, mean(height) as avg_height
from sashelp.class
group by sex;
quit;
Footer
Super User

## Re: How to get minimum value for particular column group by with other column using data step

``````proc sort data  = sashelp.class out=class;
by sex age;
run;

data want;
set class;
if first.sex;
keep sex age;
run;``````
``````proc sort data=sashelp.class out=class;
by sex;
run;

data want;
set class;
by sex;

if first.sex then min_age = age;
if age < min_age then min_age = age;
if last.sex then output;
keep sex min_age;
run;``````

Two 'data step' methods.

@SASCODERS wrote:
How to get minimum value for particular column by another column only using data step. I know how to get this using proc sql, proc means but i would like to use data step. How to get it. I have researched about it but didn't get clarity.

Diamond | Level 26

## Re: How to get minimum value for particular column group by with other column using data step

Sorry, @Reeza but this is one of the reasons why I try to have people AVOID writing their own code to do things that SAS has already programmed for us users. Your code works on data that does not have missing values. If the data has missing values, it gives a missing value for the minimum, which I contend is the wrong answer.

I have seen this happen at my job, where someone writes their own version of calculating a statistic, they get it wrong, but they don't realize it, and then this wrong code is used in production operations, and now this incorrect answer has real consequences.

--
Paige Miller
Discussion stats
• 7 replies
• 746 views
• 0 likes
• 3 in conversation