Solved: Top value in output

Kamal5522 · Posted 06-19-2019 09:18 PM

Hi SAS Community
I want only top value row in output of below dataset
Could you please what coding would be used ?
Name Marks
Rao 50
Tom 60
Matt 90

Output Required - Top Marks
Matt 90

heffo · Posted 06-19-2019 09:48 PM

If you only want one row then these two version could be ok.

data have;
	length name $ 16 marks 8;
	input Name Marks;
datalines;
Rao 50
Tom 60
Matt 90
;;;
run;

proc sql noprint;
	create table want as 
	select name, marks 
	from have
	having marks = max(marks);
quit;

or

proc sort data=have out=have_sorted;
	by name descending marks;
run;

data want;
	set have_sorted (obs=1);
run;

View solution in original post

Kamal5522 · Posted 06-19-2019 09:20 PM

I want to achieve thebdesired output using first. and last. concept

hashman · Posted 06-21-2019 03:06 AM

@Kamal5522:

Within the context of your problem, using FIRST.x/LAST.x makes sense only if you want it as a method of resolving ties - in case different persons have the same highest mark. However, doing so requires to sort the file, which makes no sense for a linear, single-pass problem. Instead, it can be done simply by:

data want (drop = _:) ;                           
  do until (z) ;                                  
    set have (rename=(name=_n marks=_m)) end = z ;
    if _m LE marks then continue ;                
    name  = _n ;                                  
    marks = _m ;                                  
  end ;                                           
run ;

In the case there are ties by MARKS, it will choose the first physical record with the highest mark. If you want the last instead, use LT instead of LE in the IF statement.

Also note that using SQL for this type of problem wastes computer resources, as it has to remerge the aggregate with the original data and hence needs two passes through the input data set. As a result, it runs 5 times slower than the DATA step above and consumes twice the memory.

Kind regards

Paul D.

Kamal5522 · Posted 06-19-2019 09:20 PM

* the desired

heffo · Posted 06-19-2019 09:48 PM

If you only want one row then these two version could be ok.

data have;
	length name $ 16 marks 8;
	input Name Marks;
datalines;
Rao 50
Tom 60
Matt 90
;;;
run;

proc sql noprint;
	create table want as 
	select name, marks 
	from have
	having marks = max(marks);
quit;

or

proc sort data=have out=have_sorted;
	by name descending marks;
run;

data want;
	set have_sorted (obs=1);
run;

Kamal5522 · Posted 06-19-2019 09:55 PM

Hi Heffo
Thanks for your suggestion
Is it possible to get the desired output by using first dot and last dot concept

heffo · Posted 06-19-2019 10:10 PM

Yes, but it makes no sense if you only want one single row.

proc sort data=have out=have_sorted;
	by name descending marks;
run;

data want;
	set have_sorted;
	by name;
	if first.name then output; *or just if first.name;;
run;

This code will give you the top value for each person. So, it is only logical if your data set contains multiple marks for each individual.

Also, doing it with the first.name you read all rows of the data in the data step, so potentially a bit slower if you have a big data set. I would say that it is bad form to do it that way, unless you have a specific reason to do it. 🙂

Top value in output

Re: Top value in output

Re: Top value in output

Re: Top value in output

Re: Top value in output

Re: Top value in output

Re: Top value in output

Re: Top value in output

Registration is open