DATA Step, Macro, Functions and more

solve the query using last. and first. variable concept in data step

Reply
Frequent Contributor
Posts: 76

solve the query using last. and first. variable concept in data step

I have a data set like

data test;

input name$ roll marks;

cards;

A  1  200

A  3  800

A  5  400

A  6  100

B  7   300

B  2  800

B  4  700

B  6  900

B  1  100

C  2  300

C  3  100

C  4  600

C  5  800

;

RUN;

I have to write query as per below condition using last. and first. variable

1.2nd highest mark of each student;

2.highest mark of each student.

Kindly help to resolve these query.

Regards,

Ashwini

Respected Advisor
Posts: 4,920

Re: solve the query using last. and first. variable concept in data step

Assuming you can't simply sort by name and mark descending (and keep the first two obs in each by group), you only need last.name :

data test;
input name$ roll mark;
datalines;
A  1  200
A  3  800
A  5  400
A  6  100
B  7   300
B  2  800
B  4  700
B  6  900
B  1  100
C  2  300
C  3  100
C  4  600
C  5  800
;

data want(keep=name highMark highRoll secondHighMark secondHighRoll);
do until(last.name);
set test;
by name;
if mark >= highMark then do;
  secondHighMark = highMark;
  secondHighRoll = highRoll;
  highMark = mark;
  highRoll = roll;
end;
else if mark > secondHighMark then do;
  secondHighMark = mark;
  secondHighRoll = roll;
end;
end;
run;

proc print; run;

PG

PG
N/A
Posts: 1

Re: solve the query using last. and first. variable concept in data step

Hi Ashwini, If i understand correctly, you want to find out second high score and high score of each student using First.Obs and last.Obs concept. For this, I have sorted the data by name and score in ascending order (the default working of proc sort and then used it with SET statement with BY keyword. I have made use of LAG function to accomplish the task. I hope the below program will address your query and if required with little changes. But below mentioned may not be close to ansewer but yes it may help you.

data test;

input name$ roll mark;

datalines;

A  1  200

A  3  800

A  5  400

A  6  100

B  7   300

B  2  800

B  4  700

B  6  900

B  1  100

C  2  300

C  3  100

C  4  600

C  5  800

;

run;

proc sort data = test out = test;

by name, marks;

run;

data _null_;

set test;

by name;

if last.name ;

secondhigh = lag(score);

high = score;

out put;

run;

proc print data = test;

run;

Respected Advisor
Posts: 3,799

Re: solve the query using last. and first. variable concept in data step

proc summary nway data=test;

   class name;

   output out=one2(drop=_Smiley Happy

      idgroup(max(marks) out[2](marks)=);

   run;

Respected Advisor
Posts: 4,920

Re: solve the query using last. and first. variable concept in data step

Posted in reply to data_null__

And just to show that the datastep is far from the only (or best) tool to do this :

data test;
input name$ roll mark;
datalines;
A  1  200
A  3  800
A  5  400
A  6  100
B  7  300
B  2  800
B  4  700
B  6  900
B  1  100
C  2  300
C  3  100
C  4  600
C  5  800
D  2  300
D  3  100
D  4  800
D  5  800
;

proc rank data=test descending ties=low out=rTest(where=(rank<3));
by name;
var mark;
ranks rank;
run;


proc print;run;

PG

PG
Respected Advisor
Posts: 4,920

Re: solve the query using last. and first. variable concept in data step

If you are allowed to presort then the logic is even simpler. The case of tied high marks, as for name=D below,  is not clear. If the second highest mark must be lower than the highest then uncomment nodupkey in the code :

data test;
input name$ roll mark;
datalines;
A  1  200
A  3  800
A  5  400
A  6  100
B  7   300
B  2  800
B  4  700
B  6  900
B  1  100
C  2  300
C  3  100
C  4  600
C  5  800
D  2  300
D  3  100
D  4  800
D  5  800
;

proc sort data=test /*nodupkey*/; by name descending mark; run;

data want(keep=name high: secondSmiley Happy;
do until(last.name);
set test;
by name;
if first.name then do;
  highMark = mark;
  highRoll = roll;
end;
else if missing(secondHighMark) then do;
  secondHighMark = mark;
  secondHighRoll = roll;
end;
end;
run;

proc print;run;

PG

PG
Super User
Posts: 10,023

Re: solve the query using last. and first. variable concept in data step

Your don't post what output you need.

data test;
input name$ roll mark;
datalines;
A  1  200
A  3  800
A  5  400
A  6  100
B  7   300
B  2  800
B  4  700
B  6  900
B  1  100
C  2  300
C  3  100
C  4  600
C  5  800
D  2  300
D  3  100
D  4  800
D  5  800
;
run;
 

proc sort data=test nodupkey; by name descending mark; run;
data want;
 set test;
 by name;
 if first.name then count=0;
 count+1;
 if count in (1 2);
run;


Ksharp

Valued Guide
Posts: 765

Re: solve the query using last. and first. variable concept in data step

hi ... prefer the data _null_ solution, but here a last.var solution ...

just make sure the array dimension is as large as the largest group size ...

data top_two (keep=name max two);

array x(10);

do j=1 by 1 until(last.name);

   set test;

   by name;

   x(j) = mark;

end;

max = max(of x(*));

two = largest(2,of x(*));

run;

Ask a Question
Discussion stats
  • 7 replies
  • 431 views
  • 0 likes
  • 6 in conversation