I have a dataset looks as the following:
Name Score
A 766
A 9
A 93
A 869
A 143
B 8
B 119
B 362
B 6
B 95
C 524
C 99
C 123
C 176
C 210
D 8
D 0
D 31
D 123
D 12
I need to set the score value to do the following:
1. if the score value less than 10, set the score to missing
2. if there is only one score less than 10 for each name group, then set the second minimum score to missing. (that's where I am stuck. )
for example, the above table:
for Name A, 9 and 93 should set to missing;
for Name B, 8 and 6 should set to missing ...
BTW, the number of each name is always 5.
If the order matters you'll want to add an indicator to keep the row number and then resort the data after, if it doesn't this works:
data have;
input Name $ Score;
cards;
A 766
A 9
A 93
A 869
A 143
B 8
B 119
B 362
B 6
B 95
C 524
C 99
C 123
C 176
C 210
D 8
D 0
D 31
D 123
D 12
;
run;
proc sort data=have;
by name score;
run;
data want;
set have;
by name;
if first.name then less10=0;
if score<10 then do;
score=.;
less10+1;
end;
else if score>=10 and less10=1 then do;
score=.;
less10+1;
end;
run;
Thanks. but i don't think this will work.
"else if score>=10 and less10=1 then do"
How can less10 be 1?
Try it.
Less10 is a variable that is incremented every time you set a variable to missing in the first IF condition.
The second IF condition checks if the number set to missing is 1, though you may want to change that to <=1 incase you have no values less than 10.
Additionally, you haven't specified if its less than or equal to 10, so you should modify the code to consider the =10 situation.
This should work:
data have;
input Name $ Score;
cards;
A 766
A 9
A 93
A 869
A 143
B 8
B 119
B 362
B 6
B 95
C 524
C 99
C 123
C 176
C 210
D 8
D 0
D 31
D 123
D 12
;
data want;
array s(5);
do i = 1 to 5;
set have;
if score < 10 then call missing(s(i));
else s(i) = score;
end;
if nmiss(of s(*)) = 1 then do;
second = min(of s(*));
do i = 1 to 5;
if s(i) = second then call missing(s(i));
end;
end;
do i = 1 to 5;
score = s(i);
output;
end;
keep name score;
run;
proc print data=want noobs; run;
PG
data have;
input Name $ Score;
cards;
A 766
A 9
A 93
A 869
A 143
B 8
B 119
B 362
B 6
B 95
C 524
C 99
C 123
C 176
C 210
D 8
D 0
D 31
D 123
D 12
;
data want;
set have;
if score<10 then score=.;
run;
proc sql;
select a.name,case when score=min then . else score end as score from want a
left join
(select name,min(score) as min from
(select name,score,sum(score=.) as miss_count from want
group by name)
where miss_count=1 and score ne .
group by name) b
on a.name=b.name;
quit;
How about :
data have; input Name $ Score; cards; A 766 A 9 A 93 A 869 A 143 B 8 B 119 B 362 B 6 B 95 C 524 C 99 C 123 C 176 C 210 D 8 D 0 D 31 D 123 D 12 ; run; data want(drop= m n second); n=0;m=0; array x{999} _temporary_; do until(last.name); set have ; by name; n+1;x{n}=score; if score lt 10 then m+1; end; second=smallest(2,of x{*}); call missing(of x{*}); do until(last.name); set have ; by name; if score lt 10 then call missing(score); else if m eq 1 and score eq second then call missing(score); output; end; run;
Xia Keshan
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.