I have a big data and I want subjects with test result <50 and to include subjects if the test result <150 one time only during their testing period. If the test result is <150 2x or more then the subject will not be need. For example, ID – 4 has 1st test <50 but 2 test results <150 so the subject will not be need. How can I get those that have test<50 or test<150 one time only?
ID |
test |
1 |
50 |
1 |
45 |
1 |
78 |
2 |
34 |
2 |
26 |
3 |
67 |
4 |
22 |
4 |
120 |
4 |
69 |
This is how the output would look like
ID |
test |
1 |
50 |
1 |
45 |
1 |
78 |
2 |
34 |
2 |
26 |
3 |
67 |
The following code may be helpful. It generates the result you want.
proc sql;
create table WANT as
select *
from HAVE
where id not in(
select unique(id)
from HAVE
where test>50
group by ID
having count(*) > 1
);
quit;
You need to read the obs for each id twice:
BTW, given that you are keeping ID 1 (values 50, 45, 78) I presume you meant <=50 (not <50), since you would other have two values that are not <50. I also changed to <=150 (not <150).
data have;
input ID test;
datalines;
1 50
1 45
1 78
2 34
2 26
3 67
4 22
4 120
4 69
run;
data want (drop=_:);
do until (last.id);
set have;
by id;
if (51<=test<=150) then _n_51to150=sum(_n_51to150,1); else
if test>150 then _n_over150=sum(_n_over150,1);
end;
do until (last.id);
set have;
by id;
if (_n_over150=.) and (_n_51to150<=1) then output;
end;
run;
@hjjijkkl wrote:
I have a big data and I want subjects with test result <50 and to include subjects if the test result <150 one time only during their testing period. If the test result is <150 2x or more then the subject will not be need. For example, ID – 4 has 1st test <50 but 2 test results <150 so the subject will not be need. How can I get those that have test<50 or test<150 one time only?
ID
test
1
50
1
45
1
78
2
34
2
26
3
67
4
22
4
120
4
69
This is how the output would look like
ID
test
1
50
1
45
1
78
2
34
2
26
3
67
Sorry, but the logic you want doesn't make any sense at all. If a value is smaller than 50 it is smaller than 150, too.
Are there more variables in the dataset or just "ID" and "test"?
The following code may be helpful. It generates the result you want.
proc sql;
create table WANT as
select *
from HAVE
where id not in(
select unique(id)
from HAVE
where test>50
group by ID
having count(*) > 1
);
quit;
This solution assumes that values over 150 are not possible.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.