BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hjjijkkl
Pyrite | Level 9

I have a big data and I want to select specific records where the identified ID have a test result <=60 . I want to exclude if the subject "ID" has test result >60 during any time period from the analysis 

ID

date

test

1

5/21/2019

43

1

8/15/2019

12

1

10/16/2020

20

2

8/4/2019

10

2

12/2/1019

70

3

5/21/2019

7

3

8/15/2019

60

4

1/6/1019

21

4

5/23/2019

33

4

8/5/2019

57

 

I want an output with ID that only have <=60

ID

1

3

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

But if your data is already sorted by ID, a datastep should be faster:

 

data want;
do until(last.ID);
	set have; by ID;
	t = max(t, test);
	end;
if t <= 60;
keep ID;
run;
PG

View solution in original post

6 REPLIES 6
PGStats
Opal | Level 21

The simplest way might be to use SQL:

 

proc sql;
create table want as
select ID
from have
group by ID
having max(test) <= 60;
quit;
PG
PGStats
Opal | Level 21

But if your data is already sorted by ID, a datastep should be faster:

 

data want;
do until(last.ID);
	set have; by ID;
	t = max(t, test);
	end;
if t <= 60;
keep ID;
run;
PG
hjjijkkl
Pyrite | Level 9

I have a big data and I want to select specific records where the identified ID have a test result <=60 . I want to exclude if the subject "ID" has test result >60 during any time period from the analysis 

ID

date

test

1

5/21/2019

43

1

8/15/2019

12

1

10/16/2020

20

2

8/4/2019

10

2

12/2/1019

70

3

5/21/2019

7

3

8/15/2019

60

4

1/6/1019

21

4

5/23/2019

33

4

8/5/2019

57

 

I want an output with ID that only have <=60

ID

ID1

1

1

2

 

3

3

4

 

Ksharp
Super User
data have;
input ID
date : $10.
test ;
cards;
1
5/21/2019
43
1
8/15/2019
12
1
10/16/2020
20
2
8/4/2019
10
2
12/2/1019
70
3
5/21/2019
7
3
8/15/2019
60
4
1/6/1019
21
4
5/23/2019
33
4
8/5/2019
57
;

proc sql;
create table want as
select distinct id
from have
group by id
having sum(test>60)=0;
quit;
hjjijkkl
Pyrite | Level 9

How do I still keep the ID and create a new ID variable for those who have test <=60?

As in the table below,  want to keep the list of ID and have "ID1" as a new variable to represent the ones that have test <=60

I want an output to look like this

ID

ID1

1

1

2

 

3

3

4

 

Ksharp
Super User
data have;
input ID
date : $10.
test ;
cards;
1
5/21/2019
43
1
8/15/2019
12
1
10/16/2020
20
2
8/4/2019
10
2
12/2/1019
70
3
5/21/2019
7
3
8/15/2019
60
4
1/6/1019
21
4
5/23/2019
33
4
8/5/2019
57
;

proc sql;
create table want as
select a.id as id,b.id as id1 from
(select distinct id from have) as a natural left join
(select distinct id from have group by id having sum(test>60)=0) as b
;
quit;