DATA Step, Macro, Functions and more

how to remove the data which satisfy some critical conditions

Accepted Solution Solved
Reply
Contributor
Posts: 25
Accepted Solution

how to remove the data which satisfy some critical conditions

Hi SAS experts,

I have a data in which there are many students and severals exams. If one student does not anwer all the questions, how do I eliminate him? For example:

data a;
input id$ answerkey$ response$ responsestatus$;
cards;
1 A B answered
1 C C answered
1 D D answered
1 B B answered
2 A B answered
2 C unanswered
2 D D answered
2 B B answered
3 A c answered
3 C C answered
3 D unanswered
3 B B answered
4 A C answered
4 C C answered
4 D C answered
4 B B answered
5 A unanswered
5 C unanswered
5 D unanswered
5 B unanswered
6 A B answered
6 C C answered
6 D c answered
6 B B answered
;
run;

 

How to get rid of id=5?

 

Thanks a lot!


Accepted Solutions
Solution
‎06-22-2017 10:36 AM
Contributor
Posts: 22

Re: how to remove the data which satisfy some critical conditions

First off, the data set as written won't work as intended. The missing values for 'response' cause the variables to be read in wrong. There are many ways to fix this. I just manually added spaces to align the variables and updated the input statement.

 

data a;
input id$ answerkey$ response$ 5 responsestatus$ 7-16;
cards;
1 A B answered
1 C C answered
1 D D answered
1 B B answered
2 A B answered
2 C   unanswered
2 D D answered
2 B B answered
3 A C answered
3 C C answered
3 D   unanswered
3 B B answered
4 A C answered
4 C C answered
4 D C answered
4 B B answered
5 A   unanswered
5 C   unanswered
5 D   unanswered
5 B   unanswered
6 A B answered
6 C C answered
6 D c answered
6 B B answered
;
run;

*Creates a running total for the number of answers per ID then only outputs those with 4 answered questions and not ID of 5;
data b (drop=answer_flag answer_count);
set a;
by id;
if first.id then answer_count=0;

answer_flag=(responsestatus="answered"); 
answer_count+answer_flag;

if last.id then do;
  if answer_count = 4 and id ^= "5" then output;
  end;
run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,392

Re: how to remove the data which satisfy some critical conditions

[ Edited ]

Something like:

proc sql;
  delete from A where RESPONSE_STATUS="unanswered";
quit;

 

Contributor
Posts: 25

Re: how to remove the data which satisfy some critical conditions

As I understand, if I use the code you provided, it will eliminate  id=3 question3 since it is unanswered? 

Super User
Super User
Posts: 7,392

Re: how to remove the data which satisfy some critical conditions

[ Edited ]

Ah, just reverse the logic then:

proc sql;
  create table TMP as select * from A where RESPONSE_STATUS="answered";
  delete from A where ID not in (select ID from TMP);
quit;

Ie if there is not an answered for that ID delete it. 

New Contributor
Posts: 3

Re: how to remove the data which satisfy some critical conditions

[ Edited ]

This depends somewhat on how your data are formatted.  If the "answered/unanswered" variables are stored as 0/1, it could be just as  easy as summing the responses and using a WHERE condition:

 

TotalResp = A + B + C + D + E

... 

WHERE TotalResp = 5

Otherwise, you could compute the number of responses missing this way using the missing function. Then, exclude those for which all are blank:

Total_Blank_Resp = missing( A ) + missing( B ) + missing( C ) + missing( D ) + missing( E )

...

WHERE Total_Blank_Resp = 5

 

Super User
Posts: 9,671

Re: how to remove the data which satisfy some critical conditions

data a;
input id$ answerkey$ response$ responsestatus : $20.;
cards;
1 A B answered
1 C C answered
1 D D answered
1 B B answered
2 A B answered
2 C . unanswered
2 D D answered
2 B B answered
3 A c answered
3 C C answered
3 D . unanswered
3 B B answered
4 A C answered
4 C C answered
4 D C answered
4 B B answered
5 A . unanswered
5 C . unanswered
5 D . unanswered
5 B . unanswered
6 A B answered
6 C C answered
6 D c answered
6 B B answered
;
run;
proc sql;
select *
 from a 
  group by id 
   having sum(responsestatus='unanswered') ne count(*);
quit;
Contributor
Posts: 25

Re: how to remove the data which satisfy some critical conditions

Ksharp,

It works! Thank you very much!

Solution
‎06-22-2017 10:36 AM
Contributor
Posts: 22

Re: how to remove the data which satisfy some critical conditions

First off, the data set as written won't work as intended. The missing values for 'response' cause the variables to be read in wrong. There are many ways to fix this. I just manually added spaces to align the variables and updated the input statement.

 

data a;
input id$ answerkey$ response$ 5 responsestatus$ 7-16;
cards;
1 A B answered
1 C C answered
1 D D answered
1 B B answered
2 A B answered
2 C   unanswered
2 D D answered
2 B B answered
3 A C answered
3 C C answered
3 D   unanswered
3 B B answered
4 A C answered
4 C C answered
4 D C answered
4 B B answered
5 A   unanswered
5 C   unanswered
5 D   unanswered
5 B   unanswered
6 A B answered
6 C C answered
6 D c answered
6 B B answered
;
run;

*Creates a running total for the number of answers per ID then only outputs those with 4 answered questions and not ID of 5;
data b (drop=answer_flag answer_count);
set a;
by id;
if first.id then answer_count=0;

answer_flag=(responsestatus="answered"); 
answer_count+answer_flag;

if last.id then do;
  if answer_count = 4 and id ^= "5" then output;
  end;
run;
Contributor
Posts: 25

Re: how to remove the data which satisfy some critical conditions

Thank all of you for the help!
☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 124 views
  • 2 likes
  • 5 in conversation