DATA Step, Macro, Functions and more

Sorting by multiple ID types

Senior User
Posts: 1

Hello

 

I currently have a very large data set, which contains multiple records per person.  I need to keep these records separate, as each row contains different information.  Each person has an ID number which is duplicated for each of their records.

 

One variable within this data set contains a number of psychological conditions that are coded as three letters.  I want to delete all people who don't have a particular psychological condition in one of their ten records.  Here's an example of my data:

 

ID_NUMBER  PSYCH_COND
1          ANX
1          MDD
1          SKA
2          ANX
2          SKA
2          GAD
3          MDD
3          BPD
3          DDP
4          ANX
4          SKA
4          BPD

 

So say I wanted to keep just the people (ID numbers) who have ANX, and delete everyone who didn't record ANX in any of their PSYCH_COND rows.

 

Does anybody have any ideas as to how I would go about this?

 

Thanks in advance!

 

 


Accepted Solutions
Solution
‎02-25-2016 04:53 PM
Valued Guide
Posts: 858

Re: Sorting by multiple ID types

[ Edited ]

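If you want to keep all the records for anyone who has an ANX record, this will work:

data have;
  input ID_NUMBER PSYCH_COND $;
  cards;
1          ANX
1          MDD
1          SKA
2          ANX
2          SKA
2          GAD
3          MDD
3          BPD
3          DDP
4          ANX
4          SKA
4          BPD
;
run;

data want;
  /* First pass over the BY group: flag the person if any record is ANX */
  do until(last.id_number);
    set have;
    by id_number;
    if psych_cond = 'ANX' then keep = 1;
  end;
  /* Second pass over the same BY group: output every record of a flagged person */
  do until(last.id_number);
    set have;
    by id_number;
    if keep = 1 then output;
  end;

  /* KEEP is only a working flag; it is reset to missing for each new BY group */
  drop keep;
run;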
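
The two DO UNTIL loops depend on BY-group processing, so HAVE needs to be in id_number order before the step runs. The sample data already is, but a large real data set may not be. A minimal sketch of the sort, plus an optional check meant to be run after the DATA WANT step above (the check is an illustration, not part of the original answer):

```sas
/* Sort so that BY id_number (and last.id_number) behaves as expected */
proc sort data=have;
  by id_number;
run;

/* ... run the DATA WANT step from the post here ... */

/* Optional check: list the IDs that survived the filter.
   With the sample data this should show IDs 1, 2 and 4. */
proc sql;
  select distinct id_number
    from want;
quit;
```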



All Replies
Super Contributor
Posts: 312

Re: Sorting by multiple ID types

[ Edited ]

You can use an IF statement to delete the records where PSYCH_COND is not "ANX" (this keeps only the ANX rows themselves):

data want;
set have;
if PSYCH_COND ~= "ANX" then delete;
run;

 

Or use a WHERE statement or clause to subset the data set (in a DATA step or PROC SQL):

data want;
set have;
where PSYCH_COND = "ANX";
run;

or

proc sql;
create table want as
select * 
    from have
        where PSYCH_COND = "ANX";
quit;
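
The filters above keep only the rows where PSYCH_COND is ANX. If the goal is every record for each person who has at least one ANX row, one option is to build the list of matching IDs first and merge it back. This is only a sketch: HAVE_SORTED and ANX_IDS are illustrative names, and HAVE is assumed to look like the sample in the question.

```sas
/* Work on a sorted copy so the original HAVE is left untouched */
proc sort data=have out=have_sorted;
  by id_number;
run;

/* One row per person with at least one ANX record (ANX_IDS is a hypothetical name) */
proc sort data=have_sorted(where=(psych_cond = 'ANX'))
          out=anx_ids(keep=id_number) nodupkey;
  by id_number;
run;

/* Keep every record of the people found in ANX_IDS */
data want;
  merge have_sorted(in=in_have) anx_ids(in=in_anx);
  by id_number;
  if in_have and in_anx;
run;
```

Because ANX_IDS has one row per ID, this is a one-to-many merge and every record for a matching person is retained.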
Super User
Posts: 5,256

Re: Sorting by multiple ID types

I think an SQL query with a subquery would do the job:
Proc sql;
Create table want as
Select *
From have
Where id_number in (select distinct id_number from have where psych_cond = 'ANX')
;
Quit;
Super User
Posts: 9,681

Re: Sorting by multiple ID types

Proc sql;
Create table want as
Select *
From have
group by id_number
having sum(psych_cond = 'ANX') > 0
;
Quit;

Occasional Contributor
Posts: 17

Re: Sorting by multiple ID types

data psych;
     set psych;
     where psych_cond="ANX";
run;

 

☑ This topic is SOLVED.
