ijones
Calcite | Level 5

Hello

 

I currently have a very large data set, which contains multiple records per person.  I need to keep these records separate, as each row contains different information.  Each person has an ID number which is duplicated for each of their records.

 

One variable in this data set holds psychological conditions, each coded as three letters.  I want to delete every person who doesn't have a particular psychological condition in at least one of their ten records.  Here's an example of my data:

 

ID_NUMBER  PSYCH_COND
1          ANX
1          MDD
1          SKA
2          ANX
2          SKA
2          GAD
3          MDD
3          BPD
3          DDP
4          ANX
4          SKA
4          BPD

 

So say I wanted to keep only the people (ID numbers) who have ANX, and delete everyone who didn't record ANX in any of their PSYCH_COND rows.

 

Does anybody have any ideas as to how I would go about this?

 

Thanks in advance!

 

 

Accepted Solutions
Steelers_In_DC
Barite | Level 11

If you want to keep all the records for anyone who has an ANX record, this will work:

 

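data have;
    input ID_NUMBER PSYCH_COND $;
    cards;
1          ANX
1          MDD
1          SKA
2          ANX
2          SKA
2          GAD
3          MDD
3          BPD
3          DDP
4          ANX
4          SKA
4          BPD
;
run;

/* Double DO UNTIL loop: each data step iteration handles one ID_NUMBER group. */
/* KEEP is not retained, so it starts as missing for every new group.          */
data want;
    /* Pass 1: read all rows for the ID and flag the group if any row is ANX. */
    do until(last.id_number);
        set have;
        by id_number;
        if psych_cond = 'ANX' then keep = 1;
    end;
    /* Pass 2: re-read the same rows and output them only if the group was flagged. */
    do until(last.id_number);
        set have;
        by id_number;
        if keep = 1 then output;
    end;

    drop keep;
run;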
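
One caveat on the step above: the BY statement expects HAVE to be sorted by ID_NUMBER. The sample data already is, but a large real data set may not be. A minimal sketch of sorting first, assuming the same data set and variable names; HAVE_SORTED is just a hypothetical name for the sorted copy:

```sas
/* Sort into a new data set so the original HAVE is left untouched. */
/* HAVE_SORTED is a hypothetical name chosen for this example.      */
proc sort data=have out=have_sorted;
    by id_number;
run;
```

Both SET statements in the DATA WANT step would then read HAVE_SORTED instead of HAVE.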


Replies
fengyuwuzu
Pyrite | Level 9

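You can use an IF statement to delete the rows where PSYCH_COND ~= "ANX". Note that this keeps only the ANX rows themselves, not the other records for those IDs.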

data want;
set have;
if PSYCH_COND ~= "ANX" then delete;
run;

 

Or use a WHERE statement (in a data step) or a WHERE clause (in PROC SQL) to subset the data set:

data want;
set have;
where PSYCH_COND = "ANX";
run;

or

proc sql;
create table want as
select * 
    from have
        where PSYCH_COND = "ANX";
quit;
LinusH
Tourmaline | Level 20
I think an SQL query with a subquery would do the job:
Proc sql;
Create table want as
Select *
From have
Where id_number in (select distinct id_number from have where psych_cond = 'ANX')
;
Quit;
Data never sleeps
Ksharp
Super User

Proc sql;
Create table want as
Select *
From have
group by id_number
having sum(psych_cond = 'ANX') > 0
;
Quit;
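
For anyone wondering why the HAVING clause works: in SAS a comparison such as psych_cond = 'ANX' evaluates to 1 when true and 0 when false, so the SUM counts the ANX rows in each ID_NUMBER group. Because the query selects all columns while grouping, PROC SQL remerges the group result back onto the detail rows. A small check query, assuming the HAVE data set from this thread; ANX_ROWS is a hypothetical column name used only here:

```sas
proc sql;
    /* Count the ANX rows per person; ANX_ROWS is a hypothetical alias. */
    select id_number,
           sum(psych_cond = 'ANX') as anx_rows
    from have
    group by id_number;
quit;
```

With the sample data, IDs 1, 2, and 4 each have one ANX row and ID 3 has none, so only ID 3 fails the > 0 test.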

lxn1021
Obsidian | Level 7

data psych;
     set psych;
     where psych_cond="ANX";
run;

 
