Hello
I currently have a very large data set, which contains multiple records per person. I need to keep these records separate, as each row contains different information. Each person has an ID number which is duplicated for each of their records.
One variable within this data set contains a number of psychological conditions that are coded as three letters. I want to delete all people who don't have a particular psychological condition in one of their ten records. Here's an example of my data:
ID_NUMBER PSYCH_COND
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
So say I wanted to keep only the people (ID numbers) who have ANX, and delete everyone who didn't record ANX in any of their PSYCH_COND values.
Does anybody have any ideas as to how I would go about this?
Thanks in advance!
If you want to keep all the records for anyone who has an ANX record, this will work:
data have;
input ID_NUMBER PSYCH_COND$;
cards;
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
;run;
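data want;
  /* First pass over the BY group: flag the person if any record is ANX */
  do until(last.id_number);
    set have;
    by id_number;
    if psych_cond = 'ANX' then keep = 1;
  end;
  /* Second pass over the same BY group: output every record if flagged */
  do until(last.id_number);
    set have;
    by id_number;
    if keep = 1 then output;
  end;
  drop keep;
run;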
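One thing to watch with the DATA step above: it uses BY-group processing, so the input has to be in ID_NUMBER order. The sample data already is, but a large real data set may not be. A minimal sketch of sorting first, using a small made-up unsorted input purely for illustration:

```sas
/* Hypothetical unsorted input, for illustration only */
data have;
  input ID_NUMBER PSYCH_COND $;
  cards;
2 SKA
1 MDD
2 ANX
1 SKA
3 BPD
;
run;

/* BY-group processing needs the rows ordered by ID_NUMBER */
proc sort data=have;
  by id_number;
run;
```

After the sort, the two DO UNTIL loops can read each person's records as one BY group.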
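You can use an IF statement to delete the rows where PSYCH_COND ~= "ANX" (note that this keeps only the ANX rows themselves, not the other records for those people):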
data want;
set have;
if PSYCH_COND ~= "ANX" then delete;
run;
Or use a WHERE statement to subset the data set (in a data step or in PROC SQL):
data want;
set have;
where PSYCH_COND = "ANX";
run;
or
proc sql;
create table want as
select *
from have
where PSYCH_COND = "ANX";
quit;
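These filters keep only the rows where PSYCH_COND is ANX. If the goal is to keep every record for people who have at least one ANX row, as the question describes, one option is a subquery on ID_NUMBER. A minimal sketch, assuming the have data set from the example earlier in the thread (the output name want_all is made up here):

```sas
proc sql;
  create table want_all as
  select *
  from have
  /* keep a row if its ID appears anywhere with ANX */
  where id_number in
    (select id_number from have where psych_cond = 'ANX');
quit;
```

With the sample data this keeps all records for IDs 1, 2 and 4 and drops ID 3.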
proc sql;
create table want as
select *
from have
group by id_number
having sum(psych_cond = 'ANX') > 0
;
quit;
data psych;
set psych;
where psych_cond="ANX";
run;