Hello
I currently have a very large data set, which contains multiple records per person. I need to keep these records separate, as each row contains different information. Each person has an ID number which is duplicated for each of their records.
One variable within this data set contains a number of psychological conditions that are coded as three letters. I want to delete all people who don't have a particular psychological condition in any of their records. Here's an example of my data:
ID_NUMBER PSYCH_COND
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
Say I wanted to keep only the people (ID numbers) who have ANX, and delete everyone who didn't record ANX in any of their PSYCH_COND rows.
Does anybody have any ideas as to how I would go about this?
Thanks in advance!
If you want to keep all the records for anyone who has ANX, this will work:
data have;
input ID_NUMBER PSYCH_COND$;
cards;
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
;run;
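data want;
  /* HAVE must be sorted by id_number for the BY statements to work */
  /* First pass: read all records for one ID and flag the ID if any record is ANX */
  do until(last.id_number);
    set have;
    by id_number;
    if psych_cond = 'ANX' then keep = 1;
  end;
  /* Second pass: re-read the same ID's records and output them if the ID was flagged */
  do until(last.id_number);
    set have;
    by id_number;
    if keep = 1 then output;
  end;
  /* keep is not retained, so it resets to missing before the next ID is processed */
  drop keep;
run;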
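You can use an IF statement to delete the rows where PSYCH_COND ~= "ANX" (note that this keeps only the ANX rows themselves, not the other records for those IDs):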
data want;
set have;
if PSYCH_COND ~= "ANX" then delete;
run;
Or use a WHERE statement or clause to subset the data set (in a data step or in PROC SQL):
data want;
set have;
where PSYCH_COND = "ANX";
run;
or
proc sql;
create table want as
select *
from have
where PSYCH_COND = "ANX";
quit;
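The row-level filters above return only the ANX rows. If the goal is instead to keep every record for any ID that has at least one ANX row, one possible sketch is a subquery in PROC SQL. It assumes the same HAVE data set and variable names as in the question; the ORDER BY is optional and added here only for readability.

```sas
proc sql;
  create table want as
  select *
  from have
  /* keep every row whose ID appears among the IDs that have an ANX record */
  where id_number in
    (select id_number
     from have
     where psych_cond = 'ANX')
  order by id_number;
quit;
```

With the sample data from the question, this should keep all records for IDs 1, 2 and 4 and drop ID 3.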
Proc sql;
Create table want as
Select *
From have
group by id_number
having sum(psych_cond = 'ANX') > 0
;
Quit;
data psych;
set psych;
where psych_cond="ANX";
run;