Hello
I currently have a very large data set, which contains multiple records per person. I need to keep these records separate, as each row contains different information. Each person has an ID number which is duplicated for each of their records.
One variable within this data set contains a number of psychological conditions that are coded as three letters. I want to delete all people who don't have a particular psychological condition in one of their ten records. Here's an example of my data:
ID_NUMBER PSYCH_COND
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
So say I wanted to keep just the people (ID numbers) who have ANX, and delete everyone who didn't record ANX in any of their PSYCH_COND rows.
Does anybody have any ideas as to how I would go about this?
Thanks in advance!
If you want to keep all the records for anyone that has an ANX this will work:
data have;
input ID_NUMBER PSYCH_COND$;
cards;
1 ANX
1 MDD
1 SKA
2 ANX
2 SKA
2 GAD
3 MDD
3 BPD
3 DDP
4 ANX
4 SKA
4 BPD
;run;
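data want;
  /* First pass over the BY group: flag the person if any record is ANX */
  do until(last.id_number);
    set have;
    by id_number;
    if psych_cond = 'ANX' then keep = 1;
  end;
  /* Second pass over the same BY group: output every record of a flagged person */
  do until(last.id_number);
    set have;
    by id_number;
    if keep = 1 then output;
  end;
  drop keep;
run;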
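The BY statement in the step above relies on HAVE already being in ID_NUMBER order, as the sample data is. If the real data set is not, a sort along these lines would be needed first. This is only a sketch: the data set name HAVE comes from the example and would need to be replaced with the real one.

```sas
/* Sort by ID so that BY-group processing (last.id_number) works */
proc sort data=have;
  by id_number;
run;
```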
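You can use an IF statement to delete the rows where PSYCH_COND is not "ANX". Note that this keeps only the ANX rows themselves, not the other records for those people: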
data want;
set have;
if PSYCH_COND ~= "ANX" then delete;
run;
Or use a WHERE statement to subset the data set, either in a DATA step:
data want;
set have;
where PSYCH_COND = "ANX";
run;
or in PROC SQL:
proc sql;
create table want as
select *
from have
where PSYCH_COND = "ANX";
quit;
Proc sql;
Create table want as
Select *
From have
group by id_number
having sum(psych_cond = 'ANX') > 0
;
Quit;
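The same idea, keeping every record for any ID that has at least one ANX row, can also be written with a subquery instead of GROUP BY / HAVING. A minimal sketch, assuming the same HAVE data set and variable names as in the example above:

```sas
proc sql;
  create table want as
  select *
  from have
  /* keep all rows whose ID appears among the IDs with an ANX record */
  where id_number in
    (select id_number
     from have
     where psych_cond = 'ANX');
quit;
```

With the sample data this should keep all records for IDs 1, 2 and 4 and drop ID 3.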
data psych;
set psych;
where psych_cond="ANX";
run;