I have a dataset with some 12,000 variables, and I need to collapse it so that no ID number is repeated and each ID's symptoms are described on one line. The data I am working with looks like this:
And I am trying to get it to look like this:
The data is from Excel, so I imported it and came up with the following code:
proc sort data=project3;
    by id_no;
run;

* need array to identify symptoms;
* symptom 1=heartburn, symptom 2=sickness, symptom 3=spasm, symptom 4=temperature, symptom 5=tiredness;
* if/then statement;
* do not want to keep symptom_no and symptom, instead make new variable;
data want;
    array symptoms[5] symptom_no1-symptom_no5; *I named the arrays symptom instead of sympt to create the new variables;
    retain symptom_no1-symptom_no5;
    set project3;
    by id_no;
    if first.id_no then do i=1 to 5; *this allowed me to not have any duplicates for the ID_no;
        symptoms[i]=.;
    end;
    if last.id_no then output;
    keep id_no symptom_no1-symptom_no5;
run;

proc print data=want;
    var id_no symptom_no1-symptom_no5;
run;
Unfortunately, when I run this, nothing is populated for symptoms and I end up getting this:
I understand that I have to define the symptoms so that when SAS reads the data it knows that symptom 1 is heartburns, 2 is sickness, 3 is spasm, 4 is temperature, and 5 is tiredness. I'm guessing this should go before the symptom array I've already written, but I am having a hard time with it. Could I get some direction or advice, please?
Thank you!
Your classmate has already asked this question. The subject line is very similar, so your search should be easy. It has a fully worked solution.
Thank you. I was able to find it, I think.
She asked three questions related to it I believe.
You can find her posts by clicking on her name.
What are the other 11,997 variables?
Your example input only shows 3 variables. And two of those are showing the same information in different ways. When SYMPTOM_NO=1 then SYMPTOM is always "Heartburns".
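Because SYMPTOM_NO already identifies the symptom, one possible approach is to use it directly as the array index, so no separate lookup of the symptom text is needed. The step below is only a sketch under assumptions not confirmed in this thread: PROJECT3 has one row per ID_NO and symptom, SYMPTOM_NO is an integer from 1 to 5, and the wanted layout is one 0/1 flag per symptom. The labels are taken from the comments in the posted code and are optional.

```sas
/* Sketch only. Assumes PROJECT3 has one row per ID_NO/symptom   */
/* and that SYMPTOM_NO is an integer from 1 to 5.                */
proc sort data=project3;
    by id_no;
run;

data want;
    set project3;
    by id_no;

    array symptoms[5] symptom_no1-symptom_no5;
    retain symptom_no1-symptom_no5;

    /* Optional labels, taken from the comments in the question. */
    label symptom_no1 = 'Heartburn'
          symptom_no2 = 'Sickness'
          symptom_no3 = 'Spasm'
          symptom_no4 = 'Temperature'
          symptom_no5 = 'Tiredness';

    /* Reset the flags at the start of each ID (0 = not reported). */
    if first.id_no then do i = 1 to 5;
        symptoms[i] = 0;
    end;

    /* The step the posted code is missing: record the symptom   */
    /* on the current row before moving on to the next one.      */
    if 1 <= symptom_no <= 5 then symptoms[symptom_no] = 1;

    /* Write one row per ID once all of its rows have been read. */
    if last.id_no then output;

    keep id_no symptom_no1-symptom_no5;
run;

proc print data=want label;
    var id_no symptom_no1-symptom_no5;
run;
```

The posted DATA step resets the array and outputs the last row for each ID, but it never assigns anything to the array in between, which would explain why every symptom column comes out missing. If you would rather have missing values than zeros for unreported symptoms, reset to `.` as in the original code.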