BookmarkSubscribeRSS Feed
savanahb
Obsidian | Level 7

I have a dataset with 12,000 some variables and I need to collapse them so that there are no repeats with ID number and the symptoms are described on one line. The data I am working with looks like this: Screen Shot 2017-11-25 at 1.18.55 PM.png

And I am trying to get it to look like this:

Screen Shot 2017-11-25 at 1.21.29 PM.png

 The data is from excel so I imported it and have come up with the following code:

 

proc sort data= project3;
	by id_no;
run;

*need array to identify symptoms
*symptom 1=heartburn, symptom 2=sickness, symptom 3=spasm, symptom 4=temperature, symptom 5=tiredness
*if/then statement
*dont want to keep symptom_no and symptom, instead make new variable;
data want; array symptoms[5] symptom_no1-symptom_no5; *I named the arrays symptom instead of sympt to create the new variables; retain symptom_no1-symptom_no5; set project3; by id_no; if first.id_no then do i=1 to 5; *this allowed me to not have any duplicates for the ID_no; symptoms[i]=.; end; if last.id_no then output; keep id_no symptom_no1-symptom_no5; run; proc print data=want; var id_no symptom_no1-symptom_no5; run;

Unfortunately, when I run this, nothing is populated for symptoms and I end up getting this:

Screen Shot 2017-11-25 at 1.29.16 PM.png

 

I understand that I have to define the symptoms such that when the system reads it knows symptom 1 is heartburns, 2 is sickness, 3=spasm, 4=temperature, 5=tiredness. I'm guessing this should go prior to the symptom array I've already written, but am having a hard time. Could I get some direction/advice please?

Thank you!

 

 
 
4 REPLIES 4
Reeza
Super User

Your class mate has already asked this question. Very similar subject line so your search should be easy. It has a fully worked solution. 

Reeza
Super User

She asked three questions related to it I believe. 

You can find her posts by clicking on her name.

Tom
Super User Tom
Super User

What are the other 11,997 variables?

 

Your example input only shows 3 variables. And two of those are showing the same information in different ways.  When SYMPTOM_NO=1 then SYMPTOM is always "Heartburns".

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 905 views
  • 0 likes
  • 3 in conversation