BookmarkSubscribeRSS Feed
savanahb
Obsidian | Level 7

I have a dataset with 12,000 some variables and I need to collapse them so that there are no repeats with ID number and the symptoms are described on one line. The data I am working with looks like this: Screen Shot 2017-11-25 at 1.18.55 PM.png

And I am trying to get it to look like this:

Screen Shot 2017-11-25 at 1.21.29 PM.png

 The data is from excel so I imported it and have come up with the following code:

 

proc sort data= project3;
	by id_no;
run;

*need array to identify symptoms
*symptom 1=heartburn, symptom 2=sickness, symptom 3=spasm, symptom 4=temperature, symptom 5=tiredness
*if/then statement
*dont want to keep symptom_no and symptom, instead make new variable;
data want; array symptoms[5] symptom_no1-symptom_no5; *I named the arrays symptom instead of sympt to create the new variables; retain symptom_no1-symptom_no5; set project3; by id_no; if first.id_no then do i=1 to 5; *this allowed me to not have any duplicates for the ID_no; symptoms[i]=.; end; if last.id_no then output; keep id_no symptom_no1-symptom_no5; run; proc print data=want; var id_no symptom_no1-symptom_no5; run;

Unfortunately, when I run this, nothing is populated for symptoms and I end up getting this:

Screen Shot 2017-11-25 at 1.29.16 PM.png

 

I understand that I have to define the symptoms such that when the system reads it knows symptom 1 is heartburns, 2 is sickness, 3=spasm, 4=temperature, 5=tiredness. I'm guessing this should go prior to the symptom array I've already written, but am having a hard time. Could I get some direction/advice please?

Thank you!

 

 
 
4 REPLIES 4
Reeza
Super User

Your class mate has already asked this question. Very similar subject line so your search should be easy. It has a fully worked solution. 

Reeza
Super User

She asked three questions related to it I believe. 

You can find her posts by clicking on her name.

Tom
Super User Tom
Super User

What are the other 11,997 variables?

 

Your example input only shows 3 variables. And two of those are showing the same information in different ways.  When SYMPTOM_NO=1 then SYMPTOM is always "Heartburns".

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 968 views
  • 0 likes
  • 3 in conversation