BookmarkSubscribeRSS Feed
lexsc_sas
Calcite | Level 5

I am trying to separate out a column as it currently has multiple effects listed in the individual column like so:

lexsc_sas_0-1696967754696.png

I use the scan function and create different column names for the various symptoms.

lexsc_sas_1-1696967977757.png

 

From there, I would like to arrange them alphabetically but have the missing values be last (i.e. have the Symptom1 column be acidosis for all those who have that symptom) so that I can use the proc freq function to determine how many people total had each individual symptom. 

I tried using call sortc but this had all the missing values first (1 of the clinical effects had 23 different symptoms, but many only had 1 or 2 listed) and then would start the symptoms in alphabetical order so that the "acidosis" was in different numbers instead of always being symptom1 like I would like it to be. 

lexsc_sas_2-1696968057357.png

 

Any suggestions? 

3 REPLIES 3
Reeza
Super User
Revisit your data structure.
Instead, put each symptom on a line by itself so you can summarize it. If that's not possible, I'd still do that, create a 0/1 variable and then transpose it so that each column is an indicator variable for the presence of a specific symptom.
A_Kh
Lapis Lazuli | Level 10

Try to first separate distinct values into a single column, then sort it and transpose. 

eg;

data have;
infile cards truncover;
input @1 id  @3 symptoms $ 3-100;
cards;
1 Dizziness/vertigo (NR),Vomitting (NR),Confusion (R)
2 Headache (R),Muscle weakness (R),Acidosis
3 Confusion (R)
4 Vomitting (NR),Headache (R)
; 
proc print;run; 

data have1;
	set have;
	num= countc(symptoms, ',');
	do i= 1 to num+1;
		Symptom= strip(scan(symptoms, i, ','));
		output;
	end;
	drop num i;
run; 
proc print; run; 

proc sort data= have1; 
	by id symptom; 
proc print; run; 

proc transpose data=have1 out=want (drop=_name_) prefix=Symptom;
	by id;
	var symptom;
proc print;run; 
mkeintz
PROC Star

You have a character variable containing a comma-separated list.  You want to extract and alphabetize the list elements.  You can do this in a single data step. 

 

Say no list is expected to have more than 25 members, then you can manage with a 25-element arrays of character variables:

 

data have;
  infile cards truncover;
  input @1 id  @3 symptoms $ 3-100;
cards;
1 Dizziness/vertigo (NR),Vomitting (NR),Confusion (R)
2 Headache (R),Muscle weakness (R),Acidosis
3 Confusion (R)
4 Vomitting (NR),Headache (R)
run; 

data want (drop=_s _t);
  set have;
  array symptom {25} $30 ;

  do _s=1 to dim(symptom) until (_s=countw(symptoms,','));
    symptom{_s}=strip(scan(symptoms,_s, ','));
  end;
  call sortc(of symptom{*});

  length _alphabetized_symptoms $800;   *For temporary sorted CSV list of symptoms *;
  _alphabetized_symptoms=catx(',',of symptom{*});

  call missing(of symptom{*});
  do _t=1 to _s;
    symptom{_t}=scan(_alphabetized_symptoms,_t,',');
  end;
run;

 

A possible alternative to above, which shifts the sorted values to the left, is to add elements that will always sort higher than actual data, namely assign unneeded elements with hexadecimal 'FF' values.  After sorting reset them to blank.

 

data want (drop=_s _t);
  set have;
  array symptom {25} $30 ;

  do _s=1 to dim(symptom) until (_s=countw(symptoms,','));
    symptom{_s}=strip(scan(symptoms,_s, ','));
  end;
  if _s<dim(symptom) then do _t=_s+1 to dim(symptom);
    symptom{_t}='FF'x;
  end;

  call sortc(of symptom{*});

  if _s<dim(symptom) then do _t=_s+1 to dim(symptom);
    symptom{_t}=' ';
  end;
run;

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 419 views
  • 4 likes
  • 4 in conversation