BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
pink_poodle
Barite | Level 11

I have a table with three variables: patient, time, drug. Let us say there are two patients. First patient received drug A at time 1 followed by drug B at time 2. Second patient received drug B at time 1 followed by drug A at time 2. I want to accumulate a substring for the group variable, such that patient 1 belongs to group "AB" and patient 2  belong to group "BA". Any suggestions are welcome!

data want;
	input patient time drug $ group $;
	datalines;
		1 1 A AB
		1 2 B AB
		2 1 B BA
		2 2 A BA
	;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
/*Assuming there are no duplicated obs in your data*/
data have;
	input patient time drug $;
	datalines;
		1 1 A 
		1 2 B 
		2 1 B 
		2 2 A 
	;
run;

data want;
do until(last.patient);
 set have;
 by patient;
 length group $ 10;
 group=cats(group,drug);
end;
do until(last.patient);
 set have;
 by patient;
 output;
end;
run;

View solution in original post

9 REPLIES 9
rvikram
Fluorite | Level 6
What is your question?
pink_poodle
Barite | Level 11
My question is how to accumulate a substring for the group variable, such that patient 1 belongs to group "AB" and patient 2 belong to group "BA" ?
rvikram
Fluorite | Level 6
If I were you, I’d check the syntax for data lines again.

You need infile statement, and preferably a DLM keyword. Please check again.
pink_poodle
Barite | Level 11
@rvikram, it should work, please give it a try :).
rvikram
Fluorite | Level 6
If you are looking to group certain variables, you can write a proc sql step with group by.

If you don’t want duplicate values, you can write a proc sort step with nodup, or select distinct within proc sql.
Ksharp
Super User
/*Assuming there are no duplicated obs in your data*/
data have;
	input patient time drug $;
	datalines;
		1 1 A 
		1 2 B 
		2 1 B 
		2 2 A 
	;
run;

data want;
do until(last.patient);
 set have;
 by patient;
 length group $ 10;
 group=cats(group,drug);
end;
do until(last.patient);
 set have;
 by patient;
 output;
end;
run;
pink_poodle
Barite | Level 11

@Ksharp , thank you, this is very helpful! I would like to have "ABB" group for duplicates, could you please advice?:

data want;
	input patient time drug $ group $;
	datalines;
		1 1 A ABB
		1 2 B ABB
		1 3 B ABB
		2 1 B BA
		2 2 A BA
	;
run;
Ksharp
Super User
Just try my original code. No need any change.
mkeintz
PROC Star

Pass through each ID twice, once to build GROUP, once to output it:

 

data have;
  input patient time drug $;
  datalines;
1 1 A 
1 2 B 
2 1 B 
2 2 A 
run;

data want;
  set have (in=firstpass)
      have (in=secondpass);
  by id;
  retain group '          ' ;  /*As many blanks as needed for group */
  if first.id then group=drug;
  else if firstpass then group=cats(group,drug);
  if secondpass;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1545 views
  • 5 likes
  • 4 in conversation