BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
pink_poodle
Barite | Level 11

I have a table with three variables: patient, time, drug. Let us say there are two patients. First patient received drug A at time 1 followed by drug B at time 2. Second patient received drug B at time 1 followed by drug A at time 2. I want to accumulate a substring for the group variable, such that patient 1 belongs to group "AB" and patient 2  belong to group "BA". Any suggestions are welcome!

data want;
	input patient time drug $ group $;
	datalines;
		1 1 A AB
		1 2 B AB
		2 1 B BA
		2 2 A BA
	;
run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
/*Assuming there are no duplicated obs in your data*/
data have;
	input patient time drug $;
	datalines;
		1 1 A 
		1 2 B 
		2 1 B 
		2 2 A 
	;
run;

data want;
do until(last.patient);
 set have;
 by patient;
 length group $ 10;
 group=cats(group,drug);
end;
do until(last.patient);
 set have;
 by patient;
 output;
end;
run;

View solution in original post

9 REPLIES 9
rvikram
Fluorite | Level 6
What is your question?
pink_poodle
Barite | Level 11
My question is how to accumulate a substring for the group variable, such that patient 1 belongs to group "AB" and patient 2 belong to group "BA" ?
rvikram
Fluorite | Level 6
If I were you, I’d check the syntax for data lines again.

You need infile statement, and preferably a DLM keyword. Please check again.
pink_poodle
Barite | Level 11
@rvikram, it should work, please give it a try :).
rvikram
Fluorite | Level 6
If you are looking to group certain variables, you can write a proc sql step with group by.

If you don’t want duplicate values, you can write a proc sort step with nodup, or select distinct within proc sql.
Ksharp
Super User
/*Assuming there are no duplicated obs in your data*/
data have;
	input patient time drug $;
	datalines;
		1 1 A 
		1 2 B 
		2 1 B 
		2 2 A 
	;
run;

data want;
do until(last.patient);
 set have;
 by patient;
 length group $ 10;
 group=cats(group,drug);
end;
do until(last.patient);
 set have;
 by patient;
 output;
end;
run;
pink_poodle
Barite | Level 11

@Ksharp , thank you, this is very helpful! I would like to have "ABB" group for duplicates, could you please advice?:

data want;
	input patient time drug $ group $;
	datalines;
		1 1 A ABB
		1 2 B ABB
		1 3 B ABB
		2 1 B BA
		2 2 A BA
	;
run;
Ksharp
Super User
Just try my original code. No need any change.
mkeintz
PROC Star

Pass through each ID twice, once to build GROUP, once to output it:

 

data have;
  input patient time drug $;
  datalines;
1 1 A 
1 2 B 
2 1 B 
2 2 A 
run;

data want;
  set have (in=firstpass)
      have (in=secondpass);
  by id;
  retain group '          ' ;  /*As many blanks as needed for group */
  if first.id then group=drug;
  else if firstpass then group=cats(group,drug);
  if secondpass;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1590 views
  • 5 likes
  • 4 in conversation