BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
cheroij
Fluorite | Level 6

Hello,

 

I am trying to parse out the drug name, strength, and unit variable for specific medications only (Amantadine, Carbidopa, Entacapone, Apomorphine, Trihexyphenidyl, Amantadine, Trihexyphenidyl, and levodopa). The final aim is to calculate the daily dose of these medications. I have tried using the scan, compress followed by the  coalescec option to no avail.

1) Things to note a) Use of different delimeters "-" and "/" for the strength variable, (b) different position of drug of interest : position1/position2/position3.

 

I have included my data and table desired.

 

Thank you for your assistance.

 

 

 


data WORK.HAVE;
infile datalines dsd truncover;
input Patient_ID:32. Drug_Name:$37. Strength:$12.;
label Patient_ID="Patient ID" Drug_Name="Drug_Name" Strength="Strength";
datalines;
1 Carbidopa 25 MG
1 Carbidopa/Levodopa 25-100 MG
2 Levodopa/Entacapone 100-200 MG
2 Apomorphine 2 MG/ML
2 Ropinirole/Entacapone 2-200 MG
3 Levodopa/Entacapone/Ropinirole 100-200-2 MG
3 Pramipexole/Entacapone/Amantadine 1-200-100 MG
3 Pramipexole/Trihexyphenidyl 1-2 MG
4 Amantadine/Benztropine 100-1 MG
5 Trihexyphenidyl/Rasagiline/Entacapone 2-0.5-200 MG
;;;;

 

 

Want

Patient IDDrug_of_InterestStrength1Unit1Drug of interest 2Strength2Unit2
1Amantadine100MG ..
1Carbidopa25MGLevodopa100MG
2Entacapone200MG ..
2Apomorphine2MG/ML ..
2Entacapone200MG ..
3Entacapone200MG   
3Entacapone200MGAmantadine100MG
3Trihexyphenidyl2MG...
4Amantadine100MG...
5Trihexyphenidyl2MGEntacapone200MG
1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data WORK.HAVE;
infile datalines  truncover;
input Patient_ID:32. Drug_Name:$37. Strength $12.;
label Patient_ID="Patient ID" Drug_Name="Drug_Name" Strength="Strength";
datalines;
1 Carbidopa 25 MG
1 Carbidopa/Levodopa 25-100 MG
2 Levodopa/Entacapone 100-200 MG
2 Apomorphine 2 MG/ML
2 Ropinirole/Entacapone 2-200 MG
3 Levodopa/Entacapone/Ropinirole 100-200-2 MG
3 Pramipexole/Entacapone/Amantadine 1-200-100 MG
3 Pramipexole/Trihexyphenidyl 1-2 MG
4 Amantadine/Benztropine 100-1 MG
5 Trihexyphenidyl/Rasagiline/Entacapone 2-0.5-200 MG
;
data temp;
 set have;
 n+1;
 do i=1 to countw(Drug_Name,'/');
  Drug_of_Interest_=scan(Drug_Name,i,'/');
  Strength_=scan(Strength,i,'-','s');
  Unit_=scan(Strength,-1,,'s');
  if prxmatch('/Amantadine|Carbidopa|Entacapone|Apomorphine|Trihexyphenidyl|levodopa/i',Drug_of_Interest_) then output;
 end;
 keep n  Patient_ID Drug_of_Interest_ Strength_  Unit_;
 run;
 data temp2;
  set temp;
  by n;
  if first.n then id=0;
  id+1;
run;
 proc transpose data=temp2 out=temp3;
 by n id Patient_ID;
 var Drug_of_Interest_ Strength_  Unit_;
 run;
 proc transpose data=temp3 out=want(drop=_name_);
 by n Patient_ID;
 var col1;
 id _name_ id;
 run;

Ksharp_0-1735524836178.png

 

View solution in original post

10 REPLIES 10
PaigeMiller
Diamond | Level 26

This parses the drug names (assuming a max length of 24 characters and that the separator is always a slash).

 

data want;
	set have;
	array drug $ 24 drug1-drug3;
	do i=1 to countw(drug_name,'/');
		drug(i)=scan(drug_name,i,'/');
	end;
	drop i drug_name;
run;

 

I leave it up to you to follow the above code and modify it to work on dosages.

--
Paige Miller
PaigeMiller
Diamond | Level 26

Correction to the above code

 

data want;
	set have;
	array drug $ 24 drug1-drug3;
	count=0;
	do i=1 to countw(drug_name,'/');
		this_drug=scan(drug_name,i,'/');
		if upcase(this_drug) in 
			('AMANTADINE', 'CARBIDOPA', 'ENTACAPONE', 'APOMORPHINE', 'TRIHEXYPHENIDYL', 'AMANTADINE', 'TRIHEXYPHENIDYL', 'LEVODOPA') then do;
		    count=count+1;
		    drug(count)=this_drug;
	    end;
	end;
	drop i drug_name count this_drug;
run;
--
Paige Miller
cheroij
Fluorite | Level 6

Thank you for your response and your time. This code did not parse out the strength variaable.

Tom
Super User Tom
Super User

I would start by converting the dataset into one observation per drug.

 

First lets fix your example data step so it can run.  I will rename the variables to indicate the are LISTS and not normal values.  It is not clear to me what you are doing with the UNITS, so since it looks like a separate variable in your data lines I will make it a separate variable (even though it does not seem to have enough values).

data HAVE;
  input Patient_ID drugs :$60. strengths :$20. Units :$20. ;
datalines;
1 Carbidopa 25 MG
1 Carbidopa/Levodopa 25-100 MG
2 Levodopa/Entacapone 100-200 MG
2 Apomorphine 2 MG/ML
2 Ropinirole/Entacapone 2-200 MG
3 Levodopa/Entacapone/Ropinirole 100-200-2 MG
3 Pramipexole/Entacapone/Amantadine 1-200-100 MG
3 Pramipexole/Trihexyphenidyl 1-2 MG
4 Amantadine/Benztropine 100-1 MG
5 Trihexyphenidyl/Rasagiline/Entacapone 2-0.5-200 MG
;;;;

Now just loop over the paired lists and parse out the DRUG_NAME and STRENGTH from the list.

data want;
  set have;
  do index=1 to countw(drugs,'/');
    length drug_name $30 strength $10 unit $10 ;
    drug_name=scan(drugs,index,'/');
    strength=scan(strengths,index,'-');
    unit=scan(units,min(index,countw(units,'-')),'-');
    output;
  end;
run;

Now your test for whether the drug is a drug of interest is simpler since DRUG_NAME now only has one value per observation.

Tom_0-1735491649215.png

 

cheroij
Fluorite | Level 6

Thank you for your response. I had to make some teaks to the code. I apologize for my sample data not being clear.  The strength variable included the unit as well (ie unit is not a separate column in my data)

Tom
Super User Tom
Super User

@cheroij wrote:

... The strength variable included the unit as well (ie unit is not a separate column in my data)


You might need to revisit that decision.  The units for all drugs taken by a patient are not always going to be the same.  And even if your drug lists just represent single pill combination drugs then not all of those will have the strengths listed using the same units.

Kurt_Bremser
Super User

Don't saddle yourself with the burden of a wide dataset (Maxim 19).

data have;
infile datalines dsd dlm=" " truncover;
length patient_id 8 drugs $40 strengths $12 unit $10;
input patient_id drugs strengths unit;
datalines;
1 Carbidopa 25 MG
1 Carbidopa/Levodopa 25-100 MG
2 Levodopa/Entacapone 100-200 MG
2 Apomorphine 2 MG/ML
2 Ropinirole/Entacapone 2-200 MG
3 Levodopa/Entacapone/Ropinirole 100-200-2 MG
3 Pramipexole/Entacapone/Amantadine 1-200-100 MG
3 Pramipexole/Trihexyphenidyl 1-2 MG
4 Amantadine/Benztropine 100-1 MG
5 Trihexyphenidyl/Rasagiline/Entacapone 2-0.5-200 MG
;

data interest;
length drug $20;
input drug;
drug = upcase(drug);
datalines;
Amantadine
Carbidopa
Entacapone
Apomorphine
Trihexyphenidyl
Amantadine
Trihexyphenidyl
levodopa
;

data want;
retain patient_id n n1 drug strength unit; * just for variable order;
set have;
length drug $20 strength 8 n 8 n1 8;
by patient_id;
if _n_ = 1
then do;
  declare hash int (dataset:"interest");
  int.definekey("drug");
  int.definedone();
end;
if first.patient_id
then n = 1;
else n + 1;
n1 = 0;
do i = 1 to countw(drugs,"/");
  drug = scan(drugs,i,"/");
  strength = input(scan(strengths,i,"-"),32.);
  if int.check(key:upcase(drug)) = 0
  then do;
    n1 + 1;
    output;
  end;
end;
keep patient_id drug strength unit n n1;
run;

proc report data=want;
column patient_id n n1,(drug unit strength);
define patient_id / group;
define n / group noprint;
define n1 / "" across;
define drug / display;
define unit / display;
define strength / analysis;
run;

I created additional variables to get a report like you showed.

 

cheroij
Fluorite | Level 6

Thank you for your response. Your code worked with minor tweaks. Sorry about the confusion in data sample provided.The nint column is not a separate column in my dataset.

Ksharp
Super User
data WORK.HAVE;
infile datalines  truncover;
input Patient_ID:32. Drug_Name:$37. Strength $12.;
label Patient_ID="Patient ID" Drug_Name="Drug_Name" Strength="Strength";
datalines;
1 Carbidopa 25 MG
1 Carbidopa/Levodopa 25-100 MG
2 Levodopa/Entacapone 100-200 MG
2 Apomorphine 2 MG/ML
2 Ropinirole/Entacapone 2-200 MG
3 Levodopa/Entacapone/Ropinirole 100-200-2 MG
3 Pramipexole/Entacapone/Amantadine 1-200-100 MG
3 Pramipexole/Trihexyphenidyl 1-2 MG
4 Amantadine/Benztropine 100-1 MG
5 Trihexyphenidyl/Rasagiline/Entacapone 2-0.5-200 MG
;
data temp;
 set have;
 n+1;
 do i=1 to countw(Drug_Name,'/');
  Drug_of_Interest_=scan(Drug_Name,i,'/');
  Strength_=scan(Strength,i,'-','s');
  Unit_=scan(Strength,-1,,'s');
  if prxmatch('/Amantadine|Carbidopa|Entacapone|Apomorphine|Trihexyphenidyl|levodopa/i',Drug_of_Interest_) then output;
 end;
 keep n  Patient_ID Drug_of_Interest_ Strength_  Unit_;
 run;
 data temp2;
  set temp;
  by n;
  if first.n then id=0;
  id+1;
run;
 proc transpose data=temp2 out=temp3;
 by n id Patient_ID;
 var Drug_of_Interest_ Strength_  Unit_;
 run;
 proc transpose data=temp3 out=want(drop=_name_);
 by n Patient_ID;
 var col1;
 id _name_ id;
 run;

Ksharp_0-1735524836178.png

 

cheroij
Fluorite | Level 6

Thank you😊

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1697 views
  • 5 likes
  • 5 in conversation