BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
marysmith
Calcite | Level 5
data arzneimittelgruppen; 
set datensatz_1416_Erstanzeige;
if NIS_Nummer in (6513,6699) then allergy=1; /*N=2*/
if NIS_Nummer in (2047,6613,6623,6725,7030) then antibiotics=1; /*N=5*/
if NIS_Nummer in (3203,6622,6629,6649,6818,6819) then antidiabetic=1; /*N=6*/
if NIS_Nummer in (6653,6736,6771,6775,6837,6847) then antihypertensive=1;  /*N=5*/
if NIS_Nummer in (2407,2507,2532,3396,6635) then antipsychotic=1;  /*N=5*/
if NIS_Nummer in (2069,5266,6131,6703,6719) then antivirals=1; /*?*/ /*N=5*/
if NIS_Nummer in (2046,2289,6468,6626,6711) then biologicals=1; /*N=5*/
if NIS_Nummer in (2230,6709) then contrast=1; /*N=2*/
if NIS_Nummer in (2268,2552,2962,6657,6713,6799) then copd=1; /*N=6*/
if NIS_Nummer in (1851,1856,1969,1990,6194,6521,6784,6849,6880) then dermatology=1; /*N=9*/
if NIS_Nummer in (2087,2311,2590,3431,5080,6417,6419,6611,6618,6625,6652,6666,6673,6677,6683,6696,6720,6721,6815,6857,6858,6870,6876,6877,6878,6884) then registry=1;/*N=26*/
if NIS_Nummer in (2187,6170,6172,6752,6883) then eye=1; /*N=5*/
if NIS_Nummer in (2367,2408,2548,2568,3377,3387,6431,6474,6530,6534,6535,6536,6627,6642,6660,6661,6665,6668,6698,6759,6847,6869,6874,7038) then haematology=1; /*N=24*/
if NIS_Nummer in (1867,3432,4147,6663,6717,6755,6758,6766,6835) then hormone=1;    /*N=9*/
if NIS_Nummer in (1928,1949,2067,2075,2147,2211,2631,2632,5383,6449,6494,6532,6636,6640,6714,6772) then neurological=1; /*N=16*/
if NIS_Nummer in (2188) then nondrug=1; /*N=1*/
if NIS_Nummer in (1888,2027,2309,2314,2467,2527,2837,2980,3186,3612,4685,5451,5584,6471,6507,6637,6659,6669,6682,6694,6710,6735,6741,6773,6862,6889)then nonpres=1;  /*N=26*/
if NIS_Nummer in (1906,1908,1955,2071,2076,2090,2155,2307,2628,2630,2892,3039,3078,3806,4056,4306,5394,6173,6473,6501,6506,6509,6512,6519,6520,6523,6524,6526,6529,6531,
6612,6630,6638,6639,6646,6651,6655,6656,6658,6670,6671,6675,6685,6687,6690,6691,6716,6738,6762,6764,6765,6774,6779,6780,6788,6789,6791,6795,6804,6816,6830,6832,6841,
6868,6872,6873,6879,7036) then oncology=1;         /*N=69*/
if NIS_Nummer in (2228,2269,6609) then orphan=1; /*N=3*/
if NIS_Nummer in (2095,2529,2549,2707,5361,5508,6643,6645,6667,6679,6724,6728,6756,6801,6828,6856,6861,6887) then others=1; /*N=18*/
if NIS_Nummer in (1988,6518,6654,6740,6777,6782,6806,6831,6863,7037)then pain=1; /*N=10*/
if NIS_Nummer in (2389,2608,3598,3774,3820,6751)then urologics=1;  /*N=6*/
run;


data drugs1;
set datensatz_1416_Erstanzeige;
if NIS_Nummer=1847 then drug="Imnovid";
if NIS_Nummer=1851 then drug="Actikerall II";
if NIS_Nummer=1856 then drug="Iluvien";
if NIS_Nummer=1867 then drug="Leuprone HEXAL";
if NIS_Nummer=1888 then drug="Angocin";
if NIS_Nummer=1906 then drug="Gemedac";
if NIS_Nummer=1908 then drug="Taxceus";
if NIS_Nummer=1928 then drug="Botox";
if NIS_Nummer=1949 then drug="Azilect";
if NIS_Nummer=1955 then drug="Giotrif";
if NIS_Nummer=1969 then drug="Monovo Emulsion ";
if NIS_Nummer=1969 then drug2="Xamiol Gel ";
if NIS_Nummer=1969 then drug3="Daivobet Gel";
if NIS_Nummer=1988 then drug="Naropin";
if NIS_Nummer=1990 then drug="Deximune";
if NIS_Nummer=2027 then drug="Grippostad C";
if NIS_Nummer=2046 then drug="Extavia";
run;


Dear community,

im stuck again and dont know how to write this command.

I have a data set with different clinical trials of pharmaceuticals. Each trial has a NIS Number and sometimes there was only one drug that was tested, some trials tested up to 8 different drugs.

Now I want to create a new variable, called "unique drugs". I need to know how many different drugs were tested.

 

For example:

Trial No. 1 Nis_Number=1234, drug = ASS

Trial No.2 Nis_Number=1235, drug= ibuprofen, drug2=Paracetamol

Trial No.3 Nis_Number=1236, drug=.

trial No. 4 Nis_Number=1237, drug=Apixaban, drug2=Eliquis, drug3=Ibuprofen, drug4=Brillique, drug5=ASS

 

How can I count the amount of unique drugs?

E.g. I want to know how many Trials tested ASS and how many trials was Ibuprofen tested.

Thank you so much in advance

1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @marysmith,

 

Unfortunately, I have never seen or used that data2datastep macro yet. But your macro call (the RUN statement is unnecessary) would imply that you have a dataset named drugs in a library named drugs_new and you want to create an external file named uniquedrugs (presumably without a suffix such as .txt) containing SAS code for creating a copy of dataset drugs_new.drugs.
Maybe that's not what you really want.

 

Can you please be more specific what you mean by "it didn't work"? Error messages in the log will be helpful to understand what went wrong.

 

As to your initial post, you are asking two different questions:

  1. You want to count unique drugs. Is this within a trial, so that, in your example, the new variable unique_drugs should receive values 1, 2, 0 and 5, respectively, for trials 1, 2, 3 and 4? Or is it across trials?
  2. You want to count trials involving specific drugs.

 

The simplified code example below demonstrates (part of) what you can do with the "transpose" and "count(distinct ...)" techniques @Kurt_Bremser suggested.

 

/* Create test data */

data have;
input trial NIS_Nummer drug $ drug2 $ drug3 $;
cards;
1 1234 ABC DEF ABC
2 1235 CCC DDD EEE
3 1236 DEF .   .
4 1237 .   .   .
5 1238 CCC DEF .
;

/* Transpose to a long format */

proc transpose data=have out=long(rename=(col1=drug));
by trial NIS_Nummer;
var drug:;
run;

/* Count unique drugs per trial */

proc sql;
select trial, NIS_Nummer, count(distinct drug) as unique_drugs
from long
group by trial, NIS_Nummer;
quit;

/* Count unique trials per drug */

proc sql;
select drug, count(distinct trial) as trials
from long
where drug ne ' '
group by drug;
quit;

 

Caveat: When it comes to counting "unique drugs", you are probably aware of the many issues which can arise if drug names in your data are not standardized (typos, upper/lower case, abbreviations, trade names vs. generic vs. substance names, ...) or missing values are denoted differently ("n/a", blank, ...).

View solution in original post

7 REPLIES 7
Kurt_Bremser
Super User

Transpose the data set to a long format, so you have only one drug variable, and then use count(distinct()).

 

For code, post a data step of your dataset - see my footnotes.

marysmith
Calcite | Level 5

Thank you again for your quick replay! Unfortuantely I dont know how to transpose the data set to a long format 😞 Can you help me with this? 🙂

marysmith
Calcite | Level 5
Okay thank you I will try this!
marysmith
Calcite | Level 5

I tried it like this but it didnt work 😞

What did I do wrong? Can you please tell me?

 

%data2datastep(drugs, drugs_new, uniquedrugs);
    run;

 

FreelanceReinh
Jade | Level 19

Hi @marysmith,

 

Unfortunately, I have never seen or used that data2datastep macro yet. But your macro call (the RUN statement is unnecessary) would imply that you have a dataset named drugs in a library named drugs_new and you want to create an external file named uniquedrugs (presumably without a suffix such as .txt) containing SAS code for creating a copy of dataset drugs_new.drugs.
Maybe that's not what you really want.

 

Can you please be more specific what you mean by "it didn't work"? Error messages in the log will be helpful to understand what went wrong.

 

As to your initial post, you are asking two different questions:

  1. You want to count unique drugs. Is this within a trial, so that, in your example, the new variable unique_drugs should receive values 1, 2, 0 and 5, respectively, for trials 1, 2, 3 and 4? Or is it across trials?
  2. You want to count trials involving specific drugs.

 

The simplified code example below demonstrates (part of) what you can do with the "transpose" and "count(distinct ...)" techniques @Kurt_Bremser suggested.

 

/* Create test data */

data have;
input trial NIS_Nummer drug $ drug2 $ drug3 $;
cards;
1 1234 ABC DEF ABC
2 1235 CCC DDD EEE
3 1236 DEF .   .
4 1237 .   .   .
5 1238 CCC DEF .
;

/* Transpose to a long format */

proc transpose data=have out=long(rename=(col1=drug));
by trial NIS_Nummer;
var drug:;
run;

/* Count unique drugs per trial */

proc sql;
select trial, NIS_Nummer, count(distinct drug) as unique_drugs
from long
group by trial, NIS_Nummer;
quit;

/* Count unique trials per drug */

proc sql;
select drug, count(distinct trial) as trials
from long
where drug ne ' '
group by drug;
quit;

 

Caveat: When it comes to counting "unique drugs", you are probably aware of the many issues which can arise if drug names in your data are not standardized (typos, upper/lower case, abbreviations, trade names vs. generic vs. substance names, ...) or missing values are denoted differently ("n/a", blank, ...).

Kurt_Bremser
Super User

@marysmith wrote:

I tried it like this but it didnt work 😞

What did I do wrong? Can you please tell me?

 

%data2datastep(drugs, drugs_new, uniquedrugs);
    run;

 


When something "does not work", always post the log.

 

The macro accepts up to 4 positional parameters:

  1. dataset name within the library - required
  2. library name - optional (if none given, will default to WORK)
  3. fully qualified filename of file to be created - optional. Default is "create_&lib._&dsn._data.sas" in current working directory.
  4. number of observations - optional

Since you specified no path for the output file, the macro probably tried to write to a location where you don't have write permission.

In case of SAS UE, I'd use a filename like /folders/myfolder/create_dataset.sas. After this is created, you can pick it up from the Windows/Linux/MacOS side with any text editor and copy/paste the contents here.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 835 views
  • 0 likes
  • 3 in conversation