DATA Step, Macro, Functions and more

How to create this variable?

Reply
Contributor
Posts: 41

How to create this variable?

[ Edited ]
data arzneimittelgruppen; 
set datensatz_1416_Erstanzeige;
if NIS_Nummer in (6513,6699) then allergy=1; /*N=2*/
if NIS_Nummer in (2047,6613,6623,6725,7030) then antibiotics=1; /*N=5*/
if NIS_Nummer in (3203,6622,6629,6649,6818,6819) then antidiabetic=1; /*N=6*/
if NIS_Nummer in (6653,6736,6771,6775,6837,6847) then antihypertensive=1;  /*N=5*/
if NIS_Nummer in (2407,2507,2532,3396,6635) then antipsychotic=1;  /*N=5*/
if NIS_Nummer in (2069,5266,6131,6703,6719) then antivirals=1; /*?*/ /*N=5*/
if NIS_Nummer in (2046,2289,6468,6626,6711) then biologicals=1; /*N=5*/
if NIS_Nummer in (2230,6709) then contrast=1; /*N=2*/
if NIS_Nummer in (2268,2552,2962,6657,6713,6799) then copd=1; /*N=6*/
if NIS_Nummer in (1851,1856,1969,1990,6194,6521,6784,6849,6880) then dermatology=1; /*N=9*/
if NIS_Nummer in (2087,2311,2590,3431,5080,6417,6419,6611,6618,6625,6652,6666,6673,6677,6683,6696,6720,6721,6815,6857,6858,6870,6876,6877,6878,6884) then registry=1;/*N=26*/
if NIS_Nummer in (2187,6170,6172,6752,6883) then eye=1; /*N=5*/
if NIS_Nummer in (2367,2408,2548,2568,3377,3387,6431,6474,6530,6534,6535,6536,6627,6642,6660,6661,6665,6668,6698,6759,6847,6869,6874,7038) then haematology=1; /*N=24*/
if NIS_Nummer in (1867,3432,4147,6663,6717,6755,6758,6766,6835) then hormone=1;    /*N=9*/
if NIS_Nummer in (1928,1949,2067,2075,2147,2211,2631,2632,5383,6449,6494,6532,6636,6640,6714,6772) then neurological=1; /*N=16*/
if NIS_Nummer in (2188) then nondrug=1; /*N=1*/
if NIS_Nummer in (1888,2027,2309,2314,2467,2527,2837,2980,3186,3612,4685,5451,5584,6471,6507,6637,6659,6669,6682,6694,6710,6735,6741,6773,6862,6889)then nonpres=1;  /*N=26*/
if NIS_Nummer in (1906,1908,1955,2071,2076,2090,2155,2307,2628,2630,2892,3039,3078,3806,4056,4306,5394,6173,6473,6501,6506,6509,6512,6519,6520,6523,6524,6526,6529,6531,
6612,6630,6638,6639,6646,6651,6655,6656,6658,6670,6671,6675,6685,6687,6690,6691,6716,6738,6762,6764,6765,6774,6779,6780,6788,6789,6791,6795,6804,6816,6830,6832,6841,
6868,6872,6873,6879,7036) then oncology=1;         /*N=69*/
if NIS_Nummer in (2228,2269,6609) then orphan=1; /*N=3*/
if NIS_Nummer in (2095,2529,2549,2707,5361,5508,6643,6645,6667,6679,6724,6728,6756,6801,6828,6856,6861,6887) then others=1; /*N=18*/
if NIS_Nummer in (1988,6518,6654,6740,6777,6782,6806,6831,6863,7037)then pain=1; /*N=10*/
if NIS_Nummer in (2389,2608,3598,3774,3820,6751)then urologics=1;  /*N=6*/
run;


data drugs1;
set datensatz_1416_Erstanzeige;
if NIS_Nummer=1847 then drug="Imnovid";
if NIS_Nummer=1851 then drug="Actikerall II";
if NIS_Nummer=1856 then drug="Iluvien";
if NIS_Nummer=1867 then drug="Leuprone HEXAL";
if NIS_Nummer=1888 then drug="Angocin";
if NIS_Nummer=1906 then drug="Gemedac";
if NIS_Nummer=1908 then drug="Taxceus";
if NIS_Nummer=1928 then drug="Botox";
if NIS_Nummer=1949 then drug="Azilect";
if NIS_Nummer=1955 then drug="Giotrif";
if NIS_Nummer=1969 then drug="Monovo Emulsion ";
if NIS_Nummer=1969 then drug2="Xamiol Gel ";
if NIS_Nummer=1969 then drug3="Daivobet Gel";
if NIS_Nummer=1988 then drug="Naropin";
if NIS_Nummer=1990 then drug="Deximune";
if NIS_Nummer=2027 then drug="Grippostad C";
if NIS_Nummer=2046 then drug="Extavia";
run;


Dear community,

im stuck again and dont know how to write this command.

I have a data set with different clinical trials of pharmaceuticals. Each trial has a NIS Number and sometimes there was only one drug that was tested, some trials tested up to 8 different drugs.

Now I want to create a new variable, called "unique drugs". I need to know how many different drugs were tested.

 

For example:

Trial No. 1 Nis_Number=1234, drug = ASS

Trial No.2 Nis_Number=1235, drug= ibuprofen, drug2=Paracetamol

Trial No.3 Nis_Number=1236, drug=.

trial No. 4 Nis_Number=1237, drug=Apixaban, drug2=Eliquis, drug3=Ibuprofen, drug4=Brillique, drug5=ASS

 

How can I count the amount of unique drugs?

E.g. I want to know how many Trials tested ASS and how many trials was Ibuprofen tested.

Thank you so much in advance

Super User
Posts: 9,923

Re: How to create this variable?

Posted in reply to marysmith

Transpose the data set to a long format, so you have only one drug variable, and then use count(distinct()).

 

For code, post a data step of your dataset - see my footnotes.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 41

Re: How to create this variable?

Posted in reply to KurtBremser

Thank you again for your quick replay! Unfortuantely I dont know how to transpose the data set to a long format Smiley Sad Can you help me with this? Smiley Happy

Super User
Posts: 9,923

Re: How to create this variable?

Posted in reply to marysmith

See the second line of my response.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 41

Re: How to create this variable?

Posted in reply to KurtBremser
Okay thank you I will try this!
Contributor
Posts: 41

Re: How to create this variable?

Posted in reply to KurtBremser

I tried it like this but it didnt work Smiley Sad

What did I do wrong? Can you please tell me?

 

%data2datastep(drugs, drugs_new, uniquedrugs);
    run;

 

Trusted Advisor
Posts: 1,163

Re: How to create this variable?

[ Edited ]
Posted in reply to marysmith

Hi @marysmith,

 

Unfortunately, I have never seen or used that data2datastep macro yet. But your macro call (the RUN statement is unnecessary) would imply that you have a dataset named drugs in a library named drugs_new and you want to create an external file named uniquedrugs (presumably without a suffix such as .txt) containing SAS code for creating a copy of dataset drugs_new.drugs.
Maybe that's not what you really want.

 

Can you please be more specific what you mean by "it didn't work"? Error messages in the log will be helpful to understand what went wrong.

 

As to your initial post, you are asking two different questions:

  1. You want to count unique drugs. Is this within a trial, so that, in your example, the new variable unique_drugs should receive values 1, 2, 0 and 5, respectively, for trials 1, 2, 3 and 4? Or is it across trials?
  2. You want to count trials involving specific drugs.

 

The simplified code example below demonstrates (part of) what you can do with the "transpose" and "count(distinct ...)" techniques @KurtBremser suggested.

 

/* Create test data */

data have;
input trial NIS_Nummer drug $ drug2 $ drug3 $;
cards;
1 1234 ABC DEF ABC
2 1235 CCC DDD EEE
3 1236 DEF .   .
4 1237 .   .   .
5 1238 CCC DEF .
;

/* Transpose to a long format */

proc transpose data=have out=long(rename=(col1=drug));
by trial NIS_Nummer;
var drug:;
run;

/* Count unique drugs per trial */

proc sql;
select trial, NIS_Nummer, count(distinct drug) as unique_drugs
from long
group by trial, NIS_Nummer;
quit;

/* Count unique trials per drug */

proc sql;
select drug, count(distinct trial) as trials
from long
where drug ne ' '
group by drug;
quit;

 

Caveat: When it comes to counting "unique drugs", you are probably aware of the many issues which can arise if drug names in your data are not standardized (typos, upper/lower case, abbreviations, trade names vs. generic vs. substance names, ...) or missing values are denoted differently ("n/a", blank, ...).

Super User
Posts: 9,923

Re: How to create this variable?

Posted in reply to marysmith

@marysmith wrote:

I tried it like this but it didnt work Smiley Sad

What did I do wrong? Can you please tell me?

 

%data2datastep(drugs, drugs_new, uniquedrugs);
    run;

 


When something "does not work", always post the log.

 

The macro accepts up to 4 positional parameters:

  1. dataset name within the library - required
  2. library name - optional (if none given, will default to WORK)
  3. fully qualified filename of file to be created - optional. Default is "create_&lib._&dsn._data.sas" in current working directory.
  4. number of observations - optional

Since you specified no path for the output file, the macro probably tried to write to a location where you don't have write permission.

In case of SAS UE, I'd use a filename like /folders/myfolder/create_dataset.sas. After this is created, you can pick it up from the Windows/Linux/MacOS side with any text editor and copy/paste the contents here.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Ask a Question
Discussion stats
  • 7 replies
  • 127 views
  • 0 likes
  • 3 in conversation