Hi All,
I have a dataset with several drugs. The dataset has NDCNUM to identify drugs (One drug can have several NDC numbers), patid, date, cost, hospital visits and other variables.
a) I would like to see how many patients switched from drug A to drug B to drug C..... and vice versa.
b) Create a dataset for patients that are on Drug A to calculate the annual cost, hospital visits etc and similarly have datasets for other drugs.
Sample data set
Data have;
input patid svcdate ndcnum cost Drugtype $5. ;
format svcdate mmddyy8. ;
datalines;
1 11022009 944302602 1500 A
1 11052009 944302802 1580 A
1 11122009 944303002 1590 A
1 11162009 944303202 1600 A
1 11252009 944303402 1554 A
1 12012009 64406091101 3500 B
1 12052009 64406092201 3589 B
1 12102009 64406093301 3691 B
1 12182009 64406094401 3541 B
1 12242009 64406096601 3580 B
1 12262009 64406097701 3661 B
2 11022009 944302602 1500 A
2 11052009 944302802 1580 A
2 11122009 944303002 1590 A
2 11162009 944303202 1600 A
2 11252009 944303402 1554 A
2 12012009 71104091101 3500 C
2 12052009 71104092201 3589 C
2 12102009 71104093301 3691 C
2 12182009 71104094401 3541 C
2 12242009 71104096601 3580 C
2 12262009 71104097701 3661 C
2 11012010 53270027005 4125 D
2 11052010 53270027105 4785 D
2 11102010 53270027106 4321 D
2 11052010 53270027105 4785 D
2 11152010 71104097701 4321 C
2 11252010 71104097701 4785 C
2 11262010 71104097701 4321 C;
run;
For patient 1, the first 5 rows contain NDC code for Drug A the 6 to 11th row has NDC for Drug B. Similarly for patient 2, 12-17th row represent Drug A, 18-23rd row represents Drug C, 24-27th represent Drug D and row 28-30th represent switching back to Drug C.
Testing code is NOT a crime. You should test your data step code before posting it here, to make sure it runs withour ERRORs or WARNINGs and creates a faithful copy of your dataset, in terms of variable attributes and contents.
If WE have to fix that code so that it creates SOMETHING (which may not be YOUR dataset), the whole purpose of using the data step is undermined.
Do you really store your dates like this? If yes, this is the most unfavorable way to do it. You can't do any calculations, can't use SAS functions and formats for dates, and even the sort order is wrong.
This was the sample dataset I created, I have added the format for date variable
It has been a while since I had to work with NDC codes. I remember a constant irritant that such codes include manufacturing company information. Do the codes you show above include that information? Of do you want Drug A made by Company X to be differentiated from Drug A made by Company Y?
If not then you may need to either strip such from the codes you have for your use or tell us how to remove it.
For the analysis, it wont matter who the manufacturer is.
I have added a column in the dataset that represent type the type of drug based on NDC number. I think this will be helpful.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.