You're right, you had already posted info about the characteristics of the variables.
Here's the idea of what needs to happen for the second set of questions. I will use the medication data as an example, since that is required for the most difficult question (number of patients per diagnostic code). That data set contains many observations for some patients, perhaps 0 observations for some patients. And with 3 diagnosis codes, you may need to assign a patient to more than one diagnosis.
First, you need to create up to 3 observations for each existing observation (one observation per diagnosis code). Once you have done that, you need to reduce that data set, eliminating duplicate occurrences of patient + diagnosis code. Then you need to summarize it, getting just one observation per diagnosis code, with the number of patients calculated. As a learning experience, I would suggest you try your hand at this, and begin by looking up documentation on the OUTPUT statement. After a while, if it becomes too difficult, we can review what you have come up with so far.
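If you want something to compare your own attempt against, here is a rough sketch of those three steps. All of the names are assumptions on my part, since I don't know your actual data: MEDS for the medication data set, PATIENT_ID for the patient identifier, and DIAG1-DIAG3 for the three diagnosis codes. It also assumes the three diagnosis variables share the same type and length. Adjust it to match your data.

```sas
/* Hypothetical names: MEDS, PATIENT_ID, DIAG1-DIAG3 are assumptions. */

/* Step 1: write up to 3 observations per input observation,      */
/* one for each non-missing diagnosis code.                       */
data long;
   set meds;
   array dx {3} diag1-diag3;
   do i = 1 to 3;
      if not missing(dx{i}) then do;
         diagnosis = dx{i};
         output;            /* explicit OUTPUT: one row per code  */
      end;
   end;
   keep patient_id diagnosis;
run;

/* Step 2: keep one observation per patient + diagnosis code.     */
proc sort data=long out=unique nodupkey;
   by diagnosis patient_id;
run;

/* Step 3: one observation per diagnosis code; COUNT is then the  */
/* number of distinct patients with that code.                    */
proc freq data=unique;
   tables diagnosis / out=patients_per_dx (drop=percent);
run;
```

Patients with no observations in the medication data never reach the DATA step, so they are not counted under any diagnosis.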