How do I write to a new variable based off a match from two datasets?

InspectahDex · Posted 11-06-2020 05:51 PM

I have four original datasets:

Dataset A, B, C, D

I also have a final dataset that combines various elements from those four that I'll call FinalDataset.

There is a variable in all four original datasets and my FinalDataset that is called NPI.

I want to write to a new variable in my FinalDataset called "NPIfound" if the NPI value in the FinalDataset is also found in one of the original datasets.

In other words, If datasetA.NPI = finaldataset.NPI, then output the value "datasetA" to NPIfound.

If that same NPI value is also found in datasetB then I want it to add to NPIfound so that it says "datasetA, datasetB" in the FinalDataset.

Reeza · Posted 11-06-2020 06:09 PM

Did you merge the data sets together using a data step merge or a different method? If you used a merge it's trivial to add that in using the IN option.

InspectahDex · Posted 11-06-2020 06:21 PM

I merged the datasets together using a data step merge. Why is it trivial to add that variable in using a IN option? Sorry, I'm new.

Astounding · Posted 11-06-2020 10:10 PM

Here's a simple way. Assuming all your data sets are already sorted by NPI:

data FinalDataset;
   merge a (in=from_a) b (in=from_b) c (in=from_c) d (in=from_d);
   by NPI;
   NPIfound = cats(from_a, from_b, from_c, from_d);
run;

By choosing the simple route, you don't get exactly what you asked for. Instead, you get a series of 0's and 1's. For example:

1010 represents found in A, not found in B, found in C, not found in D.

0001 represents found in D only.

Adding to the complexity by adjusting the program is again relatively easy, but this version might be easier to work with.

Learning about MERGE and in= variables is a basic tool you will use often. It is worth the time to learn.

How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

Re: How do I write to a new variable based off a match from two datasets?

SAS Innovate 2025: Save the Date