10-30-2012 02:04 PM
While merging i used IF A !!!
For each ID the symbol variable spans 35 times in the first dataset. BUT the disease code for that patient ranges from 0-35 meaning he might not have a code or has 35 of the codes.
I am merging this with dataset B which has ID common But has more than 35 count for ID(may be a thousand 101's) because of the splitting of the dollar amount
Also in the Dataset B the dollar amount is broken down for each patient
He might also have less than 35 records
ID symbol code
101 1 abc
101 2 def
ID total broken dollar amt(to satisfy 500 in all 101's)
101 500 20
101 500 15
101 500 100
101 500 so on which totals 500
IF HE HAS ONLY 11 Records in the B dataset he is getting the dollar amount on the 11th record retained 35 times because of the first dataset having 35 observations for each patient
When there are 11 in the Dataset B I want only those 11 and not 35
10-30-2012 02:27 PM
You cannot do a many-to-many merge via data steps if you only have one common key variable (in your case, ID) because SAS does not know how to "match" the say 11 records from B to the 35 records in A?
So you need at least another "key" variable that will help you identify uniquely each record in your data set (s).
For example, it would be very helpful if the dollar amt breakdown would have a code associated with it. Then you could sort both data sets by ID Code, then merge them by ID Code.
PROC SQL would work very well on your problem.:smileycool:
10-30-2012 02:40 PM
I would categorize Hash() as part of Datastep implementation, and with Hash() and enough memory, I can hardly see anything that Proc SQL can do while data step can't.
Just my 2 cents,
10-30-2012 01:43 PM
As long as you have large enough RAM to host one of two merging datasets, Hash() will be among most efficient:
do id=1 to 3;
do var1=1 to 2;
do id=1 to 3;
do var2=3 to 5;
if _n_=1 then do;
if 0 then set h2;
declare hash h2(dataset:'h2', multidata:'y');
do rc=0 by 0 while (rc=0);
10-30-2012 03:43 PM
I will put it this way.
I said if A and A dataset has 5000 observations.
I did a merge of A and B and final dataset has 5000 observations?? and I get that NOTE
Do I need to worry abt it??
10-30-2012 03:47 PM
At this stage, I think it is better for you to lay out what exact outcome you are expecting, preferably with sample tables( both incoming and outcoming). Then we can start from there.
10-30-2012 03:32 PM
The NOTE means that SAS does not know how to "match" the records in the two data sets.
One should never ignore that note and instead perform a correct merge.
10-30-2012 03:45 PM
I have to disagree. The NOTE is merely a reminder: " Wait a sec, are you sure this is what you want?", SAS does KNOW how to proceed under this scenario, and it does it with decent consistency. The only thing one should be certain is that given the predicable behavior of SAS, you know what you are getting.
one can manipulate data step to do what they want, regardless what they want makes any common sense or not.