@beleeve wrote:
Sorry, I forgot to mention that the other variable is from only one of the data sets. Right now I have something like this, where var1 and var2 are in both data sets and var3 is only in data1:
data work.abc; set 'data1.sas7bdat'; set 'data2.sas7bdat';
proc sort; by var1;
proc freq; by var1; table var2*var3/ expected cellchi2 chisq; run;
It is past time that you learned to use libraries, such as WORK, instead of quoted file names like 'data1.sas7bdat'. Those just add complexity to the code.
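As a rough sketch of what using a library looks like: the libref MYLIB and the folder path below are placeholders I made up, so substitute the directory that actually holds your two files.

```sas
/* MYLIB and the path are hypothetical: point the libref at the folder */
/* that contains data1.sas7bdat and data2.sas7bdat.                   */
libname mylib 'C:\your\data\folder';

/* The files can now be referenced as libref.member, no quotes needed */
proc contents data=mylib.data1; run;
proc contents data=mylib.data2; run;
```

After the LIBNAME statement every step can refer to MYLIB.DATA1 and MYLIB.DATA2, and the path only has to be changed in one place.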
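Second, two SET statements behave differently than you expect, which is why you lose records. On each iteration the data step reads one observation from each set, the second SET overwrites the values of any variables the two sets share, and the step stops as soon as either set runs out of observations.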
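A small demonstration with made-up data (the data set names ONE, TWO, BOTH_SET and STACKED and all values are invented for illustration, not taken from your files):

```sas
/* Toy data: ONE has 3 observations, TWO has 2 */
data one;
   input var1 var2;
datalines;
1 10
2 20
3 30
;

data two;
   input var1 var2;
datalines;
4 40
5 50
;

/* Two SET statements: one observation is read from each set per        */
/* iteration, values from TWO overwrite those from ONE, and the step    */
/* ends when TWO runs out. Result: 2 observations, (4,40) and (5,50).   */
data both_set;
   set one;
   set two;
run;

/* One SET statement listing both sets stacks them: 5 observations */
data stacked;
   set one two;
run;

proc print data=both_set; run;
proc print data=stacked;  run;
```

Note that BOTH_SET contains nothing from ONE at all, because every shared variable was overwritten, and the third observation of ONE was never written out.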
If you want to run a chi-square test like that, how do you know that var2 and var3 come from the right combination of observations in your source sets? For chi-square to be meaningful the values have to be matched in some way, and your two SET statements are almost certainly not doing anything sensible.
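For illustration only, here is one way matching could look IF both sets carried a common identifier. The variable ID, the renamed VAR2_D2, the work data set names and all of the values are assumptions of mine. Nothing you have posted says such a key exists, or which var2 you intend to cross with var3.

```sas
/* Toy stand-ins for the two source sets; all values are made up.   */
/* ID is a hypothetical key identifying the same unit in both sets. */
data d1;
   input id var1 var2 var3;
datalines;
1 1 0 0
2 1 1 1
3 1 0 1
4 1 1 0
5 2 0 0
6 2 1 1
7 2 1 0
8 2 0 1
;

data d2;
   input id var1 var2;
datalines;
1 1 1
2 1 1
3 1 0
4 1 0
5 2 0
6 2 1
7 2 0
8 2 1
;

proc sort data=d1; by id; run;
proc sort data=d2; by id; run;

/* Match-merge on the key; rename VAR2 from D2 so it does not       */
/* overwrite VAR2 from D1, and keep only units found in both sets.  */
data abc;
   merge d1 (in=in1)
         d2 (in=in2 keep=id var2 rename=(var2=var2_d2));
   by id;
   if in1 and in2;
run;

proc sort data=abc; by var1; run;

proc freq data=abc;
   by var1;
   tables var2_d2*var3 / expected cellchi2 chisq;
run;
```

With tables this small SAS will warn about low expected cell counts; the point is only that each row of ABC pairs values belonging to the same ID.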
Show example data sets as data step code. Or send PROC PRINT output of the relevant variables from each set to the LISTING destination and paste the result into a text box opened on the forum with the </> icon above the message window.
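Something along these lines is what I mean. The data set name EXAMPLE1 and the values are placeholders; post a few rows that behave like your real data.

```sas
/* Example data as data step code: anyone can paste and run this. */
/* Name and values are placeholders, not your actual data.        */
data example1;
   input var1 var2 var3;
datalines;
1 0 1
1 1 0
2 0 0
;

/* Or print just the relevant variables to the LISTING destination */
ods listing;
proc print data=example1 (obs=10);
   var var1 var2 var3;
run;
```

Do the same for the second set, with only var1 and var2, so we can see how the observations in the two sets relate to each other.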