Hello, I have generated two summary tables in SAS from my data for two different years. I want to identify changes and differences in cell counts. At present, I have merged the two tables using PROC SQL and planned to perform a column-by-column comparison for all grades and totals. While this approach works, I am curious if there is a more efficient method for comparing the counts of two tables and showing the differences. So, I am wondering whether PROC COMPARE could facilitate such a comparison, particularly for two significantly larger count tables?
YEAR 1:
CATEGORY | Grade_1 | Grade_2 | Grade_3 |
A | 3 | 3 | 0 |
B | 2 | 1 | 0 |
C | 0 | 2 | 0 |
D | 0 | 0 | 1 |
E | 4 | 2 | 2 |
total | 9 | 8 | 3 |
YEAR 2:
CATEGORY | Grade_1 | Grade_2 | Grade_3 |
A | 3 | 3 | 0 |
C | 0 | 2 | 0 |
D | 0 | 0 | 1 |
E | 9 | 2 | 2 |
F | 2 | 0 | 0 |
total | 14 | 7 | 3 |
table from join: I'll just do a flag if grade_1 differs grade1a for each row etc.
CATEGORY | Grade_1 | Grade_2 | Grade_3 | Category_year2 | Grade_1a | Grade_2a | Grade_3a |
A | 3 | 3 | 0 | A | 3 | 3 | 0 |
B | 2 | 1 | 0 | ||||
C | 0 | 2 | 0 | C | 0 | 2 | 0 |
D | 0 | 0 | 1 | D | 0 | 0 | 1 |
E | 9 | 2 | 2 | E | 9 | 2 | 2 |
F | 2 | 0 | 0 | ||||
total | 14 | 8 | 3 | total | 14 | 7 | 3 |
As @ballardw pointed out, using option OUT= or other option to output the difference ,and you will see them.
proc compare base=a compare=b brief transpose out=want; id category; run;
Can you explain more clearly what you are trying to compare? When you say "identify changes and differences in cell counts" do you mean you want to compare Category A, Grade 1 from Year 1 with Category A, Grade 1 from Year 2? Or, are you just interested in row or column totals?
Yes, I want to compare Category A, Grade 1 from Year 1 with Category A, Grade 1 from Year 2.
I am not familiar with proc compare, and I personally would probably write some manual code if I knew exactly what I wanted.
That said, it looks like there are some options in proc compare that might get you what you want. I think the "Value comparison results for variables" pages might be of interest to you.
I started with the below, but there are several other options here:
data a;
input category $ grade_1 grade_2 grade_3;
datalines;
A 3 3 0
B 2 1 0
C 0 2 0
D 0 0 1
E 4 2 2
Total 9 8 3
;
RUN;
data b;
input category $ grade_1 grade_2 grade_3;
datalines;
A 3 3 0
C 0 2 0
D 0 0 1
E 9 2 2
F 2 0 0
Total 14 7 3
;
RUN;
proc compare base=a compare=b printall;
run;
Or just for variety, assuming your datasets are of the same dimensions and are named test1 and test2:
proc iml;
use test1;
read all into A; close test1;
use test2;
read all into B; close test2;
title 'differences';
print(A-B);
quit;
...or if you really want to get crazy:
proc iml;
use test1;
read all into A [colname=names];
read all var {category} into rnames;
mattrib A rowname=rnames; close test1;
use test2;
read all into B [colname=names];
mattrib B rowname=rnames; close test2;
A=A-B;
title 'matrix A';
print(A);
names={'category'} || names;
names=names`;
create diffs from rnames A [colname=names];
append from rnames A;
close diffs;
quit;
title 'dataset DIFFS';
proc print data=diffs noobs; run;
I'll look into this because I'm not too familiar with proc iml. The datasets aren't guaranteed to be the same dimensions, at least row wise. They will have same columns
Yes. PROC COMPARE is your best company .
data a; input category $ grade_1 grade_2 grade_3; datalines; A 3 3 0 B 2 1 0 C 0 2 0 D 0 0 1 E 4 2 2 Total 9 8 3 ; RUN; data b; input category $ grade_1 grade_2 grade_3; datalines; A 3 3 0 C 0 2 0 D 0 0 1 E 9 2 2 F 2 0 0 Total 14 7 3 ; RUN; proc compare base=a compare=b brief transpose; id category; run;
COMPARE 过程 比较 WORK.A 与 WORK.B (METHOD=EXACT) 观测的比较结果 category=E: 变量 基准值 比较 差异 差异(%) grade_1 4.000000 9.000000 5.000000 125.000000 category=Total: 变量 基准值 比较 差异 差异(%) grade_1 9.000000 14.000000 5.000000 55.555556 grade_2 8.000000 7.000000 -1.000000 -12.500000 NOTE: 数据集 WORK.A 包含 1 个不在“WORK.B”中的观测。 NOTE: 数据集 WORK.B 包含 1 个不在“WORK.A”中的观测。 NOTE: 下列 2 个变量的值经比较不相等: grade_1 grade_2
I like this. And is there any way to output it into a table for easier reading?
There are many options for output data set creation in Proc Compare. The option OUT=libname.datasetname creates a data set, use the options OUTALL, OUTBASE, OUTCOMP, OUTDIF, OUTNOEQUAL and OUTPERCENT to control output. Note that some of these data set formats are harder to read than the results until you figure them out. Look in the Details section of the online help for descriptions of some of the variables that appear in the data set.
ODS OUTPUT can produce data sets that more closely resemble the output in the results window. Reference the online help for the name s of the objects to request in one or more ODS OUTPUT statements.
As @ballardw pointed out, using option OUT= or other option to output the difference ,and you will see them.
proc compare base=a compare=b brief transpose out=want; id category; run;
This appears to be the solution I needed, as it seems to offer a faster method compared to my manual approach of merging and calculating differences.
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.