- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi experts,
I am fairly new to SAS and I'm grateful for any input.
I would like to create a 2x2 table in which the number of rows are retrieved from multiple variables in multiple tables in order to calculate sensitivity and specificity.
Specifically, I want to retreive the number of rows in three different columns in three different tables. The three counts reflect the number of records in database 1, database 2 and number of matches between the two databases.
Ideally, I'd like SAS to calculate the missing values and end up with a table like this one:
Database 2 | Yes | No | Total |
Database 1 | |||
Yes | 700 | to be calculated | 800 |
No | to be calculated | to be calculated | to be calculated |
Total | 920 |
However I am already stuck at retrieving the counts. I have tried with the following insert into statement (table3 has already been created) but it doesn't work.
proc sql;
insert into table3 (count1, count2, match)
select count (ID1) from table1,
select count (ID2) from table2,
select count (match) from table3;
Any help will be appreciated.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not really sure what the output table you show is designed to represent, but I can help you get the statistics. Something like this:
data one;
do ID1=1 to 100 by 2;
output;
end;
run;
data two;
do ID2=1 to 100 by 3;
output;
end;
run;
proc sql;
title 'Input Table Row Counts';
select 'One' as Table, count(*) as Count from One
union all
select 'Two',count(*) from Two
;
title 'Table Comparison Counts';
select
sum (id1 and id2) as Match
,sum (sum(missing(id1),missing(id2))) as NoMatch
,count(ID1) as Rows1 'Rows in One'
,count(ID2) as Rows2 'Rows in Two'
,count(*) as JoinRows 'Full Join Rows'
,sum(count(ID1),count(ID2)) as Total 'All Rows'
from one
full join
two
on one.id1=two.id2
;
quit;
Produces this result:
Input Table Row Counts
Table | Count |
---|---|
One | 50 |
Two | 34 |
Match | NoMatch | Rows in One | Rows in Two | Full Join Rows | All Rows |
---|---|---|---|---|---|
17 | 50 | 50 | 34 | 67 | 84 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
We would need to see the data in these multiple tables in order to provide code. Please provide (a portion of) the data as working SAS data step code, which you can type in yourself, or use these instructions. Make sure the code is working. Do not provide data as screen captures or as file attachments.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Not really sure what the output table you show is designed to represent, but I can help you get the statistics. Something like this:
data one;
do ID1=1 to 100 by 2;
output;
end;
run;
data two;
do ID2=1 to 100 by 3;
output;
end;
run;
proc sql;
title 'Input Table Row Counts';
select 'One' as Table, count(*) as Count from One
union all
select 'Two',count(*) from Two
;
title 'Table Comparison Counts';
select
sum (id1 and id2) as Match
,sum (sum(missing(id1),missing(id2))) as NoMatch
,count(ID1) as Rows1 'Rows in One'
,count(ID2) as Rows2 'Rows in Two'
,count(*) as JoinRows 'Full Join Rows'
,sum(count(ID1),count(ID2)) as Total 'All Rows'
from one
full join
two
on one.id1=two.id2
;
quit;
Produces this result:
Input Table Row Counts
Table | Count |
---|---|
One | 50 |
Two | 34 |
Match | NoMatch | Rows in One | Rows in Two | Full Join Rows | All Rows |
---|---|---|---|---|---|
17 | 50 | 50 | 34 | 67 | 84 |
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If that is supposed to a simple cross tab (Think PROC FREQ with two variables) then you are missing on piece of information to complete the table.
YES | NO | ||
YES | A | B-A | B |
NO | C-A | x | C-A+x |
C | B-A+x |
You have given values for A, B and C which will allow you fill in two of the empty cells. But you can then pick any value for X and generate values for the other three cells.
With A=700 and B=800 and C=920 if we set X = 100 then the table will be:
YES | NO | ||
YES | 700 | 100 | 800 |
NO | 220 | 100 | 320 |
920 | 200 |
Do you know the overall N? If so then you do have enough information since C+B-A+x=N which means that X is N-C-B+A.