I am trying to make a table that only contains data that is not a 0. Here is an example of the table: there are acutal 2000k rows all together. This is just a snippet
COMPARISON | WHO | SVCD | SVDESC | ADMIT | COUNT |
1010 | 1010 | 2 | MOD | A | 2 |
1010 | 1010 | 2 | MOD | R | 0 |
1010 | NATIONAL | 2 | MOD | A | 41296 |
1010 | NATIONAL | 2 | MOD | R | 1909 |
1012 | 1012 | 2 | MOD | A | 2 |
1012 | 1012 | 2 | MOD | R | 0 |
1012 | NATIONAL | 2 | MOD | A | 41296 |
1012 | NATIONAL | 2 | MOD | R | 1909 |
1013 | 1013 | 2 | MOD | A | 2 |
1013 | 1013 | 2 | MOD | R | 1 |
1013 | NATIONAL | 2 | MOD | A | 41296 |
1013 | NATIONAL | 2 | MOD | R | 1909 |
1014 | 1014 | 0 | N/A | A | 1 |
1014 | 1014 | 0 | N/A | R | 0 |
1014 | NATIONAL | 0 | N/A | A | 12495 |
1014 | NATIONAL | 0 | N/A | R | 52 |
1015 | 1015 | 2 | MOD | A | 2 |
1015 | 1015 | 2 | MOD | R | 0 |
1015 | NATIONAL | 2 | MOD | A | 41296 |
1015 | NATIONAL | 2 | MOD | R | 1909 |
1016 | 1016 | 3 | MAJ | A | 9 |
1016 | 1016 | 3 | MAJ | R | 1 |
1016 | NATIONAL | 3 | MAJ | A | 7592 |
1016 | NATIONAL | 3 | MAJ | R | 863 |
1017 | 1017 | 3 | MAJ | A | 5 |
1017 | 1017 | 3 | MAJ | R | 0 |
1017 | NATIONAL | 3 | MAJ | A | 7592 |
1017 | NATIONAL | 3 | MAJ | R | 863 |
1018 | 1018 | 0 | N/A | A | 1 |
1018 | 1018 | 0 | N/A | R | 0 |
1018 | NATIONAL | 0 | N/A | A | 12495 |
1018 | NATIONAL | 0 | N/A | R | 52 |
1019 | 1019 | 1 | MIN | A | 1 |
1019 | 1019 | 1 | MIN | R | 0 |
1019 | NATIONAL | 1 | MIN | A | 31218 |
1019 | NATIONAL | 1 | MIN | R | 430 |
1020 | 1020 | 4 | E | A | 4 |
1020 | 1020 | 4 | E | R | 0 |
1020 | NATIONAL | 4 | E | A | 1105 |
1020 | NATIONAL | 4 | E | R | 200 |
1021 | 1021 | 2 | MOD | A | 2 |
1021 | 1021 | 2 | MOD | R | 0 |
1021 | NATIONAL | 2 | MOD | A | 41296 |
1021 | NATIONAL | 2 | MOD | R | 1909 |
1022 | 1022 | 1 | MIN | A | 3 |
1022 | 1022 | 1 | MIN | R | 0 |
1022 | NATIONAL | 1 | MIN | A | 31218 |
1022 | NATIONAL | 1 | MIN | R | 430 |
1023 | 1023 | 1 | MIN | A | 2 |
1023 | 1023 | 1 | MIN | R | 0 |
1023 | NATIONAL | 1 | MIN | A | 31218 |
1023 | NATIONAL | 1 | MIN | R | 430 |
1024 | 1024 | 1 | MIN | A | 1 |
1024 | 1024 | 1 | MIN | R | 0 |
1024 | NATIONAL | 1 | MIN | A | 31218 |
1024 | NATIONAL | 1 | MIN | R | 430 |
1025 | 1025 | 2 | MOD | A | 5 |
1025 | 1025 | 2 | MOD | R | 1 |
1025 | NATIONAL | 2 | MOD | A | 41296 |
1025 | NATIONAL | 2 | MOD | R | 1909 |
1026 | 1026 | 3 | MAJ | A | 14 |
1026 | 1026 | 3 | MAJ | R | 1 |
1026 | NATIONAL | 3 | MAJ | A | 7592 |
1026 | NATIONAL | 3 | MAJ | R | 863 |
1027 | 1027 | 2 | MOD | A | 3 |
1027 | 1027 | 2 | MOD | R | 0 |
1027 | NATIONAL | 2 | MOD | A | 41296 |
1027 | NATIONAL | 2 | MOD | R | 1909 |
1028 | 1028 | 3 | MAJ | A | 31 |
1028 | 1028 | 3 | MAJ | R | 0 |
1028 | NATIONAL | 3 | MAJ | A | 7592 |
1028 | NATIONAL | 3 | MAJ | R | 863 |
1029 | 1029 | 4 | E | A | 20 |
1029 | 1029 | 4 | E | R | 0 |
1029 | NATIONAL | 4 | E | A | 1105 |
1029 | NATIONAL | 4 | E | R | 200 |
1030 | 1030 | 1 | MIN | A | 1 |
1030 | 1030 | 1 | MIN | R | 0 |
1030 | NATIONAL | 1 | MIN | A | 31218 |
1030 | NATIONAL | 1 | MIN | R | 430 |
The goal is to do a create table query that give me all the data above but I do not want any information returned if a who or comparison has a ad that =r and the r = 0. This means all the information so that includes say for the first one 1010. The table would have no data on 1010 because the R = 0. I want the A gone as well. I only want A's returned where the R >0 and all other A's if the comparison or who has an R = to 0 all their information is not in the table. So the table when all is said and done should look like this: of course there are 20k rows in all and if the table were really this small I could manually delete but I cannot for 20k rows.
COMPARISON | WHO | SVCD | SVDESC | ADMIT | COUNT |
1013 | 1013 | 2 | MOD | A | 2 |
1013 | 1013 | 2 | MOD | R | 1 |
1013 | NATIONAL | 2 | MOD | A | 41296 |
1013 | NATIONAL | 2 | MOD | R | 1909 |
1016 | 1016 | 3 | MAJ | A | 9 |
1016 | 1016 | 3 | MAJ | R | 1 |
1016 | NATIONAL | 3 | MAJ | A | 7592 |
1016 | NATIONAL | 3 | MAJ | R | 863 |
1025 | 1025 | 2 | MOD | A | 5 |
1025 | 1025 | 2 | MOD | R | 1 |
1025 | NATIONAL | 2 | MOD | A | 41296 |
1025 | NATIONAL | 2 | MOD | R | 1909 |
1026 | 1026 | 3 | MAJ | A | 14 |
1026 | 1026 | 3 | MAJ | R | 1 |
1026 | NATIONAL | 3 | MAJ | A | 7592 |
1026 | NATIONAL | 3 | MAJ | R | 863 |
Have tried
proc sql;
create table as want
(select
a*.,
from have
where (ad = 'r' and count >0)
but that still returns those comparisons and who's that have an A under ad with a value and it just does not return the r portion. I need their whole thing gone as you can see from my example output.
If keen on SQL approach ;
proc sql;
create table want as
select * from have
group by comparison
having sum(upcase(admit)='R' and count=0)=0
;
quit;
Please note, the code above runs much slower than data step on my machine.
Haikuo
2X DOW seems to be one of the approaches:
data have;
input COMPARISON$ WHO$ SVCD SVDESC$ ADMIT$ COUNT;
cards;
1010 1010 2 MOD A 2
1010 1010 2 MOD R 0
1010 NATIONAL 2 MOD A 41296
1010 NATIONAL 2 MOD R 1909
1012 1012 2 MOD A 2
1012 1012 2 MOD R 0
1012 NATIONAL 2 MOD A 41296
1012 NATIONAL 2 MOD R 1909
1013 1013 2 MOD A 2
1013 1013 2 MOD R 1
1013 NATIONAL 2 MOD A 41296
1013 NATIONAL 2 MOD R 1909
1014 1014 0 N/A A 1
1014 1014 0 N/A R 0
1014 NATIONAL 0 N/A A 12495
1014 NATIONAL 0 N/A R 52
1015 1015 2 MOD A 2
1015 1015 2 MOD R 0
1015 NATIONAL 2 MOD A 41296
1015 NATIONAL 2 MOD R 1909
1016 1016 3 MAJ A 9
1016 1016 3 MAJ R 1
1016 NATIONAL 3 MAJ A 7592
1016 NATIONAL 3 MAJ R 863
1017 1017 3 MAJ A 5
1017 1017 3 MAJ R 0
1017 NATIONAL 3 MAJ A 7592
1017 NATIONAL 3 MAJ R 863
1018 1018 0 N/A A 1
1018 1018 0 N/A R 0
1018 NATIONAL 0 N/A A 12495
1018 NATIONAL 0 N/A R 52
1019 1019 1 MIN A 1
1019 1019 1 MIN R 0
1019 NATIONAL 1 MIN A 31218
1019 NATIONAL 1 MIN R 430
1020 1020 4 E A 4
1020 1020 4 E R 0
1020 NATIONAL 4 E A 1105
1020 NATIONAL 4 E R 200
1021 1021 2 MOD A 2
1021 1021 2 MOD R 0
1021 NATIONAL 2 MOD A 41296
1021 NATIONAL 2 MOD R 1909
1022 1022 1 MIN A 3
1022 1022 1 MIN R 0
1022 NATIONAL 1 MIN A 31218
1022 NATIONAL 1 MIN R 430
1023 1023 1 MIN A 2
1023 1023 1 MIN R 0
1023 NATIONAL 1 MIN A 31218
1023 NATIONAL 1 MIN R 430
1024 1024 1 MIN A 1
1024 1024 1 MIN R 0
1024 NATIONAL 1 MIN A 31218
1024 NATIONAL 1 MIN R 430
1025 1025 2 MOD A 5
1025 1025 2 MOD R 1
1025 NATIONAL 2 MOD A 41296
1025 NATIONAL 2 MOD R 1909
1026 1026 3 MAJ A 14
1026 1026 3 MAJ R 1
1026 NATIONAL 3 MAJ A 7592
1026 NATIONAL 3 MAJ R 863
1027 1027 2 MOD A 3
1027 1027 2 MOD R 0
1027 NATIONAL 2 MOD A 41296
1027 NATIONAL 2 MOD R 1909
1028 1028 3 MAJ A 31
1028 1028 3 MAJ R 0
1028 NATIONAL 3 MAJ A 7592
1028 NATIONAL 3 MAJ R 863
1029 1029 4 E A 20
1029 1029 4 E R 0
1029 NATIONAL 4 E A 1105
1029 NATIONAL 4 E R 200
1030 1030 1 MIN A 1
1030 1030 1 MIN R 0
1030 NATIONAL 1 MIN A 31218
1030 NATIONAL 1 MIN R 430
;
proc sort data=have;
by comparison;
run;
data want (drop=flag);
do until (last.comparison);
set have;
by comparison;
if upcase(admit)='R' and count=0 then flag=1;
end;
do until (last.comparison);
set have;
by comparison;
if flag ne 1 then output;
end;
run;
proc print;run;
Regards,
Haikuo
or maybe some sort of
DATA want; SET have;
IF AD = 'R' AND COUNT = 0 THEN DELETE;
RUN;
but that I fear only deletes the r record that matches and not the entire thing
If keen on SQL approach ;
proc sql;
create table want as
select * from have
group by comparison
having sum(upcase(admit)='R' and count=0)=0
;
quit;
Please note, the code above runs much slower than data step on my machine.
Haikuo
perfecto. Now I will not get erros on my CHI. I have 4896 rows that have 0's and it just throws the chi stuff off in SAS. and i do not need them for final analysis anyway. I have a pivot already off the national so i know what to compare to. thx so much
If you have 0's in your chisq you should be using an exact chisq approach.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.