I am trying to make a table that only contains data that is not a 0. Here is an example of the table: there are acutal 2000k rows all together. This is just a snippet
COMPARISON | WHO | SVCD | SVDESC | ADMIT | COUNT |
1010 | 1010 | 2 | MOD | A | 2 |
1010 | 1010 | 2 | MOD | R | 0 |
1010 | NATIONAL | 2 | MOD | A | 41296 |
1010 | NATIONAL | 2 | MOD | R | 1909 |
1012 | 1012 | 2 | MOD | A | 2 |
1012 | 1012 | 2 | MOD | R | 0 |
1012 | NATIONAL | 2 | MOD | A | 41296 |
1012 | NATIONAL | 2 | MOD | R | 1909 |
1013 | 1013 | 2 | MOD | A | 2 |
1013 | 1013 | 2 | MOD | R | 1 |
1013 | NATIONAL | 2 | MOD | A | 41296 |
1013 | NATIONAL | 2 | MOD | R | 1909 |
1014 | 1014 | 0 | N/A | A | 1 |
1014 | 1014 | 0 | N/A | R | 0 |
1014 | NATIONAL | 0 | N/A | A | 12495 |
1014 | NATIONAL | 0 | N/A | R | 52 |
1015 | 1015 | 2 | MOD | A | 2 |
1015 | 1015 | 2 | MOD | R | 0 |
1015 | NATIONAL | 2 | MOD | A | 41296 |
1015 | NATIONAL | 2 | MOD | R | 1909 |
1016 | 1016 | 3 | MAJ | A | 9 |
1016 | 1016 | 3 | MAJ | R | 1 |
1016 | NATIONAL | 3 | MAJ | A | 7592 |
1016 | NATIONAL | 3 | MAJ | R | 863 |
1017 | 1017 | 3 | MAJ | A | 5 |
1017 | 1017 | 3 | MAJ | R | 0 |
1017 | NATIONAL | 3 | MAJ | A | 7592 |
1017 | NATIONAL | 3 | MAJ | R | 863 |
1018 | 1018 | 0 | N/A | A | 1 |
1018 | 1018 | 0 | N/A | R | 0 |
1018 | NATIONAL | 0 | N/A | A | 12495 |
1018 | NATIONAL | 0 | N/A | R | 52 |
1019 | 1019 | 1 | MIN | A | 1 |
1019 | 1019 | 1 | MIN | R | 0 |
1019 | NATIONAL | 1 | MIN | A | 31218 |
1019 | NATIONAL | 1 | MIN | R | 430 |
1020 | 1020 | 4 | E | A | 4 |
1020 | 1020 | 4 | E | R | 0 |
1020 | NATIONAL | 4 | E | A | 1105 |
1020 | NATIONAL | 4 | E | R | 200 |
1021 | 1021 | 2 | MOD | A | 2 |
1021 | 1021 | 2 | MOD | R | 0 |
1021 | NATIONAL | 2 | MOD | A | 41296 |
1021 | NATIONAL | 2 | MOD | R | 1909 |
1022 | 1022 | 1 | MIN | A | 3 |
1022 | 1022 | 1 | MIN | R | 0 |
1022 | NATIONAL | 1 | MIN | A | 31218 |
1022 | NATIONAL | 1 | MIN | R | 430 |
1023 | 1023 | 1 | MIN | A | 2 |
1023 | 1023 | 1 | MIN | R | 0 |
1023 | NATIONAL | 1 | MIN | A | 31218 |
1023 | NATIONAL | 1 | MIN | R | 430 |
1024 | 1024 | 1 | MIN | A | 1 |
1024 | 1024 | 1 | MIN | R | 0 |
1024 | NATIONAL | 1 | MIN | A | 31218 |
1024 | NATIONAL | 1 | MIN | R | 430 |
1025 | 1025 | 2 | MOD | A | 5 |
1025 | 1025 | 2 | MOD | R | 1 |
1025 | NATIONAL | 2 | MOD | A | 41296 |
1025 | NATIONAL | 2 | MOD | R | 1909 |
1026 | 1026 | 3 | MAJ | A | 14 |
1026 | 1026 | 3 | MAJ | R | 1 |
1026 | NATIONAL | 3 | MAJ | A | 7592 |
1026 | NATIONAL | 3 | MAJ | R | 863 |
1027 | 1027 | 2 | MOD | A | 3 |
1027 | 1027 | 2 | MOD | R | 0 |
1027 | NATIONAL | 2 | MOD | A | 41296 |
1027 | NATIONAL | 2 | MOD | R | 1909 |
1028 | 1028 | 3 | MAJ | A | 31 |
1028 | 1028 | 3 | MAJ | R | 0 |
1028 | NATIONAL | 3 | MAJ | A | 7592 |
1028 | NATIONAL | 3 | MAJ | R | 863 |
1029 | 1029 | 4 | E | A | 20 |
1029 | 1029 | 4 | E | R | 0 |
1029 | NATIONAL | 4 | E | A | 1105 |
1029 | NATIONAL | 4 | E | R | 200 |
1030 | 1030 | 1 | MIN | A | 1 |
1030 | 1030 | 1 | MIN | R | 0 |
1030 | NATIONAL | 1 | MIN | A | 31218 |
1030 | NATIONAL | 1 | MIN | R | 430 |
The goal is to do a create table query that give me all the data above but I do not want any information returned if a who or comparison has a ad that =r and the r = 0. This means all the information so that includes say for the first one 1010. The table would have no data on 1010 because the R = 0. I want the A gone as well. I only want A's returned where the R >0 and all other A's if the comparison or who has an R = to 0 all their information is not in the table. So the table when all is said and done should look like this: of course there are 20k rows in all and if the table were really this small I could manually delete but I cannot for 20k rows.
COMPARISON | WHO | SVCD | SVDESC | ADMIT | COUNT |
1013 | 1013 | 2 | MOD | A | 2 |
1013 | 1013 | 2 | MOD | R | 1 |
1013 | NATIONAL | 2 | MOD | A | 41296 |
1013 | NATIONAL | 2 | MOD | R | 1909 |
1016 | 1016 | 3 | MAJ | A | 9 |
1016 | 1016 | 3 | MAJ | R | 1 |
1016 | NATIONAL | 3 | MAJ | A | 7592 |
1016 | NATIONAL | 3 | MAJ | R | 863 |
1025 | 1025 | 2 | MOD | A | 5 |
1025 | 1025 | 2 | MOD | R | 1 |
1025 | NATIONAL | 2 | MOD | A | 41296 |
1025 | NATIONAL | 2 | MOD | R | 1909 |
1026 | 1026 | 3 | MAJ | A | 14 |
1026 | 1026 | 3 | MAJ | R | 1 |
1026 | NATIONAL | 3 | MAJ | A | 7592 |
1026 | NATIONAL | 3 | MAJ | R | 863 |
Have tried
proc sql;
create table as want
(select
a*.,
from have
where (ad = 'r' and count >0)
but that still returns those comparisons and who's that have an A under ad with a value and it just does not return the r portion. I need their whole thing gone as you can see from my example output.
If keen on SQL approach ;
proc sql;
create table want as
select * from have
group by comparison
having sum(upcase(admit)='R' and count=0)=0
;
quit;
Please note, the code above runs much slower than data step on my machine.
Haikuo
2X DOW seems to be one of the approaches:
data have;
input COMPARISON$ WHO$ SVCD SVDESC$ ADMIT$ COUNT;
cards;
1010 1010 2 MOD A 2
1010 1010 2 MOD R 0
1010 NATIONAL 2 MOD A 41296
1010 NATIONAL 2 MOD R 1909
1012 1012 2 MOD A 2
1012 1012 2 MOD R 0
1012 NATIONAL 2 MOD A 41296
1012 NATIONAL 2 MOD R 1909
1013 1013 2 MOD A 2
1013 1013 2 MOD R 1
1013 NATIONAL 2 MOD A 41296
1013 NATIONAL 2 MOD R 1909
1014 1014 0 N/A A 1
1014 1014 0 N/A R 0
1014 NATIONAL 0 N/A A 12495
1014 NATIONAL 0 N/A R 52
1015 1015 2 MOD A 2
1015 1015 2 MOD R 0
1015 NATIONAL 2 MOD A 41296
1015 NATIONAL 2 MOD R 1909
1016 1016 3 MAJ A 9
1016 1016 3 MAJ R 1
1016 NATIONAL 3 MAJ A 7592
1016 NATIONAL 3 MAJ R 863
1017 1017 3 MAJ A 5
1017 1017 3 MAJ R 0
1017 NATIONAL 3 MAJ A 7592
1017 NATIONAL 3 MAJ R 863
1018 1018 0 N/A A 1
1018 1018 0 N/A R 0
1018 NATIONAL 0 N/A A 12495
1018 NATIONAL 0 N/A R 52
1019 1019 1 MIN A 1
1019 1019 1 MIN R 0
1019 NATIONAL 1 MIN A 31218
1019 NATIONAL 1 MIN R 430
1020 1020 4 E A 4
1020 1020 4 E R 0
1020 NATIONAL 4 E A 1105
1020 NATIONAL 4 E R 200
1021 1021 2 MOD A 2
1021 1021 2 MOD R 0
1021 NATIONAL 2 MOD A 41296
1021 NATIONAL 2 MOD R 1909
1022 1022 1 MIN A 3
1022 1022 1 MIN R 0
1022 NATIONAL 1 MIN A 31218
1022 NATIONAL 1 MIN R 430
1023 1023 1 MIN A 2
1023 1023 1 MIN R 0
1023 NATIONAL 1 MIN A 31218
1023 NATIONAL 1 MIN R 430
1024 1024 1 MIN A 1
1024 1024 1 MIN R 0
1024 NATIONAL 1 MIN A 31218
1024 NATIONAL 1 MIN R 430
1025 1025 2 MOD A 5
1025 1025 2 MOD R 1
1025 NATIONAL 2 MOD A 41296
1025 NATIONAL 2 MOD R 1909
1026 1026 3 MAJ A 14
1026 1026 3 MAJ R 1
1026 NATIONAL 3 MAJ A 7592
1026 NATIONAL 3 MAJ R 863
1027 1027 2 MOD A 3
1027 1027 2 MOD R 0
1027 NATIONAL 2 MOD A 41296
1027 NATIONAL 2 MOD R 1909
1028 1028 3 MAJ A 31
1028 1028 3 MAJ R 0
1028 NATIONAL 3 MAJ A 7592
1028 NATIONAL 3 MAJ R 863
1029 1029 4 E A 20
1029 1029 4 E R 0
1029 NATIONAL 4 E A 1105
1029 NATIONAL 4 E R 200
1030 1030 1 MIN A 1
1030 1030 1 MIN R 0
1030 NATIONAL 1 MIN A 31218
1030 NATIONAL 1 MIN R 430
;
proc sort data=have;
by comparison;
run;
data want (drop=flag);
do until (last.comparison);
set have;
by comparison;
if upcase(admit)='R' and count=0 then flag=1;
end;
do until (last.comparison);
set have;
by comparison;
if flag ne 1 then output;
end;
run;
proc print;run;
Regards,
Haikuo
or maybe some sort of
DATA want; SET have;
IF AD = 'R' AND COUNT = 0 THEN DELETE;
RUN;
but that I fear only deletes the r record that matches and not the entire thing
If keen on SQL approach ;
proc sql;
create table want as
select * from have
group by comparison
having sum(upcase(admit)='R' and count=0)=0
;
quit;
Please note, the code above runs much slower than data step on my machine.
Haikuo
perfecto. Now I will not get erros on my CHI. I have 4896 rows that have 0's and it just throws the chi stuff off in SAS. and i do not need them for final analysis anyway. I have a pivot already off the national so i know what to compare to. thx so much
If you have 0's in your chisq you should be using an exact chisq approach.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.