Hello,
I would like to produce a number of final tables that include rows for observations with values of 0.
I have a few tables I need to do this with. All tables have only two columns - the identifier and the value. All tables share the same 1-2 characters as the base identifier, and then each table has additional letters added on. Table 1 adds a-b, Table 2 adds a-c, Table 3 adds a-g, etc. I know this is confusing, let me show you:
The base identifiers are kind of random but they are the same for every table. They are:
A |
| B1 |
| B2 |
| B3 |
| C1 |
| C2 |
| D |
| E1 |
| E2 |
| etc. etc. |
So my desired output for Table 1 (adding a-b to each base identifier) looks like this:
| t1_identifier | total_t1_value |
| Aa | 0 |
| Ab | 12 |
| B1a | 10 |
| B1b | 8 |
| B2a | 0 |
| B2b | 3 |
| B3a | 2 |
| B3b | 0 |
| C1a | 7 |
| etc. etc. |
|
And my desired output for Table 2 (adding a-c to each base identifier) looks like this:
| t2_identifier | total_t2_value |
| Aa | 15 |
| Ab | 0 |
| Ac | 20 |
| B1a | 4 |
| B1b | 7 |
| B1c | 0 |
| B2a | 6 |
| etc. etc. |
Right now my tables are missing the rows with zero values, since there are no observations with that identifier in the original table. They look the same as above, just without the zero rows. For example, my current output for Table 1 is this:
| t1_identifier | total_t1_value |
| Ab | 12 |
| B1a | 10 |
| B1b | 8 |
| B2b | 3 |
| B3a | 2 |
| C1a | 7 |
| etc. etc. |
|
My code to produce this table is really simple, just summing up the total by the identifier:
proc sql;
create table Table_1 as
select t1_identifier, sum(t1_value) as total_t1_value
from master_table
group by t1_identifier
;
quit;
I want to add in rows where the zero values are for each table, so that all possible combinations of the identifier (base identifier * the letter combos for each table) are included for each table. How can I format my final table to include these zero rows?
Thank you in advance for your help! I'm on SAS 9.4 if it matters.
Make data set with all of the identifiers you want in the output.
Use that to left join your data you want to summarize from.
proc sql;
create table Table_1 as
select a.t1_identifier, sum(b.t1_value) as total_t1_value
from setwithids as a
left join
master_table as b
on a.t1_identifier = b.t1_identifier
group by t1_identifier
;
quit;
You should get a MISSING value for the sum. Personally I prefer to have a missing value then actually having a 0 as 0 might be a valid sum from present values.
You don't show how you display the tables.
I would tend to create a custom format to display missing as 0 for output such as proc print, report or tabulate for the variable. Or set OPTIONS MISSING='0'; which would display a 0 instead of the default dot character.
I would also be very unlikely to create individual data sets if people are reading the result. You can SUM a bunch of variables at one pass using Proc MEANS/Summary and then select the variable(s) to display with a report procedure such as Proc Print.
If you are willing to drop proc sql and use proc means/summary instead, here's an example form the documentation using formats to get all combinations of class variables.
data cake;
input LastName $ 1-12 Age 13-14 PresentScore 16-17
TasteScore 19-20 Flavor $ 23-32 Layers 34 ;
datalines;
Orlando 27 93 80 Vanilla 1
Ramey 32 84 72 Rum 2
Goldston 46 68 75 Vanilla 1
Roe 38 79 73 Vanilla 2
Larsen 23 77 84 Chocolate .
Davis 51 86 91 Spice 3
Strickland 19 82 79 Chocolate 1
Nguyen 57 77 84 Vanilla .
Hildenbrand 33 81 83 Chocolate 1
Byron 62 72 87 Vanilla 2
Sanders 26 56 79 Chocolate 1
Jaeger 43 66 74 1
Davis 28 69 75 Chocolate 2
Conrad 69 85 94 Vanilla 1
Walters 55 67 72 Chocolate 2
Rossburger 28 78 81 Spice 2
Matthew 42 81 92 Chocolate 2
Becker 36 62 83 Spice 2
Anderson 27 87 85 Chocolate 1
Merritt 62 73 84 Chocolate 1
;
proc format;
value layerfmt 1='single layer'
2-3='multi-layer'
.='unknown';
value $flvrfmt (notsorted)
'Vanilla'='Vanilla'
'Orange','Lemon'='Citrus'
'Spice'='Spice'
'Rum','Mint','Almond'='Other Flavor';
run;
option missing=0;
proc means data=cake completetypes missing maxdec=3 sum nonobs;
class flavor layers / preloadfmt exclusive order=data;
var TasteScore;
format layers layerfmt. flavor $flvrfmt.;
title 'Taste Score For Number of Layers and Cake Flavors';
run;
options missing=.;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.