Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Home
- /
- Programming
- /
- SAS Procedures
- /
- How to sum of variable levels by group

Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Posted 04-16-2020 05:53 AM
(731 views)

I use SAS Enterprise Guide 7.1.

I have a dataset of Car Models by Year and 38 character variables that identify particular characteristics of a model in a given year. I want a total sum of changes for each variable and for each year-model.

Here is what I think the logic should be:

I want to compare all variables using year-model as a by group, and then, say, if there is 1 level of variable (no change), set a counting variable = 1. If there is a change and there are 2 levels for that by group, set a counting variable = 2. Then, sum the counting variables for each variable, then divide by the total number of observations (missing values are valid) weighted by the number of year-models (2). The remainder will equal the total variable changes.

So a formula could be mod(Total Variable Levels/(Total Obs / Year-Models per by group)).

7 REPLIES 7

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

```
data cars;
INPUT Car $ 1-13 A $ 15-16 B $ 18-19 C $ 21-22 D $ 24-27 E $ 29-30 F $ 32-33 G $ 34-35 H $ 36-37;
datalines;
2229024038001 A3 B2 C3 924 E1 . . H4
2229024038002 A3 B2 C1 1288 E1 . . H4
2229024039002 A3 B3 C1 800 E1 . . .
2229024039003 A3 B2 C1 1288 E1 . . H4
2763003042001 A1 B2 C2 . . . . H4
2763003042002 A1 B3 C2 . . . . H4
595143-004001 A3 B3 C1 886 E1 . . .
595143-004002 A3 B3 C1 1000 E1 . . .
6062623013003 A3 B4 C1 . E1 . . H4
6062623013004 A3 B4 C1 . E1 . . H4
;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Can you show us a small example, *with the desired output* ?

--

Paige Miller

Paige Miller

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Here is the desired output for the data given above.

Total Obs 10 10 10 10 10 10 10 10

Year-model by car (weight) 2 2 2 2 2 2 2 2

Total Variable Levels 5 7 6 8 5 5 5 6

Total Variable Changes 0 2 1 3 0 0 0 1

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

I modified the input code to make the by-group more clear.

```
data cars;
INPUT Car $10. Year $4. A $3. B $3. C $3. D $5. E $3. F $3. G $3. H $3.;
datalines;
2229024038 001 A3 B2 C3 924 E1 H4
2229024038 002 A3 B2 C1 1288 E1 H4
2229024039 002 A3 B3 C1 800 E1
2229024039 003 A3 B2 C1 1288 E1 H4
2763003042 001 A1 B2 C2 H4
2763003042 002 A1 B3 C2 H4
595143-004 001 A3 B3 C1 886 E1
595143-004 002 A3 B3 C1 1000 E1
6062623013 003 A3 B4 C1 E1 H4
6062623013 004 A3 B4 C1 E1 H4
;
run;
```

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

@benjamin_2018 wrote:

Here is the desired output for the data given above.

Total Obs 10 10 10 10 10 10 10 10

Year-model by car (weight) 2 2 2 2 2 2 2 2

Total Variable Levels 5 7 6 8 5 5 5 6

Total Variable Changes 0 2 1 3 0 0 0 1

That seems like a very strange result. First it appears to be rotated. It looks like has 8 observations across the page and 4 variables going down. Normally data is presented in the other orientation, like in your example input. Second there is nothing to explain what the different columns (observations) mean or how the rows (variables) relate to the variables in the sample input.

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Okay I will try to explain further. Sorry for the confusion and I appreciate any offer to help.

There are 8 variables that I am trying to analyze - A -- H. This is just a subset of the variables, there are 38 in total. In this example there are 8.

Total observations is the sum of all non-missing and missing obs for all variables (10 for all).

Year-model by car is the number of observations in the by group. I think in the input data I use Ver, and by this I meant "version" or in this case year-model.

Total Variable levels is the sum of variable levels by by-group (car). So for variable A (a unique feature of a car, in this case) there were no changes between any year-model, which is the result in the last row of the desired output. For variable B, there were 2 changes - for cars 2229024039 and 2763003042 - between year-models (or model-year might be a better term).

- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content

Taking from other topics in SAS community and a nice dose of SAS help pages/examples, I came up with this solution. Let me know if this solution makes sense or if there is any improvement.

```
%let data = work.cars;
proc freq data = &data (drop= year) nlevels;
by Car;
ods exclude onewayfreqs;
ods output nlevels = out_n_levels
(keep = tablevar nlevels rename = (tablevar = name
nlevels = n_levels));
*to match out_contents;
run;
proc sort data=work.out_n_levels;
by name;
run;
ods trace on;
proc means data=work.out_n_levels n nmiss sum;
class name;
var n_levels;
ods output Summary=Variable_Levels;
run;
ods trace off;
data work.spec_changes;
set work.variable_levels;
Var_Ch=mod(n_levels_Sum,NObs);
run;
```

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

**If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. **

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.