Calcite | Level 5

## Sum and determine the stnd of columns that contain a certain text for each row

I'm providing an example of a dataset I have below:

data test;
input ID Char_Var Jan_week1 Jan_week2 Feb_week3;
datalines;
1 'prod1' 1 2 3
2 'prod2' 1 2 3
;
run;

I want to determine the total of the variables that have week in there, additionally I want to figure out the standard deviation and the change over the first and last week. However, my date variables will not always be like this.. i might have an extra week... it will change over time.

1 ACCEPTED SOLUTION

Accepted Solutions
Super User

## Re: Sum and determine the stnd of columns that contain a certain text for each row

@Melika2 wrote:

I'm providing an example of a dataset I have below:

data test;
input ID Char_Var Jan_week1 Jan_week2 Feb_week3;
datalines;
1 'prod1' 1 2 3
2 'prod2' 1 2 3
;
run;

I want to determine the total of the variables that have week in there, additionally I want to figure out the standard deviation and the change over the first and last week. However, my date variables will not always be like this.. i might have an extra week... it will change over time.

It is generally poor data structure to have data in the variable name. If you really must do so with a word in the name of a variable then start the name with the piece you want to find instead of the end or buried in the middle. Week_Jan1 Week_Jan2 or some such.

You may ask why. When you have a common start to variable names then SAS will allow specific forms of lists to reference them such as Week_:  the colon at the end tells SAS to use all the variables whose names start with Week_.

So the functions like SUM will total the variables (assuming they are all numeric ) with code like:

Tot = sum (of Week_:);

StdDev = std(of week_:) ;

and so on. Then it doesn't matter if you don't know, at least for these functions, what the actual names are as long as they start the same.

"first" and "last" week named the way you show them can raise a lot of questions. Like what happens when your "weeks" cross a calendar year boundary.

Generally you will in the long run find out that having the "week" as an actual date variable and the data structured as

ID <other variables> date  value_variable (s)

is more flexible. A big reason is that SAS provides a fair number of functions and formats to deal with variables whose values are dates. Not so much when the value is hidden in a variable name.

5 REPLIES 5
Super User

## Re: Sum and determine the stnd of columns that contain a certain text for each row

Can you provide more details?

Calcite | Level 5

## Re: Sum and determine the stnd of columns that contain a certain text for each row

yes, so I want a column called total which sums up all the columns that had the variable week in them, and I want a column called stnd deviations that gives the standard deviation between the columns that had week in them, so something like this:

data test;
input ID Char_Var Jan_week1 Jan_week2 Feb_week3 total stnd ;
datalines;
1 'prod1' 1 2 3 6 1
2 'prod2' 1 2 3 6 1
;
run;
Super User

## Re: Sum and determine the stnd of columns that contain a certain text for each row

total = sum(of jan_week:, of feb_week:);
stdn = std(of jan_week:, of feb_week:);

You can either list the variables manually or use short cut reference methods of colons shown above. Unfortunately SAS variable shortcuts rely on the principle of prefix not suffixes.
Super User

## Re: Sum and determine the stnd of columns that contain a certain text for each row

@Melika2 wrote:

I'm providing an example of a dataset I have below:

data test;
input ID Char_Var Jan_week1 Jan_week2 Feb_week3;
datalines;
1 'prod1' 1 2 3
2 'prod2' 1 2 3
;
run;

I want to determine the total of the variables that have week in there, additionally I want to figure out the standard deviation and the change over the first and last week. However, my date variables will not always be like this.. i might have an extra week... it will change over time.

It is generally poor data structure to have data in the variable name. If you really must do so with a word in the name of a variable then start the name with the piece you want to find instead of the end or buried in the middle. Week_Jan1 Week_Jan2 or some such.

You may ask why. When you have a common start to variable names then SAS will allow specific forms of lists to reference them such as Week_:  the colon at the end tells SAS to use all the variables whose names start with Week_.

So the functions like SUM will total the variables (assuming they are all numeric ) with code like:

Tot = sum (of Week_:);

StdDev = std(of week_:) ;

and so on. Then it doesn't matter if you don't know, at least for these functions, what the actual names are as long as they start the same.

"first" and "last" week named the way you show them can raise a lot of questions. Like what happens when your "weeks" cross a calendar year boundary.

Generally you will in the long run find out that having the "week" as an actual date variable and the data structured as

ID <other variables> date  value_variable (s)

is more flexible. A big reason is that SAS provides a fair number of functions and formats to deal with variables whose values are dates. Not so much when the value is hidden in a variable name.

Calcite | Level 5

## Re: Sum and determine the stnd of columns that contain a certain text for each row

This helped. Thank you!
Discussion stats
• 5 replies
• 641 views
• 2 likes
• 3 in conversation