Hello,
I have SAS 9.4.
I have multiple “measures” in columns that start with the same prefix.
For instance, I have 50 columns of the “measure” (roa) in different years, written as follows: roa_n1 , roa_n2, roa_n3 etc, until ....roa_n50).
I have another set of measures (ast), also in 50 columns: ast_n1, ast_n2, ast_n3 etc,…… until ast_n50).
I want to create another measure (another 50 columns) by dividing the roa/ast that have the same suffix. This is I want o create:
ratio_n1 = roa_n1/ast_n1, ratio_n2 = roa_n2/ast_n2, ratio_n3 = roa_n3/ast_n3, etc… until ratio_n50 = roa_n50/ast_n50.
Could you please help me with this?
Thank you !!!
Trasnsposed tables are not ideal to work with You would be better with 3 columns one for each measure, then have rows for the data.
As for your issue, you could use arrays:
data want; set have; array roa_n{50}; array ast_n{50}; array ratio_n{50}; do i=1 to 50; ratio{i}=roa_n{i}/ast_n{i}; end; run;
However you would still be better with a data model:
NUM ROA AST RATIO
1 x y x/y
2 x y x/y
...
Trasnsposed tables are not ideal to work with You would be better with 3 columns one for each measure, then have rows for the data.
As for your issue, you could use arrays:
data want; set have; array roa_n{50}; array ast_n{50}; array ratio_n{50}; do i=1 to 50; ratio{i}=roa_n{i}/ast_n{i}; end; run;
However you would still be better with a data model:
NUM ROA AST RATIO
1 x y x/y
2 x y x/y
...
Please provide some test data, I can't tell from that. Arrays do not delete any variables, therefore something else is happening. Arrays are temporary references to variables, they can create variables if they don't exist. You can get some test data by following:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
Your code looks ok, so I assume the data is at fault, but I cant see this.
It works perfectly , thank you !!!
Not sure why did not work before...
I have one more question that is the next step in the task I am trying to do.
Once I have the 50 ratios : ratio_n1, ratio_n2... ratio_n50, in 50 different columns, I need to get some statistics of all and of different sub-sets of these observations.
For instance the astandard deviation of the first 10 observations.
stdv(ratio_n1, ratio_n2..... ratio_n10)
A second step, which may be more difficult, is to compute the standard deviation only if there are >= 5 non-missing observations.
Thank you again.
Well nmiss() can count missings,:
num_missing=nmiss(of ratio_n{*});
I don't believe there is a standard deviation function however. Again, this is where normalised data is far simpler to work with:
NUM ROA AST RATIO
1 x y x/y
2 x y x/y
...
So:
proc means data=have; by num; var ratio; output out=want stddev=stddev; run;
@Thomas_mp wrote:
It works perfectly , thank you !!!
Not sure why did not work before...
I have one more question that is the next step in the task I am trying to do.
Once I have the 50 ratios : ratio_n1, ratio_n2... ratio_n50, in 50 different columns, I need to get some statistics of all and of different sub-sets of these observations.
For instance the astandard deviation of the first 10 observations.
stdv(ratio_n1, ratio_n2..... ratio_n10)
A second step, which may be more difficult, is to compute the standard deviation only if there are >= 5 non-missing observations.
Thank you again.
It is very poor planning to use a rule like "standard deviation of the first 10 observations". Reason: data sets get sorted all the time and the output of some procedures may reorder the data in some fashion. If the rule does not involve the value of one or more variables such as "idvar is value 3" or "idvar in the range 2 to 5 and timevar < 25" then the programming is going to be obnoxious at best and fragile, hard to maintain, hard to understand and possibly just plain wrong in an unobvious manner at worst. Proc means, summary, tabulate and report can do standard deviations and other statistics.
From your original post: I have 50 columns of the “measure” (roa) in different years: As @RW9 mentioned you might be better to have that actual YEAR value then that could be used as one of the groups. In fact if your groups are all based on that original year it may well be that this whole exercise could likely be cleaned up by maintaining year and grouping by year values.
As far as your requirement to calculate when there are 5 or more nonmissing values when you use Proc Means or summary you 1) request an associated N statistic and 2) filter or set to missing based on the value of that n.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.