Dear Madam/Sir,
For each IO industry, I want to multiply numbers in each row for each IO industry with "comp" variable except its own industry (data attached).
For instance, for IO 111CA (farms), comp 0.638957866 x 0.0688737( IO 113FF) + comp 0.761841026 x 0.091003( IO 211) + comp 0.914848186 x 0.0096249 (IO 212) + comp 0.825890645 x 0.000328 (IO 213) + comp 1.0078 x 0.026865 (IO 22) (except IO 111CA, farm). And then repeat the same process for every other industry.
I do not find useful commands. Any help will be highly appreciated to perform this calculation.
Thank you so much.
Joon1
data format
IO | Industry_Description | cyear | Farms | Forestry__fishing__and_related_a | Oil_and_gas_extraction | Mining__except_oil_and_gas | Support_activities_for_mining | Utilities | comp |
111CA | Farms | 1997 | 1.18142 | 0.0590518 | 0.0007557 | 0.0026902 | 0.0012788 | 0.0006635 | 0.727964454 |
113FF | Forestry, fishing, and related activities | 1997 | 0.0688737 | 1.23167 | 0.0003183 | 0.0024397 | 0.0006637 | 0.0003304 | 0.638957866 |
211 | Oil and gas extraction | 1997 | 0.0091003 | 0.006632 | 1.08566 | 0.0158068 | 0.0070102 | 0.0970558 | 0.761841026 |
212 | Mining, except oil and gas | 1997 | 0.0096249 | 0.002073 | 0.0028516 | 1.07108 | 0.0024438 | 0.03242 | 0.914848186 |
213 | Support activities for mining | 1997 | 0.000328 | 0.0001729 | 0.0237757 | 0.0141272 | 1.0078 | 0.002556 | 0.825890645 |
22 | Utilities | 1997 | 0.026865 | 0.0099835 | 0.0194369 | 0.0413922 | 0.0051409 | 1.04719 | 0.961662704 |
The order that you have the statements in a data step makes a difference.
Do you see the difference between:
x=0;
do ....;
x=....
end;
And
do ....;
x=0;
x=...;
end;
Can you just assume that the observations (rows) and variables (columns) are in the same order?
data want;
set have ;
row+1;
array vars farms -- Utilities;
do col=1 to dim(vars);
if col ne row then vars[col]=comp*vars[col];
end;
run;
Thank you so much, Tom.
Due to variable name problem related to import, I changed variable name to ind1-ind66 (row and column have the same industry name as attached). And I want to create weight variable to get column "comp" x column industry value except its own industry and revised your code like below. However, I did not get weight variable and only get "row" and "column" variable.
data want; set have;
row+1;
array vars $ ind1-ind66;
array weight $ ind1-ind66;
do col=1 to dim(vars);
if col ne row then
weight[col]=comp*vars[col];
end;
run;
Any help will be highly appreciated.
Joon1
Why are you defining CHARACTER arrays? If you are going to MULTIPLY the values then the variables must be numeric. Remove the $ from the array statements.
Why are you defining two different arrays that point to the same list of variables? If you only want one set of variables then just define one array to point to them. If you want to make new variables then use new names for the variables in that the other array is pointing to.
Thanks for your kind help, Tom. I got changed values by multiplication using your code below.
data want; set have;
row+1;
array vars ind1-ind66;
do col=1 to dim(vars);
if col ne row then
vars[col]=comp*vars[col];
end;
run;
But how can I obtain sum of changed values? I tried to make another array using sum, but it did not work out.
For instance, for IO 111CA (farms), sum= comp 0.638957866 x 0.0688737( IO 113FF) + comp 0.761841026 x 0.091003( IO 211) + comp 0.914848186 x 0.0096249 (IO 212) + comp 0.825890645 x 0.000328 (IO 213) + comp 1.0078 x 0.026865 (IO 22) (except IO 111CA, farm). And then repeat the same process for every other industry.
Your help will be truly appreciated.
Joon1
Initialize some new variable. Then for each variable in the array add the value that variable should contribute to the new variable. When you have done this for all of the variables in the array then the new variable should have the total. If you want treat missing values as zero then use the SUM(,) function.
See if you can incorporate this idea into your code.
new_var=0;
do col=1 to dim(vars);
new_var=sum(new_var,comp*vars[col]);
end;
Thank you for your kind advice, Tom. I tried following code that is operating, but I did not get the correct summation numbers by year.
It seems that it is difficult to get column sum by year except its own industry in a separate variable. Is there any other way to do this?
data want; set have;
row+1;
array vars ind1-ind65;
do col=1 to dim(vars);
if col ne row then
vars[col]=comp*vars[col];
new_var=0;
new_vars=sum (new_vars,comp*vars[col]); by cyear;
end;
run;
Another way I can think of is transpose data and get row sums. But I face an issue.
(1) I coded ind1-ind65, but after SAS operations, industry character variables are sorted in the order of ind1 ind10-ind19 ind2 ind20-29---- ind7 ind8 ind9.
After transposing the data, I need to get row sums after excluding its own industry, so I need to sort industry variables in the order of ind1 ind2 ---- ind10 ind11----ind65 before transposing the data. Could you please help me to this?
I greatly appreciate your help, Tom.
Joon1
Isn't the data symmetrical? So the sum across row 5 should be the same number as the sum down column 5.
To get the sum of a variable use PROC MEANS.
The order that you have the statements in a data step makes a difference.
Do you see the difference between:
x=0;
do ....;
x=....
end;
And
do ....;
x=0;
x=...;
end;
Dear Tom,
The data is not symmetrical. I check your two approaches in the order of data step and the output is the same (not what I want to get).
(1) Is it possible to get column sums in SAS?
(2) in the transposing option, how I can make the order of industry variable to ind1, ind2, ind3 through ind65?
Thank you so much,
Joon1
I greatly appreciate for your help and time, Tom. I used your basic array statements you suggested and resolved my issues with alteration of raw data in another way.
Joon1
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.