Hello,
I would like some help in subtracting multiple variables in an observation from another.
GEO YEAR SEX Total AG_1 ....... AG_100
0199 2015 1 3543 32 ..............564
0101 2015 1 9764 42...............454
6099 2015 1 1356 87...............457
0199 2016 2 9878 77 ..............564
0101 2016 2 6565 65...............454
6099 2016 2 1111 22...............457
0199 2017 3 7865 11 ..............564
0101 2017 3 2343 22...............454
6099 2017 3 5376 83...............457
For each year and each sex, I would like to subtract the Total and AG_1 to AG_100 variables from the GEO 0199 and 0101.
I would also like to set all 6099 Total and AG_1...AG_100 variables to 0 after.
Any help would be greatly appreciated.
Many thanks!
You can decide which observations to do the subtraction.
if geo in (6099,199,101) then do i=1 to dim(original);
You should only change the one array. For the COPY array it doesn't matter what names you use. You can even make it larger than it needs to be. You just can't make it smaller than the other array. You cannot use a positional variable list because you are using the ARRAY statement to make NEW variables, they don't exist yet so they don't have any position yet.
array copy x_1 - x_1000 ;
I'm not sure this is what youre asking, but perhaps something like this?
data have;
input GEO YEAR SEX Total AG_1 AG_2 AG_3;
datalines;
0199 2015 1 3543 32 564 564
0101 2015 1 9764 42 454 454
6099 2015 1 1356 87 457 457
0199 2016 2 9878 77 564 564
0101 2016 2 6565 65 454 454
6099 2016 2 1111 22 457 457
0199 2017 3 7865 11 564 564
0101 2017 3 2343 22 454 454
6099 2017 3 5376 83 457 457
;
data want;
set have;
array _ Total--AG_3;
if GEO in ('0199', '0101') then Total=Total-sum(of AG:);
else if GEO='6099' then call stdize('replace','mult=',0, of _[*]);
run;
Result:
geo year sex Total ag_1 ag_2 ag_3 199 2015 1 2383 32 564 564 101 2015 1 8814 42 454 454 6099 2015 1 0 0 0 0 199 2016 2 8673 77 564 564 101 2016 2 5592 65 454 454 6099 2016 2 0 0 0 0 199 2017 3 6726 11 564 564 101 2017 3 1413 22 454 454 6099 2017 3 0 0 0 0
So you want to take the rows where GEO=6999 and MERGE them with the original data based on YEAR and SEX values?
Then subtract the values from the GEO=6999 variables from every row?
It might be best to stick them into a separate ARRAY.
data want ;
do until(last.sex);
set have ;
by year sex ;
array original total AG_1-AG_100 ;
array copy x_total x_ag_1-x_ag_100;
if geo=6099 then do i=1 to dim(original);
copy[i]=original[i];
end;
end;
do until(last.sex);
set have ;
by year sex ;
do i=1 to dim(original);
original[i]=original[i]-copy[i];
end;
output;
end;
drop i x_: ;
run;
Thanks Tom, very close I think.
The Total variable does the exact thing I want it to 🙂
My 100 ish variable starting with AG don't all have number suffix, they go AG_LT_1 .... AG_90PL
So I changed the "-" to "--" in the array definitions. And now they all get set to 0 somehow.
You can decide which observations to do the subtraction.
if geo in (6099,199,101) then do i=1 to dim(original);
You should only change the one array. For the COPY array it doesn't matter what names you use. You can even make it larger than it needs to be. You just can't make it smaller than the other array. You cannot use a positional variable list because you are using the ARRAY statement to make NEW variables, they don't exist yet so they don't have any position yet.
array copy x_1 - x_1000 ;
This worked great! Many thanks! 🙂
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.