So I have a data set of "bins" and quantities. There are many bins, but below should be representative enough. Two bins named "A1" and "B2", each with an initial capacity of 10 and 8 respectively.
data work.foo;
infile datalines dsd delimiter=" ";
length Bin $2 A1 B2 8;
input Bin $ A1 B2;
datalines;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
run;
What I'd like to achieve is for each row decrement the column that is named in the "bin" variable.
Obviously some if-statements would solve this trivial case, but with a large number of column names (which may change) I'd like to avoid that.
I discovered that I can reference the values using something like the vvaluex function (e.g. dynamically select a column to "pull" data from) but I was wondering if there was a similar capability to write to a column specified by the value of another column (e.g. dynamically select a column to "push" data to).
if vvaluex(bin) gt 0
My desired output would be something like the table below:
Bin A1 B2
A1 9 8
A1 8 8
B2 8 7
B2 8 6
A1 7 6
Establish an array for all the variables that might be identified in BIN, to refer to the variable values.
Then initialize a second array of character values, containing the corresponding variable names. Use whichc to identify which element of the variable-name array is in BIN, and decrement the corresponding values array element.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
data want (drop=i);
set work.foo;
array values{*} _numeric_;
array vnames {100} $32 _temporary_;
if _n_=1 then do i=1 to dim(values);
vnames{i}=vname(values{i});
end;
values{whichc(bin,of vnames{*})}+(-1);
run;
This program assumes that
i=whichc(bin,of vnames{*});
if i^=0 then values{i}+(-1);
instead ofvalues{whichc(bin,of vnames{*})}+(-1);
data work.foo;
infile datalines dsd delimiter=" ";
length Bin $2 A1 B2 8;
input Bin $ A1 B2;
array b[2] A1 B2;
array d[2] _A1 _B2;
if _n_ = 1 then do;
_a1=A1; _b2=B2;
end;
retain d;
do i = 1 to dim(b);
if vname(b[i]) eq bin then do;
d[i] + -1; leave;
end;
end;
drop i;
datalines;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
run;
proc print; run;
Added LEAVE statement.
Establish an array for all the variables that might be identified in BIN, to refer to the variable values.
Then initialize a second array of character values, containing the corresponding variable names. Use whichc to identify which element of the variable-name array is in BIN, and decrement the corresponding values array element.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
data want (drop=i);
set work.foo;
array values{*} _numeric_;
array vnames {100} $32 _temporary_;
if _n_=1 then do i=1 to dim(values);
vnames{i}=vname(values{i});
end;
values{whichc(bin,of vnames{*})}+(-1);
run;
This program assumes that
i=whichc(bin,of vnames{*});
if i^=0 then values{i}+(-1);
instead ofvalues{whichc(bin,of vnames{*})}+(-1);
Tested everything out and works as advertised.
Thanks very much for the assist.
Hello @fergieis,
@fergieis wrote:if vvaluex(bin) gt 0
To ensure that the values don't drop below zero you can use the MAX function:
%let binvars=A1 B2;
data want;
if _n_=1 then set foo(obs=1);
set foo(keep=bin);
array bins[*] &binvars;
_n_=findw("&binvars", trim(bin), " ", "e");
bins[_n_]=max(0,bins[_n_]-1);
run;
There are ways to create the macro variable containing the list of variable names without hardcoding (e.g., using PROC SQL).
The E modifier on FINDW() is interesting. Other useful ones for this might be I, R and perhaps S.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
proc sql noprint;
select distinct bin into :bins separated by ' '
from foo
;
quit;
data bar ;
set foo ;
array bins &bins;
_n_ = findw("&bins",bin,' ','sire');
if 0<_n_<=dim(bins) then bins[_n_]=max(0,bins[_n_]-1);
run;
proc compare data=foo compare=bar;
id bin;
run;
Value Comparison Results for Variables __________________________________________________________ || Base Compare Bin || A1 A1 Diff. % Diff ________ || _________ _________ _________ _________ || A1 || 10.0000 9.0000 -1.0000 -10.0000 A1 || 10.0000 9.0000 -1.0000 -10.0000 A1 || 10.0000 9.0000 -1.0000 -10.0000 __________________________________________________________ __________________________________________________________ || Base Compare Bin || B2 B2 Diff. % Diff ________ || _________ _________ _________ _________ || B2 || 8.0000 7.0000 -1.0000 -12.5000 B2 || 8.0000 7.0000 -1.0000 -12.5000 __________________________________________________________
A long dataset layout avoids complex array processing, andy you have the names readily available in a variable:
data work.foo;
infile datalines dlm="09"x dsd truncover;
input Bin $ A1 B2;
row = _n_;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
proc transpose data=foo out=long1;
by row bin;
run;
proc sort data=long1;
by _name_ row;
run;
data long2;
set long1;
by _name_;
retain newcol;
if first._name_
then newcol = col1;
if bin = _name_ then newcol = max(newcol - 1,0);
run;
proc sort data=long2;
by row bin;
run;
proc transpose data=long2 out=want(drop=_name_);
by row bin;
var newcol;
id _name_;
run;
Depending on what you do next, you might even find that keeping the long layout has advantages.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.