So I have a data set of "bins" and quantities. There are many bins, but below should be representative enough. Two bins named "A1" and "B2", each with an initial capacity of 10 and 8 respectively.
data work.foo;
infile datalines dsd delimiter=" ";
length Bin $2 A1 B2 8;
input Bin $ A1 B2;
datalines;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
run;
What I'd like to achieve is for each row decrement the column that is named in the "bin" variable.
Obviously some if-statements would solve this trivial case, but with a large number of column names (which may change) I'd like to avoid that.
I discovered that I can reference the values using something like the vvaluex function (e.g. dynamically select a column to "pull" data from) but I was wondering if there was a similar capability to write to a column specified by the value of another column (e.g. dynamically select a column to "push" data to).
if vvaluex(bin) gt 0
My desired output would be something like the table below:
Bin A1 B2
A1 9 8
A1 8 8
B2 8 7
B2 8 6
A1 7 6
Establish an array for all the variables that might be identified in BIN, to refer to the variable values.
Then initialize a second array of character values, containing the corresponding variable names. Use whichc to identify which element of the variable-name array is in BIN, and decrement the corresponding values array element.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
data want (drop=i);
set work.foo;
array values{*} _numeric_;
array vnames {100} $32 _temporary_;
if _n_=1 then do i=1 to dim(values);
vnames{i}=vname(values{i});
end;
values{whichc(bin,of vnames{*})}+(-1);
run;
This program assumes that
i=whichc(bin,of vnames{*});
if i^=0 then values{i}+(-1);
instead ofvalues{whichc(bin,of vnames{*})}+(-1);
data work.foo;
infile datalines dsd delimiter=" ";
length Bin $2 A1 B2 8;
input Bin $ A1 B2;
array b[2] A1 B2;
array d[2] _A1 _B2;
if _n_ = 1 then do;
_a1=A1; _b2=B2;
end;
retain d;
do i = 1 to dim(b);
if vname(b[i]) eq bin then do;
d[i] + -1; leave;
end;
end;
drop i;
datalines;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
run;
proc print; run;
Added LEAVE statement.
Establish an array for all the variables that might be identified in BIN, to refer to the variable values.
Then initialize a second array of character values, containing the corresponding variable names. Use whichc to identify which element of the variable-name array is in BIN, and decrement the corresponding values array element.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
data want (drop=i);
set work.foo;
array values{*} _numeric_;
array vnames {100} $32 _temporary_;
if _n_=1 then do i=1 to dim(values);
vnames{i}=vname(values{i});
end;
values{whichc(bin,of vnames{*})}+(-1);
run;
This program assumes that
i=whichc(bin,of vnames{*});
if i^=0 then values{i}+(-1);
instead ofvalues{whichc(bin,of vnames{*})}+(-1);
Tested everything out and works as advertised.
Thanks very much for the assist.
Hello @fergieis,
@fergieis wrote:if vvaluex(bin) gt 0
To ensure that the values don't drop below zero you can use the MAX function:
%let binvars=A1 B2;
data want;
if _n_=1 then set foo(obs=1);
set foo(keep=bin);
array bins[*] &binvars;
_n_=findw("&binvars", trim(bin), " ", "e");
bins[_n_]=max(0,bins[_n_]-1);
run;
There are ways to create the macro variable containing the list of variable names without hardcoding (e.g., using PROC SQL).
The E modifier on FINDW() is interesting. Other useful ones for this might be I, R and perhaps S.
data work.foo;
input Bin $ A1 B2;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
proc sql noprint;
select distinct bin into :bins separated by ' '
from foo
;
quit;
data bar ;
set foo ;
array bins &bins;
_n_ = findw("&bins",bin,' ','sire');
if 0<_n_<=dim(bins) then bins[_n_]=max(0,bins[_n_]-1);
run;
proc compare data=foo compare=bar;
id bin;
run;
Value Comparison Results for Variables
__________________________________________________________
|| Base Compare
Bin || A1 A1 Diff. % Diff
________ || _________ _________ _________ _________
||
A1 || 10.0000 9.0000 -1.0000 -10.0000
A1 || 10.0000 9.0000 -1.0000 -10.0000
A1 || 10.0000 9.0000 -1.0000 -10.0000
__________________________________________________________
__________________________________________________________
|| Base Compare
Bin || B2 B2 Diff. % Diff
________ || _________ _________ _________ _________
||
B2 || 8.0000 7.0000 -1.0000 -12.5000
B2 || 8.0000 7.0000 -1.0000 -12.5000
__________________________________________________________
A long dataset layout avoids complex array processing, andy you have the names readily available in a variable:
data work.foo;
infile datalines dlm="09"x dsd truncover;
input Bin $ A1 B2;
row = _n_;
datalines4;
A1 10 8
A1 10 8
B2 10 8
B2 10 8
A1 10 8
;;;;
proc transpose data=foo out=long1;
by row bin;
run;
proc sort data=long1;
by _name_ row;
run;
data long2;
set long1;
by _name_;
retain newcol;
if first._name_
then newcol = col1;
if bin = _name_ then newcol = max(newcol - 1,0);
run;
proc sort data=long2;
by row bin;
run;
proc transpose data=long2 out=want(drop=_name_);
by row bin;
var newcol;
id _name_;
run;
Depending on what you do next, you might even find that keeping the long layout has advantages.
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.