Hi All,
I have a dataset with 140 variables, i need to concatinate all those variables and create a new variable(location) when the value is '1' with their labels.
Below is the example:
DATA Have;
INPUT VAR1 VAR2 VAR3 VAR4;
label
VAR1 = "loc1"
VAR2 = "loc2"
VAR3 = "loc3"
VAR4 = "loc4"
;
DATALINES;
0 0 0 0
0 1 0 0
1 1 1 1
1 1 0 1
1 0 1 0
0 0 0 1
1 0 0 0
;
RUN;
New created column will be like this:
Location
loc2
loc1,loc2,loc3,loc4
loc1,loc2,loc4
loc1,loc3
loc4
loc1
Can you please guide me how this can be done efficiently.
Thanks,
sasg.
I would use an array and the VLABEL function:
data want;
set have;
array vars var1-var4;
length location $200;
do _N_=1 to dim(vars);
if vars(_N_)=1 then
call catx(',',location,vlabel(vars(_N_)));
end;
run;
Where does the information that var1=loc1 come from? Is that in another table? Is it systematic somehow or is loc1, loc4 not representative of your real data?
Please describe how you intend to display that "label".
If I understand that you potentially will combine 140 labels that could theoretically be 140*256 characters + 139 separators= 35979 characters. Which exceeds the maximum length of a single variable.
Here's a way with your example:
DATA Have; INPUT VAR1 VAR2 VAR3 VAR4; length labelvar $ 100; array v var1-var4; do i=1 to dim(v); if v[i]=1 then labelvar=catx(',',labelvar,strip(vlabel(v[i]))); end;
drop i; label VAR1 = "loc1" VAR2 = "loc2" VAR3 = "loc3" VAR4 = "loc4" ; DATALINES; 0 0 0 0 0 1 0 0 1 1 1 1 1 1 0 1 1 0 1 0 0 0 0 1 1 0 0 0 ;
You will need to assign a length to the labelvar long enough to hold the longest expected result.
@sasg wrote:
Thank you for the response, but in my actual scenario dataset already exist and labels for var1, var2 are loc1 and loc2..instead of specifying labels is there anyway i can read from dataset itself.
SET statement to use an existing data set instead of datalines.
The logic of the Array assignment, creating the labelvar and Do loop are exactly the same
Below some code that dynamically determines the required length for your location variable.
I've made the assumption that all your variables are of the same data type (either all numeric or all character).
DATA work.Have;
INPUT VAR1 VAR2 VAR3 VAR4;
label
VAR1 = "loc1"
VAR2 = "loc2"
VAR3 = "loc3"
VAR4 = "loc4"
;
var5=var4;
DATALINES;
0 0 0 0
0 1 0 0
1 1 1 1
1 1 0 1
1 0 1 0
0 0 0 1
1 0 0 0
;
%let req_len=1;
proc sql noprint;
select max(1,req_len) into :req_len trimmed
from
(
select sum(ifn(lengthn(label)>0,lengthn(label)+1,0)) as req_len
from dictionary.columns
where libname='WORK' and memname='HAVE' and upcase(name) like 'VAR%'
)
;
quit;
data want(drop=_:);
set have;
array a_vars {*} var:;
length Location $&req_len;
do _i=1 to dim(a_vars);
if a_vars[_i]=1 and vlabel(a_vars[_i]) ne vname(a_vars[_i])
then location=catx(',',location,vlabel(a_vars[_i]));
end;
run;
proc print data=want;
run;
I would use an array and the VLABEL function:
data want;
set have;
array vars var1-var4;
length location $200;
do _N_=1 to dim(vars);
if vars(_N_)=1 then
call catx(',',location,vlabel(vars(_N_)));
end;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.