I want to Dynamically create new variables and assign values within datastep. The data represent randomly assigned test items. Each observation represents one examinee. Each examinee is assigned a different set of items. The pool of items is numbered in the thousands. There are several variables containing randomly assigned item names and several corresponding variables containing item response values.
*** The data generated by the test administration software would look something like the following ***;
data one;
length var1-var2 $4;
input var1 var2 val1 val2 ;
cards;
M845 M730 3 9
M808 M437 0 9
;
*** The datasetp below does not work but represents what I would like to achieve without:
1. discovering all of the items administered and using nested arrays to assign values
2. using put and include statements to write and read code.
3. I tried PROC FCMP and CALL EXECUTE but could not get the code to work.
***;
Data two;
set one;
array var var1-var2;
array val val1-val2;
do over val;
vvalue(var)=vvalue(val);
end;
*** The final data would look like this ***
Var1 Var2 Val1 Val2 M845 M730 M808 M437
M845 M730 3 9 3 9
M808 M437 0 9 0 9
******;
Thanks in advance for your help
I think the best approach is to make a list of variable names in a string. Then for each VAR , find its position in the string, and write out to a correspondingly named array element. So first you can use a PROC SQL to make a "distinct" list of var names:
Data two;
input var1 $4. var2 :$4. val1 val2;
put (_all_) (=);
datalines;
M845 M730 3 9
M808 M437 0 9
run;
proc sql noprint;
select distinct var into :vlist separated by ' ' from
(select var1 as var from two
union all
select var2 as var from two);
quit;
%put &=vlist;
Now with the macrovar vlist, you (1) have a string to search, and (2) a list of elements names for array OUT:
data want (drop=v);
set two;
array out {*} &vlist;
array var {2} $4;
array val {2};
do v=1 to dim(var);
out{findw("&vlist",trim(var{v}),' ','E')}=val{v};
end;
run;
data _null_;
set want;
put (_all_) (=);
run;
I will compare the processing time of using nested do loops with the findw-index method. I have to loop through 100 variables and compare with 4000 new variable names on about 10,000 records.
do v=1 to dim(var);
out{findw("&vlist",trim(var{v}),' ','E')}=val{v};
end;
Can the proc sql statement be adapted to 100 variables (Var1-Var100) and also add a character prefix to each distinct value? The values in Var1-Var100 are all numerals and need a character prefix to confirm to SAS variable name rules.
Given your reccomended approach, I assume that, Call Execute can not return sas datastep code with the new variable names and values to the 'Calling' datastep. It appears that Call Execute can return sas datasetep code to a spawned datasetp.
Thanks for your help.
Use transpose:
data two;
set one;
array v var: ;
array x val: ;
obs = _n_;
do i = 1 to dim(v);
y = v{i};
z = x{i};
output;
end;
run;
proc transpose data=two out=want(drop=obs _name_);
by obs var: val: ;
id y;
var z;
run;
I will compare the processing time of using nested do loops with the proc transpose method. I have to loop through 100 variables and compare with 4000 new variable names on about 10,000 records.
I also want to add code to sort order the new variables. I am thinking of using a data step with a retain statement before the set statement.
data two;
retain &SortedVarList;
set two;
run;
Given your reccomended approach, I assume that, Call Execute can not return sas datastep code with the new variable names and values to the 'Calling' datastep. It appears that Call Execute can return sas datasetep code to a spawned datasetp.
Thanks for your help.
Keep it simple:
data two;
set one;
array v var: ;
array x val: ;
obs = _n_;
do i = 1 to dim(v);
y = v{i};
z = x{i};
output;
end;
run;
proc sql noprint;
select unique y into :varList separated by " "
from two
order by y;
quit;
proc transpose data=two out=three(drop=obs _name_);
by obs var: val: ;
id y;
var z;
run;
data want;
if 0 then set three(keep=val: var:);
retain &varList;
set three;
run;
Can you post a larger sample data set and output. It's hard to see with what your posting. Please ensure they line up, ie output can be generated from input based on the rules you've specified.
*** I can not post actual data due to our test-security policy, but I can expand to three observations (examinees) with three randomly assigned items and three anserers. ***;
Data one;
length Var1-Var3 $4 Val1-Val3 $1;
input Var1-Var3 Val1-Val3
;
I174 I508 I573 4 3 4
I889 I017 I450 2 5 9
I455 I532 I500 6 2 9
;
***
I am going to play around with of the answers, which will take a few days.
***;
Values in Var1-Var3 define the new variable names.
Values in Val1-Val3 define the values assigned to the new variables.
Does this clarify the task?
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.