BookmarkSubscribeRSS Feed
riemerra
Fluorite | Level 6

I want to Dynamically create new variables and assign values within datastep.  The data represent randomly assigned test items.  Each observation represents one examinee.  Each examinee is assigned a different set of items.  The pool of items is numbered in the thousands.  There are several variables containing randomly assigned item names and several corresponding variables containing item response values. 

 

*** The data generated by the test administration software would look something like the following ***;
data one;
length var1-var2 $4;
input var1 var2 val1 val2 ;
cards;
M845    M730    3       9
M808    M437    0       9
;

 

*** The datasetp below does not work but represents what I would like to achieve without:
1. discovering all of the items administered and using nested arrays to assign values
2. using put and include statements to write and read code.

3. I tried PROC FCMP and CALL EXECUTE but could not get the code to work.
 ***;
Data two;
set one;
array var var1-var2;
array val val1-val2;
do over val;
  vvalue(var)=vvalue(val);
end;

*** The final data would look like this ***
Var1    Var2    Val1 Val2 M845 M730 M808 M437
M845    M730    3       9       3       9
M808    M437    0       9                       0       9
******;

 

Thanks in advance for your help

8 REPLIES 8
mkeintz
PROC Star

I think the best approach is to make a list of variable names in a string.  Then for each VAR , find its position in the string, and write out to a correspondingly named array element.  So first you can use a PROC SQL to make a "distinct" list of var names:

 

Data two;
  input var1 $4. var2 :$4. val1 val2;
  put (_all_) (=);
datalines;
M845    M730    3       9
M808    M437    0       9
run;

proc sql noprint;
  select distinct var into :vlist separated by ' ' from
  (select var1 as var from two
    union all  
  select var2 as var from two);
quit;
%put &=vlist;

 

Now with the macrovar vlist, you (1) have a string to search, and (2) a list of elements names for array OUT:

 

data want (drop=v);
  set two;
  array out {*} &vlist;
  array var {2} $4;
  array val {2};
  do v=1 to dim(var);
    out{findw("&vlist",trim(var{v}),' ','E')}=val{v};
  end;
run;
data _null_;
  set want;
  put (_all_) (=);
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
riemerra
Fluorite | Level 6

I will compare the processing time of using nested do loops with the findw-index method.  I have to loop through 100 variables and compare with 4000 new variable names on about 10,000 records.

 

do v=1 to dim(var);
out{findw("&vlist",trim(var{v}),' ','E')}=val{v};
end;

 

Can the proc sql statement be adapted to 100 variables (Var1-Var100) and also add a character prefix to each distinct value?  The values in Var1-Var100 are all numerals and need a character prefix to confirm to SAS variable name rules.

 

Given your reccomended approach, I assume that, Call Execute can not return sas datastep code with the new variable names and values to the 'Calling' datastep.  It appears that Call Execute can return sas datasetep code to a spawned datasetp.

 

Thanks for your help.

PGStats
Opal | Level 21

Use transpose:

 


data two;
set one;
array v var: ;
array x val: ;
obs = _n_;
do i = 1 to dim(v);
    y = v{i};
    z = x{i};
    output;
    end;
run;

proc transpose data=two out=want(drop=obs _name_);
by obs var: val: ;
id y;
var z;
run;
PG
riemerra
Fluorite | Level 6

I will compare the processing time of using nested do loops with the proc transpose method.  I have to loop through 100 variables and compare with 4000 new variable names on about 10,000 records.

 

I also want to add code to sort order the new variables.  I am thinking of using a data step with a retain statement before the set statement.

 

data two;

  retain &SortedVarList;

  set two;

run;

 

 

Given your reccomended approach, I assume that, Call Execute can not return sas datastep code with the new variable names and values to the 'Calling' datastep.  It appears that Call Execute can return sas datasetep code to a spawned datasetp.

 

Thanks for your help.

PGStats
Opal | Level 21

Keep it simple:

 


data two;
set one;
array v var: ;
array x val: ;
obs = _n_;
do i = 1 to dim(v);
    y = v{i};
    z = x{i};
    output;
    end;
run;

proc sql noprint;
select unique y into :varList separated by " " 
from two
order by y;
quit;
 
proc transpose data=two out=three(drop=obs _name_);
by obs var: val: ;
id y;
var z;
run;

data want;
if 0 then set three(keep=val: var:);
retain &varList;
set three;
run;
PG
Reeza
Super User

Can you post a larger sample data set and output. It's hard to see with what your posting. Please ensure they line up, ie output can be generated from input based on the rules you've specified. 

 

 

riemerra
Fluorite | Level 6

*** I can not post actual data due to our test-security policy, but I can expand to three observations (examinees) with three randomly assigned items and three anserers. ***;

 

Data one;
length Var1-Var3 $4 Val1-Val3 $1;
input Var1-Var3 Val1-Val3
;
I174    I508    I573    4       3       4
I889    I017    I450    2       5       9
I455    I532    I500    6       2       9

;

 

***

I am going to play around with of the answers, which will take a few days.

***;

riemerra
Fluorite | Level 6

Values in Var1-Var3 define the new variable names.

Values in Val1-Val3 define the values assigned to the new variables.

 

Does this clarify the task?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 10082 views
  • 5 likes
  • 4 in conversation