SAS Programming

DATA Step, Macro, Functions and more
dustychair
Pyrite | Level 9

Hi,

This is the first time I've written array code by myself. Good news: it worked (YAY!). Bad news: it calculated values only for the first row, although I have 20 rows. Could you help me find what I'm missing? Also, I have only three theta variables here, so it is easy to write out ex1_1 ex1_2 ex1_3 ex1_4 ex2_1 ex2_2 ex2_3 ex2_4 ex3_1 ex3_2 ex3_3 ex3_4 by hand, but when I have 500 thetas, is there an easy way to create the ex variables? The input files I used are attached, and the code I used is below.

 

Many thanks

 

data par;
  infile 'C:\cluster_new\mlg1.txt';
  input a1 a2 a3 b1 b2 b3;
run;

data score;
  infile 'C:\cluster_new\mlgs.txt';
  input theta1 theta2 theta3;
run;

data all_pars;
  merge par score;
run;

data all_pars;
  set all_pars;
  s1=-(a1+a2+a3)/4;
  s2=s1+a1;
  s3=s1+a2;
  s4=s1+a3;
  in1=-(b1+b2+b3)/4;
  in2=in1+b1;
  in3=in1+b2;
  in4=in1+b3;
run;

data all_pars1;
  set all_pars;
  array t {*} theta1-theta3;
  array ex {3,4} ex1_1 ex1_2 ex1_3 ex1_4 ex2_1 ex2_2 ex2_3 ex2_4 ex3_1 ex3_2 ex3_3 ex3_4;
  array s {*} s1-s4;
  array in {*} in1-in4;
  do i=1 to 3;
    do j=1 to 4;
      ex(i,j)=exp(t(i)*s(j)+in(j));
    end;
  end;
run;

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @dustychair,

 

Your mistake is in the (one-to-one) MERGE step: without a BY statement, MERGE pairs observations by position, so the one-observation dataset SCORE contributes only missing values to observations no. 2, 3, etc.

 

I would correct it to:

data all_pars;
  if _n_=1 then set score;  /* read the single SCORE observation once */
  set par;                  /* read one PAR observation per iteration */
run;

This reads the single observation from dataset SCORE only in the first iteration of the DATA step ("if _n_=1") and doesn't touch these variables afterwards. Since all variables from a SET statement are automatically RETAINed, the theta values are copied to all subsequent observations, as desired.
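
To see the difference between the two approaches side by side, here is a minimal sketch. The DATALINES values are made-up stand-ins for the attached files (three parameter rows instead of 20), and the output dataset names MERGED and COMBINED are only illustrative.

```sas
/* Hypothetical stand-in for the PAR file: three rows instead of 20 */
data par;
  input a1 a2 a3 b1 b2 b3;
  datalines;
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
;
run;

/* Hypothetical stand-in for the SCORE file: a single row */
data score;
  input theta1 theta2 theta3;
  datalines;
0.5 -0.2 1.1
;
run;

/* One-to-one merge: theta1-theta3 are missing from row 2 onward */
data merged;
  merge par score;
run;

/* Conditional SET: theta1-theta3 are retained on every row */
data combined;
  if _n_=1 then set score;
  set par;
run;

proc print data=merged;
run;

proc print data=combined;
run;
```

Comparing the two PROC PRINT listings should show the theta columns filled only in the first row of MERGED, but in every row of COMBINED.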

 

Regarding the (hypothetical) variable list ex1_1 ex1_2 ... ex500_4 (consisting of 2000 items):

 

  1. You can define an array without specifying the individual variable names. For example, your definition
    array s {*} s1-s4;
    is equivalent to
    array s{4};
    because s1, s2, s3, s4 are the default variable names for this array.

    In the case of two- or higher-dimensional arrays the default names use sequential numbers (as for one-dimensional arrays) in row-major order (see documentation). For example, array ex{3,4}; creates ex1-ex12, and ex{2,1} refers to ex5. So, if you really need the dimension-specific indices (i, j, ...) in the variable names rather than only in the array references (such as ex{i,j}), you still need to specify the list of names.

  2. It's not difficult to create the long list mentioned above programmatically:
    data _null_;
    length c $16000; /* 500*4*(up to 8) characters: " ex123_4" */
    do i=1 to 500;
      do j=1 to 4;
        c=catx(' ',c,cats('ex',i,'_',j));
      end;
    end;
    call symputx('vlist',c);
    run;
    The list is now available in macro variable VLIST and could be referenced in an ARRAY statement:
    array ex{500,4} &vlist;
  3. However, depending on the purpose, a dataset with 2000+ variables might be unwieldy and it could make more sense to aim at a vertical (long) dataset structure.
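
As a rough illustration of the vertical (long) structure mentioned in item 3, the sketch below writes one observation per (row, i, j) combination instead of creating ex1_1 ... ex500_4. The one-row ALL_PARS input is made-up test data, and the names EX_LONG, ROW and ICPT are hypothetical; whether this layout suits the later analysis is an assumption.

```sas
/* Hypothetical one-row input holding the variables from the thread */
data all_pars;
  input theta1-theta3 s1-s4 in1-in4;
  datalines;
0.5 -0.2 1.1 -1.5 -0.5 0.5 1.5 -2 -1 0 1
;
run;

data ex_long;
  set all_pars;
  array t {*} theta1-theta3;
  array s {*} s1-s4;
  array icpt {*} in1-in4;      /* hypothetical array name for in1-in4 */
  row = _n_;                   /* identifies the source observation */
  do i = 1 to dim(t);
    do j = 1 to dim(s);
      ex = exp(t{i}*s{j} + icpt{j});
      output;                  /* one output row per (row, i, j) */
    end;
  end;
  keep row i j ex;
run;

proc print data=ex_long;
run;
```

With 500 thetas only the variable list in the first ARRAY statement would need to change (for example theta1-theta500); the number of variables in EX_LONG stays at four.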

Reeza
Super User
Arrays run on all rows by default. Check your source data.
dustychair
Pyrite | Level 9
@FreelanceReinhard, you are awesome! Thank you for being patient with my simple questions and thank you for teaching me. I appreciate you!
Best,
