dustychair
Pyrite | Level 9

Hi,

This is the first time I've written array code by myself. Good news: it worked (YAY!). Bad news: it calculated values only for the first row, but I have 20 rows. Could you help me find what I'm missing? Also, I have only three theta variables here, so it is easy to write out ex1_1 ex1_2 ex1_3 ex1_4 ex2_1 ex2_2 ex2_3 ex2_4 ex3_1 ex3_2 ex3_3 ex3_4, but when I have 500 thetas, is there an easy way to create the ex variables? The input files I used are attached and the code I used is below.

 

Many thanks

 

data par;
  infile 'C:\cluster_new\mlg1.txt';
  input a1 a2 a3 b1 b2 b3;
run;

data score;
  infile 'C:\cluster_new\mlgs.txt';
  input theta1 theta2 theta3;
run;

data all_pars;
  merge par score;
run;

data all_pars;
  set all_pars;
  s1=-(a1+a2+a3)/4;
  s2=s1+a1;
  s3=s1+a2;
  s4=s1+a3;
  in1=-(b1+b2+b3)/4;
  in2=in1+b1;
  in3=in1+b2;
  in4=in1+b3;
run;

data all_pars1;
  set all_pars;
  array t {*} theta1-theta3;
  array ex {3,4} ex1_1 ex1_2 ex1_3 ex1_4 ex2_1 ex2_2 ex2_3 ex2_4 ex3_1 ex3_2 ex3_3 ex3_4;
  array s {*} s1-s4;
  array in {*} in1-in4;
  do i=1 to 3;
    do j=1 to 4;
      ex(i,j)=exp(t(i)*s(j)+in(j));
    end;
  end;
run;

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @dustychair,

 

Your mistake is in the (one-to-one) MERGE step: The one-observation dataset SCORE contributes only missing values to observations no. 2, 3, etc. in this type of merge.

 

I would correct it to:

data all_pars;
if _n_=1 then set score;
set par;
run;

This reads the single observation from dataset SCORE only in the first iteration of the DATA step ("if _n_=1") and doesn't touch these variables afterwards. Since all variables from a SET statement are automatically RETAINed, the theta values are copied to all subsequent observations, as desired.
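To see the difference between the two approaches side by side, here is a minimal sketch using small inline datasets. The values are made up for illustration and merely stand in for the files read in the original post; the dataset name MERGED is also just a placeholder.

```sas
/* Toy stand-ins for PAR and SCORE (values are invented for illustration) */
data par;
  input a1 a2 a3 b1 b2 b3;
  datalines;
1 2 3 4 5 6
2 3 4 5 6 7
3 4 5 6 7 8
;
run;

data score;
  input theta1 theta2 theta3;
  datalines;
0.5 -0.2 1.1
;
run;

/* One-to-one merge: theta1-theta3 are missing from observation 2 onward */
data merged;
  merge par score;
run;

/* Conditional SET: SCORE is read once, and its values are retained */
data all_pars;
  if _n_=1 then set score;
  set par;
run;

proc print data=merged; run;
proc print data=all_pars; run;
```

Comparing the two PROC PRINT listings should show the theta values filled in on every row of ALL_PARS, but only on the first row of MERGED.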

 

Regarding the (hypothetical) variable list ex1_1 ex1_2 ... ex500_4 (consisting of 2000 items):

 

  1. You can define an array without specifying the individual variable names. For example, your definition
    array s {*} s1-s4;
    is equivalent to
    array s{4};
    because s1, s2, s3, s4 are the default variable names for this array.

    In the case of two- or higher-dimensional arrays the default names use sequential numbers (as for one-dimensional arrays) in row-major order (see documentation). So, if you really need the dimension-specific indices (i, j, ...) in the variable names rather than only in the array references (such as ex{i,j}), you still need to specify the list of names.

  2. It's not difficult to create the long list mentioned above programmatically:
    data _null_;
    length c $16000; /* 500*4*(up to 8) characters: " ex123_4" */
    do i=1 to 500;
      do j=1 to 4;
        c=catx(' ',c,cats('ex',i,'_',j));
      end;
    end;
    call symputx('vlist',c);
    run;
    The list is now available in macro variable VLIST and could be referenced in an ARRAY statement:
    array ex{500,4} &vlist;
  3. However, depending on the purpose, a dataset with 2000+ variables might be unwieldy and it could make more sense to aim at a vertical (long) dataset structure.
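As a rough illustration of the vertical (long) structure mentioned in the last point, the sketch below writes one row per combination of input row, theta index and category instead of creating ex1_1, ex1_2, etc. It assumes ALL_PARS already holds theta1-theta3, s1-s4 and in1-in4 on every row; the toy values, the output dataset name EX_LONG, the identifier variable ITEM and the array name ICEPT are all hypothetical choices made for this example.

```sas
/* Toy version of ALL_PARS with invented values (two rows) */
data all_pars;
  input theta1-theta3 s1-s4 in1-in4;
  datalines;
0.5 -0.2 1.1 -1.5 -0.5 0.5 1.5 -0.75 0.25 0.25 0.25
0.5 -0.2 1.1 -2 0 1 1 -1 0 0.5 0.5
;
run;

data ex_long;
  set all_pars;
  array t {*} theta1-theta3;
  array s {*} s1-s4;
  array icept {*} in1-in4;   /* same variables as the original array IN */
  item = _n_;                /* hypothetical row identifier */
  do i = 1 to dim(t);
    do j = 1 to dim(s);
      ex = exp(t{i}*s{j} + icept{j});
      output;                /* one output row per (item, i, j) */
    end;
  end;
  keep item i j ex;
run;
```

With this layout, going from 3 to 500 thetas only changes the range in the first ARRAY statement; the number of variables in the output stays at four, while the number of rows grows.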

Reeza
Super User
Arrays run on all rows by default. Check your source data.
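One possible way to check the source data, sketched here on a small made-up dataset named DEMO (a hypothetical stand-in for the merged dataset), is to count missing and non-missing values of the theta variables:

```sas
/* Toy data mimicking a dataset where only the first row has theta values */
data demo;
  input theta1 theta2 theta3;
  datalines;
0.5 -0.2 1.1
. . .
. . .
;
run;

/* N = non-missing count, NMISS = missing count per variable */
proc means data=demo n nmiss;
  var theta1-theta3;
run;
```

If NMISS is nonzero for variables that feed the array calculation, the results for those rows will be missing as well.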
dustychair
Pyrite | Level 9
@FreelanceReinhard, you are awesome! Thank you for being patient with my simple questions and thank you for teaching me. I appreciate you!
Best,
