Solved: Do loop for multiple variables with different dimensions

CJD · Posted 06-12-2017 07:05 PM

Using SAS Release 3.6 (Basic Edition) in a browser with VirtualBox version 5.1.18.

I have some data that I need to take to a mixed procedure. I am struggling with getting the data read correctly.

I realize that the answer may already be posted, but I either could not find it or could not understand it.

As a dummy data set:

Let:

d1 = 23, d2 = 31, a1 = 1, a2 = 2, b1=3, b2 = 6, c1 = 5, c2 = 10

Which would yield:

a = 1.5, b = 4.5, c = 7.5, d = 27

needed answer is (a+b+c)/d = 0.5

SDa = 0.707, SDb = 2.12, SDc = 3.53, and SDd = 5.66

the SD of (a+b+c) = 4.18 and the SD of the needed answer is 0.187

So I tried some code:

Data one; */ order=data;
input Variety $ Treatment $ @;
Do original = 1 to 2;
   input d @; output;
end;
Do alpha = 1 to 2;
   input a @; output;
end;
Do beta = 1 to 2;
   input b @; output;
end;
Do gamma = 1 to 2;
   input c @; output;
end;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
proc print data=one;

That is not what I was hoping for. I was hoping for something like this:

Obs  Variety  Treatment  Original   d  Alpha  a  Beta  b  Gamma   c
1    Var      Treat      1         23  1      1  1     3  1       5
2    Var      Treat      2         31  2      2  2     6  2      10

ballardw · Posted 06-12-2017 07:53 PM

You describe operations in terms of variables named a1, a2, b1, b2, c1, c2, d1 and d2.

And then show attempting to read data in terms of Variety and Treatment. How would variety and treatment relate to a1, a2, etc?

And then you create Alpha, Beta and Gamma without an actual reference, so overall it really isn't easy to figure out what you are doing or wanting.

Here is one way to read similar data and then do a transform to get data that looks like the set you say you want:

data have;
 input Var $ Treat $ d1 d2 a1 a2 b1 b2 c1 c2;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
run;

data want;
   set have;
   array _d d1-d2;
   array _a a1-a2;
   array _b b1-b2;
   array _c c1-c2;
   do original = 1 to dim(_d);
      d = _d[original];
      a = _a[original];
      b = _b[original];
      c = _c[original];
      Alpha=original;
      Beta =original;
      Gamma=original;
      output;
   end;
   drop  d1-d2 a1-a2 b1-b2 c1-c2;
run;

If you are looking for a general approach that handles more than two levels of "original", this gets ugly real quick and is not recommended. You would be much better off reading the data in as:

data should;
   input Variety $ Treatment $ Original d  a  b  c;
datalines;
Var   Treat   1   23   1   3 5
Var   Treat   2   31   2   6 10
;
run;

You also may want to think about the purpose of 4 variables with identical values. You haven't shown a use/need for Alpha, Beta or Gamma.

It really is not clear where "SD of the needed answer is 0.187" comes from. The way you define "needed answer", there is only one value, and the standard deviation of any variable with a single value, regardless of how many times it is repeated, is 0.

Assuming SD is your abbreviation for standard deviation.

data summary;
   set have;
   meand = mean(d1,d2);
   meana = mean(a1,a2);
   meanb = mean(b1,b2);
   meanc = mean(c1,c2);
   needed = sum(meana,meanb,meanc)/meand;
   stdd = std(d1,d2);
   stda = std(a1,a2);
   stdb = std(b1,b2);
   stdc = std(c1,c2);
run;

is one way to get those stated values. Alternatively, take the set I labeled should into proc means or proc summary to get the summary statistics for a, b, c and d, then combine the means and recalculate. But I still don't know where that last SD comes from.

proc summary data=should ;
   var d a b c;
   output out=shouldsum mean= std= / autoname;
run;
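
As a minimal sketch of the "combine the means and recalc" step: with the autoname option, proc summary names its output variables d_Mean, a_Mean, b_Mean, c_Mean (and d_StdDev and so on), so a short data step can rebuild the ratio. The data set name recalc and the variable needed are illustrative choices, and the should data set and proc summary step are repeated only so the sketch runs on its own.

```sas
/* Long-format data, one row per level of Original (same as above) */
data should;
   input Variety $ Treatment $ Original d a b c;
datalines;
Var Treat 1 23 1 3 5
Var Treat 2 31 2 6 10
;
run;

/* AUTONAME yields d_Mean, a_Mean, ... and d_StdDev, a_StdDev, ... */
proc summary data=should;
   var d a b c;
   output out=shouldsum mean= std= / autoname;
run;

/* Combine the means into (a+b+c)/d; "recalc" and "needed" are illustrative names */
data recalc;
   set shouldsum;
   needed = sum(a_Mean, b_Mean, c_Mean) / d_Mean;
run;

proc print data=recalc;
   var a_Mean b_Mean c_Mean d_Mean needed;
run;
```

With the dummy data this should give needed = 13.5/27 = 0.5, matching the value stated in the question.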

Posting code and log messages works better in the forum if you paste them into code boxes opened using the Forum {i} menu icon at the top of the message box. The main window will reformat text and may end up with html or other hidden characters that prevent data step code from running properly.

CJD · Posted 06-13-2017 10:38 AM

Thank you for the answers. I now have the tools I need to continue. The last SD (standard deviation) comes from sd(x3) = x3 * ((sd(x1)/x1)^2 + (sd(x2)/x2)^2)^0.5 when x3 = x1/x2. In any case you have been very helpful and I am thankful.
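
For anyone following along, here is a rough sketch of how that propagated SD could be computed in a data step. It assumes a, b, c and d are independent, so that the SD of (a+b+c) is the square root of the summed variances, which is what reproduces the 4.18 in the question. The names propagated, sum_abc, sd_sum and sd_needed are illustrative, not from the thread. Note that SAS uses ** for exponentiation rather than ^.

```sas
/* Sketch only: assumes a, b, c, d are independent */
data propagated;
   input d1 d2 a1 a2 b1 b2 c1 c2;
   meand   = mean(d1, d2);
   sum_abc = mean(a1, a2) + mean(b1, b2) + mean(c1, c2);
   needed  = sum_abc / meand;                      /* x3 = x1/x2 */
   /* SD of a sum of independent terms: sqrt of the summed variances */
   sd_sum  = sqrt(var(a1, a2) + var(b1, b2) + var(c1, c2));
   /* relative SDs of numerator and denominator add in quadrature */
   sd_needed = needed * sqrt((sd_sum / sum_abc)**2
                           + (std(d1, d2) / meand)**2);
datalines;
23 31 1 2 3 6 5 10
;
run;

proc print data=propagated;
   var needed sd_sum sd_needed;
run;
```

With the dummy data this should give needed = 0.5, sd_sum of about 4.18 and sd_needed of about 0.187, in line with the values in the original question.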
