Do loop for multiple variables with different dimensions

New Contributor CJD

Using SAS Release 3.6(Basic Edition) in a browser with VirtualBox version 5.1.18

 

I have some data that I need to take to a mixed procedure. I am struggling with getting the data read correctly.

I realize that the answer may already be posted, but I either could not find it or could not understand it.

As a dummy data set:

Let:

d1 = 23, d2 = 31, a1 = 1, a2 = 2, b1=3, b2 = 6, c1 = 5, c2 = 10

Which would yield:

a = 1.5, b = 4.5, c = 7.5, d = 27

needed answer is (a+b+c)/d = 0.5

SDa = 0.707, SDb = 2.12, SDc = 3.53, and SDd = 5.66

the SD of (a+b+c) = 4.18 and the SD of the needed answer is 0.187

 

So I tried some code:

Data one; */ order=data;
input Variety $ Treatment $ @;
Do original = 1 to 2;
    input d @; output;
end;
Do alpha = 1 to 2;
    input a @; output;
end;
Do beta = 1 to 2;
    input b @; output;
end;
Do gamma = 1 to 2;
    input c @; output;
end;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
proc print data=one;

 

which is not what I was hoping for --> hoping for something like this.

Obs  Variety  Treatment  Original   d  Alpha  a  Beta  b  Gamma   c
  1  Var      Treat             1  23      1  1     1  3      1   5
  2  Var      Treat             2  31      2  2     2  6      2  10


Accepted Solution
06-13-2017 10:33 AM
Super User

Re: Do loop for multiple variables with different dimensions

You describe operations in terms of variables named a1, a2, b1, b2, c1, c2, d1 and d2.

And then you show an attempt to read data in terms of Variety and Treatment. How would Variety and Treatment relate to a1, a2, etc.?

And then you create Alpha, Beta and Gamma without an actual reference, so overall it really isn't easy to figure out what you are doing or what you want.

Here is one way to read similar data and then do a transform to get data that looks like the set you say you want:

data have;
 input Var $ Treat $ d1 d2 a1 a2 b1 b2 c1 c2;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
run;

data want;
   set have;
   array _d d1-d2;
   array _a a1-a2;
   array _b b1-b2;
   array _c c1-c2;
   do original = 1 to dim(_d);
      d = _d[original];
      a = _a[original];
      b = _b[original];
      c = _c[original];
      Alpha=original;
      Beta =original;
      Gamma=original;
      output;
   end;
   drop  d1-d2 a1-a2 b1-b2 c1-c2;
run;

If you are looking for a general approach that handles more than two levels of "original", this gets ugly real quick and is not recommended. You would be much better off reading the data in as:

 

data should;
   input Variety $ Treatment $ Original d  a  b  c;
datalines;
Var   Treat   1   23   1   3 5
Var   Treat   2   31   2   6 10
;
run;

 

You also may want to think about the purpose of 4 variables with identical values. You haven't shown a use or need for Alpha, Beta or Gamma.

 

It really is not clear where "SD of the needed answer is 0.187" comes from. The way you define "needed answer" there is only one value and the Standard deviation of any variable with a single value, regardless of how many times it is repeated, is 0.

 

Assuming SD is your abbreviation for standard deviation.

data summary;
   set have;
   meand = mean(d1,d2);
   meana = mean(a1,a2);
   meanb = mean(b1,b2);
   meanc = mean(c1,c2);
   needed = sum(meana,meanb,meanc)/meand;
   stda = std(a1,a2);
   stdb = std(b1,b2);
   stdc = std(c1,c2);
   stdd = std(d1,d2);
run;

is one way to get those stated values. Or take the data set I named should into PROC MEANS or PROC SUMMARY, get the summary statistics for a, b, c and d, then combine the means and recalculate. But I still don't know where that last SD comes from.

 

 

proc summary data=should ;
   var d a b c;
   output out=shouldsum mean= std= / autoname;
run;
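
The "combine the means and recalculate" step could look something like the sketch below. It assumes the AUTONAME option named the output variables a_Mean, b_Mean, c_Mean and d_Mean (run PROC CONTENTS on shouldsum to confirm); the data set name recalc and the variable name needed are only illustrative.

```sas
/* Sketch: recompute the ratio from the PROC SUMMARY output.       */
/* Assumes AUTONAME produced a_Mean, b_Mean, c_Mean and d_Mean.    */
data recalc;
   set shouldsum;
   /* (mean of a + mean of b + mean of c) / mean of d */
   needed = sum(a_Mean, b_Mean, c_Mean) / d_Mean;
run;

proc print data=recalc;
   var a_Mean b_Mean c_Mean d_Mean needed;
run;
```

With the two rows in should, the means are 1.5, 4.5, 7.5 and 27, so needed works out to 13.5/27 = 0.5, matching the value in the question.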

Posting code and log messages works better in the forum if you paste them into code boxes opened using the Forum {i} menu icon at the top of the message box. The main window will reformat text and may end up with HTML or other hidden characters that prevent data step code from running properly.

 



All Replies

New Contributor CJD

Re: Do loop for multiple variables with different dimensions

Thank you for the answers. I now have the tools I need to continue. The last standard deviation (SD) comes from sd(x3) = x3 * ((sd(x1)/x1)^2 + (sd(x2)/x2)^2)^0.5 when x3 = x1/x2. In any case, you have been very helpful and I am thankful.
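
For anyone following along, the formula above can be checked against the numbers in the question with a short data step. This is only a sketch: it assumes the data set have from the accepted solution, and it assumes the SD of (a+b+c) is obtained by adding the three SDs in quadrature, which is what reproduces the 4.18 quoted in the question. The names propagate, x1, x2, x3, sd_x1, sd_x2 and sd_x3 are made up for illustration.

```sas
/* Sketch: propagate the SDs through x3 = x1/x2,                  */
/* where x1 = a + b + c (means) and x2 = d (mean).                */
data propagate;
   set have;
   x1 = sum(mean(a1,a2), mean(b1,b2), mean(c1,c2));
   x2 = mean(d1,d2);
   /* assumption: SD of the sum combines in quadrature */
   sd_x1 = sqrt(std(a1,a2)**2 + std(b1,b2)**2 + std(c1,c2)**2);
   sd_x2 = std(d1,d2);
   x3 = x1 / x2;
   /* relative SDs of numerator and denominator, combined */
   sd_x3 = x3 * sqrt((sd_x1/x1)**2 + (sd_x2/x2)**2);
run;

proc print data=propagate;
   var x1 sd_x1 x2 sd_x2 x3 sd_x3;
run;
```

With the sample row, this gives x1 = 13.5, sd_x1 of about 4.18, x2 = 27, sd_x2 of about 5.66, x3 = 0.5 and sd_x3 of about 0.187, in line with the values stated in the question.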

☑ This topic is solved.
