CJD

Using SAS Release 3.6(Basic Edition) in a browser with VirtualBox version 5.1.18

 

I have some data that I need to take to a mixed procedure. I am struggling with getting the data read correctly.

I realize that the answer may already be posted, but I either could not find it or could not understand it.

As a dummy data set:

Let:

d1 = 23, d2 = 31, a1 = 1, a2 = 2, b1=3, b2 = 6, c1 = 5, c2 = 10

Which would yield:

a = 1.5, b = 4.5, c = 7.5, d = 27

needed answer is (a+b+c)/d = 0.5

SDa = 0.707, SDb = 2.12, SDc = 3.53, and SDd = 5.66

the SD of (a+b+c) = 4.18 and the SD of the needed answer is 0.187

 

So I tried some code:

Data one; */ order=data;
input Variety $ Treatment $ @;
Do original = 1 to 2;
    input d @; output;
end;
Do alpha = 1 to 2;
    input a @; output;
end;
Do beta = 1 to 2;
    input b @; output;
end;
Do gamma = 1 to 2;
    input c @; output;
end;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
proc print data=one;

 

which is not what I was hoping for. I was hoping for something like this:

Obs  Variety  Treatment  Original  d   Alpha  a  Beta  b  Gamma  c
1    Var      Treat      1         23  1      1  1     3  1      5
2    Var      Treat      2         31  2      2  2     6  2      10

Accepted Solutions
ballardw
Super User

You describe operations in terms of variables named a1, a2, b1, b2, c1, c2, d1 and d2.

And then show attempting to read data in terms of Variety and Treatment. How would variety and treatment relate to a1, a2, etc?

And then you create Alpha, Beta and Gamma without an actual reference, and overall it really isn't easy to figure out what you are doing or wanting.

 

Here is one way to read similar data and then do a transform to get data that looks like the set you say you want:

data have;
 input Var $ Treat $ d1 d2 a1 a2 b1 b2 c1 c2;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
run;

data want;
   set have;
   /* one array per measurement, so index i picks the i-th replicate */
   array _d d1-d2;
   array _a a1-a2;
   array _b b1-b2;
   array _c c1-c2;
   /* write one output row per replicate (wide to long) */
   do original = 1 to dim(_d);
      d = _d[original];
      a = _a[original];
      b = _b[original];
      c = _c[original];
      Alpha=original;
      Beta =original;
      Gamma=original;
      output;
   end;
   drop  d1-d2 a1-a2 b1-b2 c1-c2;
run;

If you are looking for a general approach that handles more than two levels of "original", this gets ugly quickly and is not recommended. You would be much better off reading the data in as:

 

data should;
   input Variety $ Treatment $ Original d  a  b  c;
datalines;
Var   Treat   1   23   1   3 5
Var   Treat   2   31   2   6 10
;
run;

 

You also may want to think about the purpose of 4 variables with identical values. You haven't shown a use or need for Alpha, Beta or Gamma.

 

It really is not clear where "SD of the needed answer is 0.187" comes from. The way you define "needed answer", there is only one value, and the standard deviation of any variable with a single value, regardless of how many times it is repeated, is 0.

 

Assuming SD is your abbreviation for standard deviation.

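data summary;
   set have;
   /* row-wise means of the two replicates */
   meand = mean(d1,d2);
   meana = mean(a1,a2);
   meanb = mean(b1,b2);
   meanc = mean(c1,c2);
   needed = sum(meana,meanb,meanc)/meand;
   /* row-wise standard deviations of the two replicates */
   stdd = std(d1,d2);
   stda = std(a1,a2);
   stdb = std(b1,b2);
   stdc = std(c1,c2);
run;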

is one way to get those stated values. Or take the set I labeled should into proc means or proc summary, get the summary statistics for a, b, c and d, then combine the means and recalculate. But I still don't know where that last SD comes from.

 

 

proc summary data=should ;
   var d a b c;
   output out=shouldsum mean= std= / autoname;
run;
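
As a minimal sketch of the "combine the means and recalculate" step, assuming the AUTONAME option names the output variables d_Mean, a_Mean, b_Mean and c_Mean (worth confirming with proc contents on shouldsum), something like this should work. The data set name needed_from_sum is made up for the illustration.

```sas
/* Long-format input, as in the "should" data set */
data should;
   input Variety $ Treatment $ Original d a b c;
datalines;
Var   Treat   1   23   1   3 5
Var   Treat   2   31   2   6 10
;
run;

/* One row of summary statistics; AUTONAME appends the statistic to each name */
proc summary data=should;
   var d a b c;
   output out=shouldsum mean= std= / autoname;
run;

/* Combine the means: (a + b + c) / d */
data needed_from_sum;   /* hypothetical data set name */
   set shouldsum;
   needed = sum(a_Mean, b_Mean, c_Mean) / d_Mean;   /* 13.5 / 27 = 0.5 */
run;

proc print data=needed_from_sum;
   var a_Mean b_Mean c_Mean d_Mean needed;
run;
```

With by-groups (for example by Variety Treatment) the same step would give one ratio per group, provided the proc summary call carries the matching by or class statement.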

Posting code and log messages works better in the forum if you paste them into code boxes opened using the forum's {i} menu icon at the top of the message box. The main window will reformat text, and the result may contain HTML or other hidden characters that prevent data step code from running properly.

 



CJD

Thank you for the answers. I now have the tools I need to continue. The last standard deviation comes from sd(x3) = x3 * ((sd(x1)/x1)^2 + (sd(x2)/x2)^2)^0.5 when x3 = x1/x2. In any case you have been very helpful and I am thankful.
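
For anyone landing here later, the ratio formula above can be sketched in a data step along these lines. Two things are assumptions on my part: the SD of the sum (a+b+c) is taken as the square root of the summed variances, which treats a, b and c as independent and reproduces the 4.18 quoted in the question, and the names x1, x2, x3, sd_x1, sd_x2, sd_x3 and propagated are only illustrative.

```sas
/* Wide-format input, as in the "have" data set from the accepted solution */
data have;
   input Var $ Treat $ d1 d2 a1 a2 b1 b2 c1 c2;
datalines;
Var Treat 23 31 1 2 3 6 5 10
;
run;

data propagated;   /* hypothetical data set name */
   set have;
   /* numerator x1 = a + b + c, denominator x2 = d (replicate means) */
   x1 = sum(mean(a1,a2), mean(b1,b2), mean(c1,c2));   /* 13.5 */
   x2 = mean(d1,d2);                                  /* 27   */
   /* assumption: SD of a sum = sqrt of summed variances (independent terms) */
   sd_x1 = sqrt(std(a1,a2)**2 + std(b1,b2)**2 + std(c1,c2)**2);
   sd_x2 = std(d1,d2);
   /* sd(x3) = x3 * sqrt((sd(x1)/x1)**2 + (sd(x2)/x2)**2) when x3 = x1/x2 */
   x3    = x1 / x2;                                   /* 0.5  */
   sd_x3 = x3 * sqrt((sd_x1/x1)**2 + (sd_x2/x2)**2);
run;

proc print data=propagated;
   var x1 sd_x1 x2 sd_x2 x3 sd_x3;
   format sd_x1 sd_x2 sd_x3 8.3;
run;
```

On the dummy data this gives sd_x1 = 4.183 and sd_x3 = 0.187, matching the values in the original question.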
