- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I tried to do
data want;
infile 'data set';
input totsps totspm totspt totspw;
avgsp = (totsps +totspm +totspt+totspw/4);
run;
but it creates a new variable but not a new output for avgsp.
please help... i feel like my input step may be wrong
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi @sasuser94 Please post a sample of your input data and the expected output
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasuser94 wrote:
I tried to do
data want;
infile 'data set';
input totsps totspm totspt totspw;
avgsp = (totsps +totspm +totspt+totspw/4);
run;
but it creates a new variable but not a new output for avgsp.
please help... i feel like my input step may be wrong
When you use
avgsp = ...
you are indeed creating a new variable. I don't know what you mean by "new output for avgsp", please show us what you want.
Also, that's the wrong formula for average.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you want an average for:
1. The individual row - a value for each row
2. For all rows, for all the different variables as one measurement, ie mean of totsps, spm, spt, spw - one number
3. For each variable across all rows - four numbers
I suspect it's #2, but it's not clear.
@sasuser94 wrote:
I tried to do
data want;
infile 'data set';
input totsps totspm totspt totspw;
avgsp = (totsps +totspm +totspt+totspw/4);
run;
but it creates a new variable but not a new output for avgsp.
please help... i feel like my input step may be wrong
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasuser94 wrote:
I tried to do
data want;
infile 'data set';
input totsps totspm totspt totspw;
avgsp = (totsps +totspm +totspt+totspw/4);
run;
but it creates a new variable but not a new output for avgsp.
please help... i feel like my input step may be wrong
You attempted code is incorrect as it only divides the last variable totspq by 4.
SAS makes this much easier with functions such as
avgsp = mean(totsps, totspm, totspt, totspw);
Which will calculate the mean for a single row of data.
If you need a mean across all of the rows of data then follow up with:
Proc means data=want mean;
var avgsp;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you for your input. I am trying to get the average of 4 variables (totsps, totspm, totspt, totspw) and create a new variable called avgslp to get the average of the 4 days. I would like to do this for my entire data set.
Thank you
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasuser94 wrote:
Thank you for your input. I am trying to get the average of 4 variables (totsps, totspm, totspt, totspw) and create a new variable called avgslp to get the average of the 4 days. I would like to do this for my entire data set.
Well, it sure would be nice if you gave an example using a small example data set, because the above sentence can be interpreted in several ways.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@sasuser94 wrote:
I don’t get any output when I run the code to create a new variable. I want to create a new variable “avgsp” by calculating the meaning across 4 variables (totsps, totspm, totspt, totspw).... every time I try to run code in sas it says errors and the new data set has missing values. I think it may have something to do with me trying to use ‘infile’ to tell sas what data set to use.
Please post the log using the window opened by the {i} button.
EDIT: You know that the infile-statement read text-files, if you want to process a dataset you need the set-statement.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
DATA SLEEP2; infile ':\AH data.xlsx'; AVGSLP = MEAN(TOTSPSUN, TOTSPMON, TOTSPTUE, TOTSPWED); RUN; PROC MEANS DATA = WORK.PRAPCRO1; AVGSP = MEAN(TOTSPSUN, TOTSPMON, TOTSPTUE, TOTSPWED); VAR AVGSP; RUN;
how would i use a set statement?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Excel-files aren't datasets! Excel-files are read by proc import or by using a libname statement.Try:
libname source xlsx "b:/ah data.xlsx";
/* process file */
libname source clear;
/* or with proc import */
proc import datafile="b:/ah data.xlsx" dbms=xlsx out=work.have replace;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
1. You cannot really refer to an XLSX file like this. You first need to import it somehow. If you know the sheet name, this is pretty trivial. Use PROC IMPORT to import your Excel sheet first.
2. Once you have it imported, your best bet is to reformat the data using proc transpsose and then use PROC MEANS.
This will import an xlsx file
proc import out=AHDATA file = 'path to xlsx file.xlsx' dbms=xlsx replace; run;
Then use PROC TRANSPOSE to flip your data.
https://stats.idre.ucla.edu/sas/modules/how-to-reshape-data-wide-to-long-using-proc-transpose/
Then use PROC MEANS to summarize your data
https://github.com/statgeek/SAS-Tutorials/blob/master/proc_means_basic.sas
@sasuser94 wrote:
DATA SLEEP2; infile ':\AH data.xlsx'; AVGSLP = MEAN(TOTSPSUN, TOTSPMON, TOTSPTUE, TOTSPWED); RUN; PROC MEANS DATA = WORK.PRAPCRO1; AVGSP = MEAN(TOTSPSUN, TOTSPMON, TOTSPTUE, TOTSPWED); VAR AVGSP; RUN;how would i use a set statement?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Totsps. Totspm. Totspt
1. 1 8
2 5 5
2. 8 7
I’m looking to calculate the average of each row by creating a new variable avgslp
3
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
3 is not the "average of each row".
And if you really want the "average of each row", you would have three new values, one for each row.
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
From that data the only operation I can think of that would result in 3 would be to count the number of observations.
First get your data into a SAS dataset.
data have;
input Totsps Totspm Totspt ;
cards;
1 1 8
2 5 5
2 8 7
;
If you want to make a new variable then create a new dataset by reading in the current dataset and doing something.
data want;
set have;
tots_avg = mean(of totsps totspm totspt);
run;
Now you have the average of those three variables on EACH observation.
Obs Totsps Totspm Totspt tots_avg 1 1 1 8 3.33333 2 2 5 5 4.00000 3 2 8 7 5.66667
If you want the average across the whole dataset you can use PROC MEANS.
proc means data=want;
run;
The MEANS Procedure Variable N Mean Std Dev Minimum Maximum ----------------------------------------------------------------------------- Totsps 3 1.6666667 0.5773503 1.0000000 2.0000000 Totspm 3 4.6666667 3.5118846 1.0000000 8.0000000 Totspt 3 6.6666667 1.5275252 5.0000000 8.0000000 tots_avg 3 4.3333333 1.2018504 3.3333333 5.6666667 -----------------------------------------------------------------------------