Hi,
I am still learning. Please let me know whether my code is correct for the scenario below, and correct me if I am wrong:
1. Create three data sets:
Young15: people who are age 30 and younger.
Mid15: people between the ages of 40 and 60, inclusive.
NotasYoung15: people who are age 61 and older.
2. Remove the variables Test3, Test4, Test5 from the Young15 data set.
3. Create a new variable avgTest that is the average of the remaining Test variables for the Young15 data set.
4. What is the average (mean) of the Test5 variable in the NotasYoung15 data set?
My code is:
data Young15 (drop=Test3 Test4 Test5) Mid15 NotsaYoung15;
set SOUJ.input15;
if age<=39 then output Young15;
else if 40<=age<=60 then output Mid15;
else if age>=61 then output NotsaYoung15;
run;
data work.Young15;
set work.Young15;
avgInc=mean(of Test: ); /*3*/
run;
data work.NotsaYoung15;
set work.NotsaYoung15;
avgInc2=mean(of Test5); /* 4*/
run;
I would like to point out that creating separate data sets for this task is just unnecessary work. All of the questions can be answered by using the original data set.
4. what is the average (mean) of the Test5 variable in the NotasYoung15 data set
To get the mean of a single variable, you don't need the mean() function at all. If the question wants the "mean" on each row, the mean of a single variable in a row is simply the value of that variable. It seems much more likely to me, however, that #4 asks for the mean of Test5 across all records, in other words the mean down the column. The mean() function does not do that; you would need to use PROC MEANS.
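As a rough sketch of answering #4 straight from the original data set, a WHERE statement inside PROC MEANS can restrict the records without creating NotsaYoung15 first. This assumes SOUJ.input15 (the data set named in the question) has numeric variables age and Test5:

```sas
/* Sketch only: mean of Test5 down the column for people aged 61 and older, */
/* read directly from the original data set (assumed to be SOUJ.input15).   */
proc means data=SOUJ.input15 mean;
    where age >= 61;   /* same condition used to build NotsaYoung15 */
    var Test5;
run;
```

The WHERE statement filters rows as they are read, so no intermediate data set is needed for this question.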
Hi @souji
I totally agree with @PaigeMiller 's comment.
If the request is really to create 3 datasets, your code is good.
Just a few things at first sight:
- Replace 39 by 30 in the following statement:
if age<=39 then output Young15;
- Use the requested variable name (i.e. avgtest) for question #3:
avgInc=mean(of Test:);
- For question #4, the mean() function will calculate the mean by rows (meaning you need to specify several variables). If you specify only one, then the mean is the value of this variable. To compute the mean of a 'column' (-> a variable), you can use PROC MEANS:
proc means data= work.NotsaYoung15 mean;
var Test5;
run;
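Putting the first two corrections together, the DATA steps might look like the sketch below. It assumes the cutoff of 30 stated in the task and keeps the data set names used in the posted code:

```sas
/* Sketch only: split SOUJ.input15 by age. The drop= option removes */
/* Test3-Test5 from Young15 only, when that data set is written.    */
data Young15 (drop=Test3 Test4 Test5) Mid15 NotsaYoung15;
    set SOUJ.input15;
    if age <= 30 then output Young15;             /* cutoff from the task text */
    else if 40 <= age <= 60 then output Mid15;
    else if age >= 61 then output NotsaYoung15;
run;

/* Row-wise average of the Test variables still present in Young15 */
data work.Young15;
    set work.Young15;
    avgTest = mean(of Test:);   /* requested variable name */
run;
```

One thing to watch: SAS treats a missing numeric value as lower than any number, so a record with a missing age would satisfy age <= 30 and land in Young15. If that matters for your data, a check such as not missing(age) could be added to the first condition.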
Thank you very much, both of you, @ed_sas_member and @PaigeMiller.
Actually, there was a typo in my post: the cutoff was 39.
Young15: People who are age 39 and younger
So my code is still correct, right?
if age<=39 then output Young15;