Here is my code so far:
libname mtsu '/courses/ddb976e5ba27fe300/PSY6560/';
run;
data cleansurvey2011;
set mtsu.survey2011;
array dem [8];
array anx [50];
array dep [18];
do i = 1 to 8;
do i = 8 to 50;
do i = 50 to 68;
if dem[i] = . then dem[i] = (missing data);
if anx[i] = . then anx[i] = (missing data);
if dep[i]= . then dep[i] = (missing data);
depreversed = dep18+1 - dep1;
end;
end;
end;
avg = anx;
avg = dep;
if Dem1 = 0 then Feedback = 'Male';
else if Dem1 = 1 then Feedback = 'Female';
if Dem2 = 0 then Feedback = 'No';
else if Dem2 = 1 then Feedback = 'Yes';
if Dem3 = 0 then Feedback = 'No';
else if Dem3 = 1 then Feedback = 'Yes';
if Dem4 = 0 then Feedback = 'No';
else if Dem4 = 1 then Feedback = 'Yes';
if Dem5 = 0 then Feedback = 'No';
else if Dem5 = 1 then Feedback = 'Yes';
if Dem6 = 0 then Feedback = 'No';
else if Dem6 = 1 then Feedback = 'Yes';
if Dem7 = 0 then Feedback = 'No';
else if Dem7 = 1 then Feedback = 'Yes';
if Dem8 = 0 then Feedback = 'No';
else if Dem8 = 1 then Feedback = 'Yes';
proc contents data = mtsu.survey2011;
run;
proc freq data = mtsu.survey2011;
tables dem;
run;
proc print data = cleansurvey2011;
var anx dep;
run;
Here is what I am trying to do:
A SAS data set called survey2011is stored in the '/courses/ddb976e5ba27fe300/PSY6560/' folder in SAS Studio.
Creates a temporary data set called cleansurvey2011. The cleansurvey2011 data set should be based on the survey2011data set, but should have the following additional characteristics.
anxiety1r = 4 – anxiety1;
Variable Name |
|
Dem1 | Sex (0 = Male, 1 = Female) |
Dem2 | Do you reside in Rutherford County? (0 = No, 1 = Yes) |
Dem3 | Are you married? (0 = No, 1 = Yes) |
Dem4 | Do you have children? (0 = No, 1 = Yes) |
Dem5 | Do you own your own home? (0 = No, 1 = Yes) |
Dem6 | Do you own a vehicle? (0 = No, 1 = Yes) |
Dem7 | Do you work full-time? (0 = No, 1 = Yes) |
Dem8 | Are you currently enrolled at MTSU? (0 = No, 1 = Yes) |
Here is the Survey2011 Dataset:
Obs dem1 dem2 dem3 dem4 dem5 dem6 dem7 dem8 anx1 anx2 anx3 anx4 anx5 anx6 anx7 anx8 anx9 anx10 anx11 anx12 anx13 anx14 anx15 anx16 anx17 anx18 anx19 anx20 anx21 anx22 anx23 anx24 anx25 anx26 anx27 anx28 anx29 anx30 anx31 anx32 anx33 anx34 anx35 anx36 anx37 anx38 anx39 anx40 anx41 anx42 anx43 anx44 anx45 anx46 anx47 anx48 anx49 anx50 dep1 dep2 dep3 dep4 dep5 dep6 dep7 dep8 dep9 dep10 dep11 dep12 dep13 dep14 dep15 dep16123456789101112131415
0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | 2 | 3 | 2 | 0 | 2 | 2 | 3 | 1 | 3 | 3 | 2 | 2 | 2 | 3 | 1 | 3 | 2 | 2 | 1 | 2 | 2 | 2 | 3 | 2 | 3 | 1 | 2 | 3 | 3 | 1 | 2 | 2 | 2 | 3 | 2 | 3 | 1 | 3 | 2 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 2 | 1 | 2 | 3 | 4 | 5 | 5 | 3 | 2 | 4 | 4 | 6 | 4 | 6 | 4 | 5 | 5 | 6 | 4 |
1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 3 | 3 | 2 | 2 | 2 | 3 | 1 | 2 | 1 | 2 | 3 | 1 | 2 | 2 | 3 | 1 | 1 | 1 | 3 | 2 | 3 | 2 | 1 | 3 | 2 | 1 | 3 | 2 | 2 | 2 | 1 | 2 | 1 | 0 | 3 | 1 | 1 | 1 | 2 | 2 | 1 | 2 | 2 | 3 | 1 | 3 | 0 | 2 | 2 | 5 | 5 | 1 | 6 | 4 | 5 | 7 | 2 | 5 | 4 | 5 | 3 | 5 | 5 | 6 | 7 |
1 | 0 | 2 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 3 | 2 | 2 | 2 | 3 | 1 | 3 | 2 | 3 | 3 | 2 | 3 | 2 | 3 | 0 | 2 | 3 | 3 | 2 | 2 | 3 | 3 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 2 | 1 | 2 | 3 | 3 | 3 | 1 | 3 | 2 | 2 | 1 | 3 | 2 | 3 | 2 | 2 | 2 | 2 | 2 | 6 | 5 | 2 | 7 | 6 | 4 | 5 | 3 | 4 | 3 | 6 | 5 | 5 | 8 | 4 |
0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 2 | 3 | 1 | 3 | 1 | 2 | 2 | 2 | 1 | 3 | 3 | 0 | 1 | 2 | 1 | 3 | 2 | 2 | 0 | 3 | 2 | 2 | 2 | 1 | 2 | 2 | 3 | 3 | 1 | 2 | 2 | 1 | 2 | 3 | 1 | 2 | 3 | 2 | 2 | 4 | 2 | 5 | 5 | 4 | 4 | 6 | 5 | 6 | 5 | 4 | 5 | 3 | 7 | 4 |
1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 2 | 3 | 1 | 3 | 2 | 2 | 2 | 3 | 1 | 1 | 2 | 2 | 3 | 2 | 2 | 3 | 3 | 3 | 2 | 2 | 3 | 0 | 3 | 3 | 1 | 1 | 2 | 2 | 2 | 2 | 3 | 2 | 2 | 3 | 1 | 1 | 1 | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 1 | 2 | 3 | 1 | 1 | 5 | 7 | 6 | 6 | 3 | 4 | 4 | 6 | 6 | 3 | 7 | 5 | 3 | 5 | 3 | 5 |
0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 3 | 3 | 2 | 2 | 3 | 2 | 3 | 2 | 3 | 1 | 1 | 2 | 3 | 2 | 2 | 2 | 3 | 1 | 1 | 1 | 3 | 2 | 2 | 2 | 3 | 2 | 1 | 2 | 2 | 2 | 1 | 3 | 2 | 3 | 3 | 2 | 3 | 2 | 3 | 2 | 2 | 3 | 4 | 6 | 7 | 6 | 4 | 3 | 5 | 7 | 4 | 2 | 1 | 7 | 3 | 4 | 6 | 5 |
2 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 2 | 0 | 2 | 1 | 1 | 3 | 1 | 3 | 1 | 2 | 3 | 3 | 2 | 3 | 1 | 1 | 3 | 3 | 3 | 1 | 2 | 3 | 1 | 3 | 3 | 2 | 1 | 3 | 3 | 2 | 3 | 1 | 3 | 3 | 2 | 3 | 2 | 3 | 2 | 1 | 2 | 2 | 2 | 2 | 3 | 1 | 2 | 2 | 3 | 2 | 3 | 6 | 4 | 2 | 5 | 4 | 4 | 6 | 5 | 6 | 4 | 6 | 3 | 4 | 4 | 6 |
0 | 1 | 0 | 1 | 2 | 1 | 1 | 0 | 2 | 1 | 2 | 1 | 1 | 2 | 3 | 2 | 2 | 3 | 2 | 3 | 3 | 3 | 2 | 1 | 2 | 1 | 2 | 1 | 2 | 2 | 3 | 2 | 1 | 2 | 2 | 2 | 3 | 2 | 3 | 1 | 2 | 3 | 3 | 2 | 3 | 2 | 2 | 3 | 3 | 3 | 2 | 2 | 1 | 3 | 1 | 3 | 3 | 2 | 4 | 3 | 5 | 5 | 6 | 6 | 3 | 4 | 5 | 6 | 2 | 3 | 5 | 4 | 6 | 6 |
0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 2 | 2 | 3 | 3 | 1 | 2 | 2 | 1 | 3 | 1 | 2 | 3 | 2 | 2 | 2 | 1 | 2 | 1 | 2 | 3 | 2 | 2 | 1 | 3 | 2 | 3 | 2 | 2 | 2 | 1 | 2 | 2 | 3 | 3 | 2 | 3 | 3 | 0 | 1 | 2 | 2 | 1 | 3 | 3 | 2 | 2 | 1 | 3 | 3 | 6 | 4 | 1 | 4 | 6 | 2 | 8 | 3 | 3 | 3 | 5 | 3 | 7 | 8 | 5 |
1 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | 2 | 3 | 2 | 3 | 2 | 3 | 2 | 2 | 3 | 3 | 2 | 3 | 2 | 3 | 3 | 2 | 3 | 2 | 3 | 2 | 2 | 3 | 2 | 1 | 3 | 1 | 3 | 1 | 2 | 1 | 2 | 3 | 3 | 3 | 2 | 3 | 3 | 3 | 0 | 2 | 2 | 2 | 2 | 2 | 3 | 1 | 3 | 2 | 3 | 1 | 5 | 6 | 4 | 5 | 4 | 5 | 6 | 6 | 6 | 6 | 4 | 5 | 2 | 4 | 5 | 4 |
0 | 0 | 2 | 0 | 0 | 1 | 1 | 1 | 2 | 3 | 2 | 3 | 2 | 3 | 3 | 2 | 3 | 2 | 3 | 3 | 3 | 2 | 1 | 3 | 3 | 2 | 3 | 3 | 0 | 1 | 2 | 3 | 2 | 2 | 2 | 2 | 2 | 1 | 2 | 2 | 2 | 2 | 3 | 1 | 2 | 2 | 3 | 2 | 2 | 3 | 2 | 2 | 3 | 2 | 2 | 2 | 3 | 3 | 6 | 5 | 3 | 5 | 7 | 2 | 5 | 4 | 4 | 5 | 5 | 5 | 4 | 2 | 2 | 5 |
0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 2 | 2 | 3 | 3 | 2 | 1 | 2 | 1 | 1 | 0 | 2 | 2 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 3 | 3 | 3 | 1 | 1 | 2 | 2 | 3 | 2 | 1 | 1 | 1 | 3 | 3 | 3 | 2 | 2 | 3 | 2 | 2 | 3 | 2 | 2 | 2 | 1 | 3 | 3 | 3 | 0 | 2 | 2 | 2 | 2 | 4 | 7 | 5 | 2 | 4 | 5 | 4 | 4 | 2 | 3 | 6 | 2 | 5 | 4 |
0 | 0 | 1 | 1 | 1 | 1 | 0 | 1 | 2 | 2 | 2 | 3 | 1 | 3 | 2 | 3 | 2 | 3 | 2 | 3 | 2 | 2 | 2 | 2 | 3 | 3 | 1 | 2 | 3 | 1 | 3 | 3 | 1 | 3 | 1 | 3 | 3 | 2 | 3 | 3 | 3 | 3 | 1 | 3 | 3 | 0 | 2 | 1 | 2 | 1 | 2 | 2 | 3 | 3 | 2 | 2 | 2 | 2 | 2 | 2 | 5 | 3 | 5 | 4 | 3 | 2 | 5 | 3 | 4 | 2 | 6 | 4 | 6 | 3 |
0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 2 | 1 | 1 | 0 | 2 | 1 | 1 | 2 | 2 | 2 | 2 | 2 | 1 | 3 | 2 | 3 | 3 | 3 | 2 | 3 | 3 | 1 | 2 | 3 | 2 | 2 | 3 | 2 | 3 | 3 | 2 | 2 | 2 | 1 | 3 | 1 | 2 | 3 | 3 | 1 | 1 | 3 | 3 | 2 | 3 | 2 | 3 | 1 | 3 | 1 | 6 | 6 | 3 | 6 | 6 | 6 | 7 | 6 | 6 | 6 | 6 | 3 | 3 | 3 | 5 | 3 |
0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 3 | 3 | 2 | 1 | 3 | 1 | 2 | 3 | 3 | 3 | 2 | 2 | 1 | 2 | 2 | 1 | 2 | 1 | 3 | 2 | 2 | 1 | 3 | 2 | 1 | 0 | 1 | 3 | 2 | 2 | 3 | 2 | 2 | 1 | 2 | 2 | 2 | 3 | 2 | 2 | 3 | 3 | 3 | 1 | 3 | 2 | 1 | 2 | 2 | 2 | 5 | 6 | 2 | 3 | 4 | 6 | 3 | 6 | 5 | 5 | 1 | 5 | 7 | 3 | 4 | 4 |
Ok. There's a lot here that you need to understand and do but the previous link I supplied regarding arrays is pretty clear and covers a lot of what you're trying to do.
I would highly suggest breaking this down into steps so you can do one thing at a time and make sure that works first.
Let's start with these two.
1. Ensure values have ONLY the values they need to
2. Apply formats
3. Apply labels
For #1 if you want to check if a value has the value you expect you can use the NOT IN () operator as indicated in your question.
You should be able to build on this code and get about half way there. Not sure if the misspelling of not in() is your typo or your professors/testor.
Post back what you have when you're closer to the other parts and what you've tried for those and I'll take another look.
proc format;
value sex_fmt
0 = 'Male'
1 = 'Female';
value YesNo_fmt
0 = 'No'
1 = 'Yes';
run;
data cleanSurvey2011;
set mtsu.Survey2011;
array dem(8) dem1-dem8;
*loop through;
do i=1 to dim(dem);
if dem(i) not in (0, 1) /*list of valid values*/
then dem(i) = .; *set to missing;
end;
*repeat for each set of variables;
*apply formats;
format dem1 sex_fmt. dem2-dem8 YesNo_fmt.;
*create labels;
label dem1 = 'Sex'
dem2 = 'Do you reside in Rutherford County';
run;
*check results, print first 5 records;
proc print data= cleanSurvey2011 (obs=5) labels noobs;
run;
Here's what I think I'm doing:
libname mtsu '/courses/ddb976e5ba27fe300/PSY6560/';
run;
/* Creates a temporary data set called cleansurvey2011. */
/* The cleansurvey2011 data set should be based on the survey2011 data set, */
/* but should have the following additional characteristics. */
data cleansurvey2011;
set mtsu.survey2011;
/* The possible values for the demographic questions are 0 and 1. */
/* Any other value should be recoded as missing data. Note. */
/* Consider using the notin( ) function. */
array dem [8];
do i = 1 to 8;
if dem[i] = . then dem[i] = (missing data);
/* The possible values for the anxiety questions are 1, 2, and 3. */
/* Any other value should be recoded as missing data. */
array anx [50];
do i = 8 to 50;
if anx[i] = . then anx[i] = (missing data);
/* The possible values for the depression questions are 1, 2, 3, 4, 5, 6, and 7. */
/* Any other value should be recoded as missing data. */
array dep [18];
do i = 50 to 68;
if dep[i]= . then dep[i] = (missing data);
/* Reverse score the following depression questions: 3, 4, 8, 9, 10, and 11. A quick way to reverse code a variable is: */
/* reversed = scalemax+1 – original */
depreversed = dep18+1 - dep1;
end;
end;
end;
/* Create an average anxiety score for each person. */
avg = anx;
/* Create an average depression score for each person. */
avg = dep;
/* Assigns permanent labels and formats to each demographic variable. */
if Dem1 = 0 then Feedback = 'Male';
else if Dem1 = 1 then Feedback = 'Female';
if Dem2 = 0 then Feedback = 'No';
else if Dem2 = 1 then Feedback = 'Yes';
if Dem3 = 0 then Feedback = 'No';
else if Dem3 = 1 then Feedback = 'Yes';
if Dem4 = 0 then Feedback = 'No';
else if Dem4 = 1 then Feedback = 'Yes';
if Dem5 = 0 then Feedback = 'No';
else if Dem5 = 1 then Feedback = 'Yes';
if Dem6 = 0 then Feedback = 'No';
else if Dem6 = 1 then Feedback = 'Yes';
if Dem7 = 0 then Feedback = 'No';
else if Dem7 = 1 then Feedback = 'Yes';
if Dem8 = 0 then Feedback = 'No';
else if Dem8 = 1 then Feedback = 'Yes';
/* Print descriptive statistics for average anxiety and average depression scores. */
/* */
/* Print frequencies for the demographic variables. */
/* */
proc contents data = mtsu.survey2011;
run;
proc freq data = mtsu.survey2011;
tables dem;
run;
proc print data = cleansurvey2011;
var anx dep;
run;
Ok. There's a lot here that you need to understand and do but the previous link I supplied regarding arrays is pretty clear and covers a lot of what you're trying to do.
I would highly suggest breaking this down into steps so you can do one thing at a time and make sure that works first.
Let's start with these two.
1. Ensure values have ONLY the values they need to
2. Apply formats
3. Apply labels
For #1 if you want to check if a value has the value you expect you can use the NOT IN () operator as indicated in your question.
You should be able to build on this code and get about half way there. Not sure if the misspelling of not in() is your typo or your professors/testor.
Post back what you have when you're closer to the other parts and what you've tried for those and I'll take another look.
proc format;
value sex_fmt
0 = 'Male'
1 = 'Female';
value YesNo_fmt
0 = 'No'
1 = 'Yes';
run;
data cleanSurvey2011;
set mtsu.Survey2011;
array dem(8) dem1-dem8;
*loop through;
do i=1 to dim(dem);
if dem(i) not in (0, 1) /*list of valid values*/
then dem(i) = .; *set to missing;
end;
*repeat for each set of variables;
*apply formats;
format dem1 sex_fmt. dem2-dem8 YesNo_fmt.;
*create labels;
label dem1 = 'Sex'
dem2 = 'Do you reside in Rutherford County';
run;
*check results, print first 5 records;
proc print data= cleanSurvey2011 (obs=5) labels noobs;
run;
This is double-plus-ungood:
do i = 1 to 8; do i = 8 to 50; do i = 50 to 68;
Except under very unusual situations (I can't think of one right now) each loop should have its own index variable
Perhaps
do i = 1 to 8; do j = 1 to 50; do k = 1 to 18;
What your three statements did was
1) set I to 1
then immediately set I to 8, then again to 50.
Since the value of I=50 the arrays dem dep cause errors with subscripts because they do not have enough elements.
So DEM would use I values, ANX would use J values, which likely should be 1 to 50, and DEP would K values of 1 to 18
I don't know what you are attempting with
if dem[i] = . then dem[i] = (missing data);
but (missing data) is syntactically incorrect.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.