Subsetting and Creating new variables

Occasional Contributor
Posts: 13

The attached data set has 31 variables; each variable is the response to a quiz item and is coded 1=true and 2=false. Missing data are coded as either 0 or 9. The first 17 variables are responses to a pretest. A posttest was given afterwards. Only the first 14 items on the pretest were repeated on the posttest, so there are a total of 31 variables (17 pretest responses, 14 posttest responses). One attachment is the .txt file and the other is a Word doc so you can see the layout slightly better.

1) How do I read the data into SAS and create a permanent SAS data set? 

2) The correct answers are:  1. True  2. True 3. False 4. False 5. True 6. True 7. False 8. False 9. False 10. True 11. True 12. True 13. True 14. True 15. True 16. True 17. True

Create two new variables, one for the proportion of correct responses on the pretest and one for the proportion of correct responses on the posttest. Missing values should be considered an incorrect answer. 

3) Compute appropriate summary statistics to determine whether scores were higher on the posttest or the pretest.

 

Here's what I attempted:

LIBNAME MyLib "D:\Statistical Data Management/MyLib";
data mylib.DataHW4;
infile 'D:\Statistical Data Management\MyLib\Homework\Assignment #4\DataHW4.txt'
delimiter = '09'x;
input ID 1-12 PreTest 59-75;
run;
data mylib.DataHW4PreTest;
SET mylib.dataHW4;
array X[1] PreTest;
do i=59 to 75;
when(1) answer = 'true';
when(2) answer = 'false';
end;
output;
end;
run;

 

That didn't get me far, not even past number 1. Any help is appreciated!

Super User
Posts: 8,609

Re: Subsetting and Creating new variables

Do you need to decode the values?  In SAS you can apply a format to variables to display something different, e.g.:

libname mylib "D:\Statistical Data Management/MyLib";

proc format;
  value yn
    1="true"
    2="false";
run;

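data mylib.datahw4;
  infile 'D:\Statistical Data Management\MyLib\Homework\Assignment #4\DataHW4.txt' delimiter = '09'x;
  /* Read the 17 pretest columns as one character string; 17 digits are too many to hold exactly in a numeric */
  input ID 1-12 pretest $ 59-75;
  array pt{17};
  do i=1 to 17;
    /* Convert the i-th character of the string to a numeric response code */
    pt{i}=input(char(pretest,i),1.);
  end;
  format pt: yn.;
  drop i;
run;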

This creates a format to be applied, then the data step splits each character of pretest out into its own numeric variable (holding the raw code, 1 or 2) and applies the format to all pt... variables (the : means any variable whose name starts with pt).

Occasional Contributor
Posts: 13

Re: Subsetting and Creating new variables

Hm, okay. Then how would I add a missing values statement? Since this is treated like a quiz, no response means the answer is wrong. How would I write a statement for that? Would that involve creating new variables and/or subsetting?

Respected Advisor
Posts: 4,274

Re: Subsetting and Creating new variables

@trash

It's often beneficial to code True as 1, False as 0 and everything else as Missing.

If you have to differentiate between different kinds of, or reasons for, Missing, then you can use special missing values.

http://documentation.sas.com/?docsetId=lrcon&docsetTarget=p1xr9fm7y8kek5n1hpj008tnu1a1.htm&docsetVer...
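A minimal sketch of how special missing values behave, in case that helps. The variable names a, b, c, d, total and n_miss are made up for this illustration; .z and .n are just two of the special missing values SAS allows.

```sas
data _null_;
  a = 1;     /* True */
  b = .z;    /* special missing value, e.g. for a raw code of 0 */
  c = .n;    /* special missing value, e.g. for a raw code of 9 */
  d = 0;     /* False */

  /* sum() ignores ordinary and special missing values alike */
  total  = sum(a, b, c, d);

  /* nmiss() counts both kinds of missing value */
  n_miss = nmiss(a, b, c, d);
  put total= n_miss=;

  /* A special missing value can still be tested for individually */
  if b = .z then put 'b holds the special missing value .Z';
run;
```

The point is that the reason for the missing value stays recoverable, while functions such as sum() treat all of them as missing.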

 

The following code uses a custom informat which recodes your source values when reading into SAS.

The custom format then displays your values as True, False or Missing (do not confuse the internal values with what you see).

 

Because True is now 1, False is 0 and everything else is missing, we can simply divide the sum of the answer codes by the total number of items to populate your proportion variables.

proc format;
  invalue OneZeroMiss (default=1)
    1=1
    2=0
    0=.z
    9=.n
    other=.
    ;
  value TrueFalseMiss (default=7)
    1='True'
    0='False'
    other='Missing'
    ;
run;

data work.datahw4;
  attrib group_id length=$12;
  attrib prop_pretest prop_posttest format=percent8.1;

  array pretest_  {17} 3;  /* 17 pretest items in columns 59-75 */
  array posttest_ {14} 3;
  attrib pretest_:  format=TrueFalseMiss.;
  attrib posttest_: format=TrueFalseMiss.;

  infile 'c:\temp\DataHW4 - Copy.txt' firstobs=5 truncover;
  input 
    group_id $ 1-12 
    @59 (pretest_[*])  (OneZeroMiss1.) 
    @79 (posttest_[*]) (OneZeroMiss1.)
    ;

  prop_pretest  =sum(of pretest_[*])/dim(pretest_);
  prop_posttest =sum(of posttest_[*])/dim(posttest_);

run;
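The step above writes to the WORK library, which is cleared when the session ends. A possible sketch for making the data set permanent and getting summary statistics for both proportions follows. It assumes the libref and folder from the original question, and the variable diff is a name introduced here only for illustration.

```sas
/* Folder taken from the original question; adjust to your own location */
libname mylib "D:\Statistical Data Management\MyLib";

/* Copy the WORK data set into the permanent library */
data mylib.datahw4;
  set work.datahw4;
  /* A positive difference means the posttest proportion is higher */
  diff = prop_posttest - prop_pretest;
run;

/* Descriptive statistics for both proportions and their difference */
proc means data=mylib.datahw4 n mean std min max;
  var prop_pretest prop_posttest diff;
run;
```

Comparing the two means, or looking at the mean of diff, is one simple way to see which test had the higher scores.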

 

Occasional Contributor
Posts: 13

Re: Subsetting and Creating new variables

This is very helpful, thank you!!! Just one question...how does it know the correct answers? without those, the proportions would be wrong, I would think. Sorry, I'm very new to SAS!!! 

Respected Advisor
Posts: 4,274

Re: Subsetting and Creating new variables

@trash

"...how does it know the correct answers"

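With this coding, a True response has an internal value of 1, a False response 0, and everything else is missing, so sum() returns the number of True responses. That equals the number of correct answers only for items keyed True; items 3, 4, 7, 8 and 9 are keyed False, so each response still has to be compared with the answer key before summing.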
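A minimal sketch of scoring each response against the answer key from the question, in case the items keyed False (3, 4, 7, 8, 9) should count as correct when answered False. It assumes work.datahw4 from the step above, with responses stored as 1=True, 0=False and missing otherwise. The names work.scored, key, n_pre_correct and n_post_correct are made up here.

```sas
data work.scored;
  set work.datahw4;

  /* Response arrays; values are 1=True, 0=False, missing otherwise */
  array pre  {*} pretest_:;
  array post {*} posttest_:;

  /* Answer key for items 1-17 as listed in the question (1=True, 0=False) */
  array key {17} _temporary_ (1 1 0 0 1 1 0 0 0 1 1 1 1 1 1 1 1);

  n_pre_correct = 0;
  do i = 1 to dim(pre);
    /* A missing response never equals the key, so it counts as incorrect */
    n_pre_correct = n_pre_correct + (pre{i} = key{i});
  end;

  n_post_correct = 0;
  do i = 1 to dim(post);
    /* The posttest repeats pretest items 1-14, so the same key applies */
    n_post_correct = n_post_correct + (post{i} = key{i});
  end;

  /* Overwrite the proportions with the key-based scores */
  prop_pretest  = n_pre_correct  / dim(pre);
  prop_posttest = n_post_correct / dim(post);
  format prop_pretest prop_posttest percent8.1;
  drop i;
run;
```

The comparison (pre{i} = key{i}) evaluates to 1 when the response matches the key and 0 otherwise, so missing responses are scored as wrong without any extra code.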

Super Contributor
Posts: 407

Re: Subsetting and Creating new variables

proc format;
value yn
1="true"
2,.="false";
run;

Untested. Reading the documentation is highly recommended, especially the sections explaining the basic concepts of the SAS language.