I have a data set (imported from a .csv file) that has data on 80 children who had to try to identify a bunch of words. Each child has 6 variables associated with him/her at the individual level: Participant, age, gender, bilingual, esl and linguistic text. Then, for each of a bunch of words, they can get a "c", "i" or missing. So it looks something like this:
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now .....
1 79 m y y y c c c c c i
2 84 f y y y c i c i i
...
Now, I would like to run analyses on the likelihood of a word being correctly guessed, so I need the data like this:
Word Participant Age Gender Bilingual ESL Linguistic Correct
I 1 79 m y y y y
I 2 84 f y y y y
.......
I_ve 1 79 m y y y y
I_ve 2 84 f y y y n
...
How can I do this?
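/* Read the sample data; FIRSTOBS=2 skips the header line */
data words;
infile cards firstobs=2;
/* The : modifier reads each one-character field with list input */
input Participant Age (Gender Bilingual ESL Linguistic I I_ve the I_m of now)(:$1.);
cards;
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now
1 79 m y y y c c c c c i
2 84 f y y y c i . c i i
;;;;
run;
proc print;
run;
/* One output row per participant and word; the word name goes into WORD */
proc transpose data=words out=want(rename=(col1=Response)) name=word;
by Participant Age Gender Bilingual ESL Linguistic;
/* i--now selects every variable from I through now in data set order */
var i--now;
run;
proc print;
run;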
@plf515 wrote:
How can I do this?
First thing, it is best to post data in the form of a data step.
data have;
   informat Participant $4. Gender Bilingual ESL Linguistic I I_ve the I_m of now $1.;
   input Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now;
datalines;
1 79 m y y y c c c c c i
2 84 f y y y c i . c i i
;
run;

proc transpose data=have out=want name=word;
   by participant age gender bilingual esl linguistic;
   var I I_ve the I_m of now; /* guessing these are "word" variables */
run;
This assumes that the data is sorted by participant and other variables on the BY statement.
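If the data are not already in that order, a sort step can go before the transpose. A minimal sketch, assuming the data set is named have as in the code above:

```sas
/* Sort by the same variables that appear on the BY statement of PROC TRANSPOSE */
proc sort data=have;
   by participant age gender bilingual esl linguistic;
run;
```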
The output variable COL1 will hold the response for each word (presumably the scores).
If you want c to display as y and i as n, apply a custom format to the variable wherever it is used, and give COL1 a label (or rename the variable as desired).
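A rough sketch of such a format, assuming the transposed data set is want and the response is still in COL1. The format name $correct and the label text are hypothetical choices:

```sas
proc format;
   value $correct        /* hypothetical format name */
      'c' = 'y'
      'i' = 'n';
run;

/* The stored values stay c and i; only the display changes */
proc print data=want label;
   format col1 $correct.;
   label col1 = 'Correct';
run;
```

Values not listed in the format, including missing responses, are displayed unchanged.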
I tend to really dislike character variables for yes/no type values and prefer 1/0 numeric values, as statistics are easier to calculate for certain types of output and some modeling procedures require dependent variables to be numeric.
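One way to get a 1/0 variable is a short data step after the transpose. This is only a sketch, assuming the transposed data set is want with the response in COL1; the names want_num and Correct are made up for illustration:

```sas
data want_num;                             /* hypothetical output data set */
   set want;
   /* 1 = correct, 0 = incorrect; anything else stays missing */
   if col1 = 'c' then Correct = 1;
   else if col1 = 'i' then Correct = 0;
   else Correct = .;
run;

/* The mean of a 1/0 variable is the proportion correct for each word */
proc means data=want_num n nmiss mean;
   class word;
   var Correct;
run;
```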
Use the %TRANSPOSE macro
http://support.sas.com/resources/papers/proceedings13/538-2013.pdf