I have a data set (imported from a .csv file) that has data on 80 children who had to try to identify a bunch of words. Each child has 6 variables associated with him/her at the individual level: Participant, age, gender, bilingual, esl and linguistic text. Then, for each of a bunch of words, they can get a "c", "i" or missing. So it looks something like this:
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now .....
1 79 m y y y c c c c c i
2 84 f y y y c i c i i
.
.
.
.
Now, I would like to run analyses on the likelihood of a word being correctly guessed, so I need the data like this:
Word Participant Age Gender Bilingual ESL Linguistic Correct
I 1 79 m y y y y
I 2 84 f y y y y
.......
I_ve 1 79 m y y y y
I_ve 2 84 f y y y y
...
How can I do this?
data words;
infile cards firstobs=2;
input Participant Age (Gender Bilingual ESL Linguistic I I_ve the I_m of now)(:$1.);
cards;
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now
1 79 m y y y c c c c c i
2 84 f y y y c i . c i i
;;;;
run;
proc print;
run;
proc transpose data=words out=want(rename=(col1=Response)) name=word;
by Participant Age Gender Bilingual ESL Linguistic;
var i--now;
run;
proc print;
run;
@plf515 wrote:
I have a data set (imported from a .csv file) that has data on 80 children who had to try to identify a bunch of words. Each child has 6 variables associated with him/her at the individual level: Participant, age, gender, bilingual, esl and linguistic text. Then, for each of a bunch of words, they can get a "c", "i" or missing. So it looks something like this:
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now .....
1 79 m y y y c c c c c i
2 84 f y y y c i c i i
.
.
.
.
Now, I would like to run analyses on the likelihood of a word being correctly guessed, so I need the data like this:
Word Participant Age Gender Bilingual ESL Linguistic Correct
I 1 79 m y y y y
I 2 84 f y y y y
.......
I_ve 1 79 m y y y y
I_ve 2 84 f y y y y
...
How can I do this?
First thing, it is best to post data in the form of a data step.
data have; informat Participant $4. Gender Bilingual ESL Linguistic I I_ve the I_m of now $1. ; input Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now ; datalines; 1 79 m y y y c c c c c i 2 84 f y y y c i . c i i ; run; proc transpose data =have out=want name=word; by participant age gender bilingual esl linguistic; var I I_ve the I_m of now ;/* guessing these are "word" variables*/ run;
This assumes that the data is sorted by participant and other variables on the BY statement.
The output variable COL1 will have the values of the scores(?).
If you want c to display as y and I as n (note that the forum won't allow us to show a single lower case I ) then apply a custom format to the variable when used and a label for Col1 (or rename the variable as desired)
I tend to really dislike character variables for yes/no type values and prefer 1/0 numeric values as statistics are easier to calculate for certain types of output and some model procedures require dependent values to be numeric.
Use the %TRANSPOSE macro
http://support.sas.com/resources/papers/proceedings13/538-2013.pdf
data words;
infile cards firstobs=2;
input Participant Age (Gender Bilingual ESL Linguistic I I_ve the I_m of now)(:$1.);
cards;
Participant Age Gender Bilingual ESL Linguistic I I_ve the I_m of now
1 79 m y y y c c c c c i
2 84 f y y y c i . c i i
;;;;
run;
proc print;
run;
proc transpose data=words out=want(rename=(col1=Response)) name=word;
by Participant Age Gender Bilingual ESL Linguistic;
var i--now;
run;
proc print;
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.