The first thing I would do is go back to the software that collected the data and look at its options for exporting the data.
Several data collection packages I have used have options for exporting data like this. You may find there is an option to export that question's values as 14 dichotomous variables (coded 0/1, where 1 indicates selected), or as 14 variables that record the order in which responses were "checked".
From what I see, if you have a value like 1211 you do not know whether you have two responses (12 and 11) or three (1, 2 and 11). I base this on the shown value of 1110: since you apparently do not have a possible single value of 110, I have to parse that value as 11 and 10, which means the responses are stored in selection order, not value order. So you cannot tell which reading of 1211 is correct.
The raw file may have carried more information, such as a leading 0 on some values (01, 02 instead of 1, 2). In that case your shown value of 102 (which looks like 10 and 2) may originally have been the text value 0102, which would then parse cleanly as 01 and 02. I have done this with a similar field that had 45 categories.
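To see what is lost when such a field is read as a number, here is a minimal sketch. The data set name compare and the variables as_num and as_text are made up for illustration; the same text is simply read twice, once with a numeric and once with a character informat.

```sas
data compare;
   infile datalines dsd;
   /* as_num: default numeric informat, leading zeros are dropped */
   /* as_text: character informat, the digits are kept as typed   */
   input as_num as_text :$28.;
datalines;
0102,0102
01,01
040201,040201
;
run;

proc print data=compare;
run;
```

The numeric copy of 0102 should come out as 102, while the character copy keeps all four digits and can still be split into two-character codes.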
So examine your source data, and if the 01-type values have leading zeros, re-read the data as text so you keep the leading 0. Then you can parse the value from left to right, two characters at a time, into single variables. You would also be able to derive "lettuce" as
lettuce = (index(vegQ02,'01')>0 and mod(index(vegQ02,'01'),2)=1);
which is 1 when the string '01' occurs at an "odd" starting position (1, 3, 5, etc.). That avoids complications with values like 1012, where '01' appears only across the boundary between two codes.
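data example;
/* read as character (wide enough for 14 two-digit codes) so leading zeros are kept */
input vegq02 :$28.;
/* 1 if the first '01' in the string starts at an odd position */
lettuce = (index(vegQ02,'01')>0 and mod(index(vegQ02,'01'),2)=1);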
datalines;
01
0201
040201
1011
;
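One caveat with the INDEX approach: INDEX reports only the first occurrence, so a value such as 101201 (10, 12, 01) has its first '01' at position 2 and the test above returns 0 even though 01 was selected. A minimal sketch of a stricter check, stepping through the string one code at a time; the data set name example2 and the variable pos are my own choices:

```sas
data example2;
   input vegq02 :$28.;
   lettuce = 0;
   /* look at positions 1, 3, 5, ... so only whole codes are compared */
   do pos = 1 to length(vegq02) by 2;
      if substr(vegq02, pos, 2) = '01' then lettuce = 1;
   end;
   drop pos;
datalines;
01
0201
101201
1011
;
run;
```

Here lettuce should be 1 for the first three rows and 0 for 1011.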
You really have to explain what your second line of IF code was supposed to do.
BTW, it is a much better coding scheme to use 1/0 for yes/no, true/false, etc. than 1/2. With a 1/0 scheme the sum is the count of Yes/True/Present or what have you, and the mean is the proportion of Yes values.
If you look at multiple variables with the 1/0 coding scheme, such as Sum(lettuce,tomato,carrot), you get the number of responses marked Yes. Range=0 tells you they were all the same choice, max=1 indicates at least one was chosen, and min=0 indicates at least one choice was not made. With 1/2 coding you pretty much have to test each value and accumulate the test results.
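A small sketch of those summaries, assuming three 0/1 variables named lettuce, tomato and carrot. The data set veg, the derived variable names and the data values are invented for illustration.

```sas
data veg;
   input lettuce tomato carrot;
   n_yes    = sum(lettuce, tomato, carrot);          /* number marked Yes    */
   all_same = (range(lettuce, tomato, carrot) = 0);  /* all chosen or none   */
   any_yes  = (max(lettuce, tomato, carrot) = 1);    /* at least one chosen  */
   any_no   = (min(lettuce, tomato, carrot) = 0);    /* at least one skipped */
datalines;
1 0 1
0 0 0
1 1 1
;
run;

/* column sums are counts of Yes, column means are proportions of Yes */
proc means data=veg sum mean;
   var lettuce tomato carrot;
run;
```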
Another approach would be to parse the values with an array instead of a bunch of If/then/else:
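data example;
input vegq02 :$28.;
array r(14);
/* reset the flags for every record so nothing carries over between rows */
do i = 1 to dim(r);
r[i] = 0;
end;
/* step through the string two characters at a time: positions 1, 3, 5, ... */
do i = 1 to length(vegq02) by 2;
r[input(substr(vegq02,i,2),f2.)]=1;
end;
drop i;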
datalines;
01
0201
040201
1011
;
You could assign names like "lettuce", "tomato", "carrot", etc. to the array elements if you want, or just assign labels to r1 to r14.
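A sketch of the named-element variant follows. Only 01 = lettuce comes from this thread; the names tomato, carrot and veg4-veg14, the label text, and the data set name example_named are placeholders to replace with your real categories.

```sas
data example_named;
   input vegq02 :$28.;
   /* element 1 is lettuce, 2 is tomato, 3 is carrot, then veg4 to veg14 */
   array r(14) lettuce tomato carrot veg4-veg14;
   do k = 1 to dim(r);
      r[k] = 0;
   end;
   do i = 1 to length(vegq02) by 2;
      r[input(substr(vegq02, i, 2), 2.)] = 1;
   end;
   drop i k;
   /* labels work the same way if you keep generic variable names */
   label veg4 = 'Choice 04 (placeholder label)';
datalines;
01
0201
040201
1011
;
run;
```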