PoissonIV
Calcite | Level 5

Hello!

 

I'm a grad student, and I'm working on a thesis. I have data from an online survey we administered through Redcap. The data is in English and Spanish. However, the way Redcap administered the survey and collected the data was as if the English and Spanish surveys were different sections of the same survey. So I have a bunch of participants, and each participant has variables in English & Spanish. The English speakers are missing the Spanish data, and the Spanish speakers are missing the English data. (I hope this makes sense). My question is, how do I pool all participants together? So if I were to run a frequency table on something like Education Level, the table would have both English and Spanish speakers? 

I was told that I could try using an array, but it came up with this error when I tried.

16
17 /* Numeric Arrays */
18
19 Array English (i) q2___1-q2___10 q3 q4 q6 q5 q44 q7 q8 q9 q11-q17e___9 q17f___1-q17f___7
19 ! q18-q28___5 q29 q30___1-q32___12 q33___1-q33___13 q34-q35___9 q36___1-q36___9 q37 q38 q39;
ERROR: Alphabetic prefixes for enumerated variables (q11-q17e___9) are different.
ERROR: Alphabetic prefixes for enumerated variables (q18-q28___5) are different.
ERROR: Alphabetic prefixes for enumerated variables (q30___1-q32___12) are different.
ERROR: Alphabetic prefixes for enumerated variables (q34-q35___9) are different.
20 Array Spanish (i) q2_sp___1-q2_sp___10 q3_sp q4_sp q6_sp q5_sp q44_sp q7_sp q8_sp q9_sp
20 ! q11_sp-q17e_sp___9 q17f_sp___1-q17f_sp___7 q18_sp-q28_sp___5 q29_sp q30_sp___1-q32_sp___13
20 ! q33_sp___1-q33_sp___13 q34_sp-q35_sp___9 q36_sp___1-q36_sp___9 q37_sp q38_sp q39_sp;
ERROR: Missing numeric suffix on a numbered variable list (q11_sp-q17e_sp___9).
ERROR: Missing numeric suffix on a numbered variable list (q18_sp-q28_sp___5).
ERROR: Alphabetic prefixes for enumerated variables (q30_sp___1-q32_sp___13) are different.
ERROR: Missing numeric suffix on a numbered variable list (q34_sp-q35_sp___9).
21 do i= 148;
22 if languages1=2 then Spanish (i) = English (i);
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
23
24 end;
25
26 /* Character Arrays */
27
28 Array EnglishC (i) q1 otherq2 otherq4 otherq6 otherq7 otherq8 q10 otherq17e otherq17f otherq28
28 ! otherq29 otherq32 otherq33 otherq35 otherq36 contact;
29 Array SpanishC (i) q1_sp otherq2_sp otherq4_sp otherq6_sp otherq7_sp otherq8_sp q10_sp
29 ! otherq17e_sp otherq17f_sp otherq28_sp otherq29_sp otherq32_sp otherq33_sp otherq35_sp
29 ! otherq36_sp contact_sp;
30 do i= 148;
31
32 if languages1=2 then Spanish (i) = English (i);
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
33
34 end;

ACCEPTED SOLUTION
ballardw
Super User

Did you try the bit of code I suggested earlier?

 

I strongly suggest getting all of the values into one set of variables before attempting to recode, such as your race_eth variable.

Note that if more than one of your variables q2___1-q2___10 has the value 1, then your Race_eth variable as coded will only reflect the first variable in the IF/ELSE chain that has the value 1. Is that the desired behavior?

 

I am not sure exactly what you were attempting with

 do i= 148;
    if languages1=2 then Spanish (i) = English (i);
end;

With a single value of i, the loop body runs once and would process only element 148 of each array, assuming there are at least 148 variables. Did you want to place the values of the English array into the Spanish variables?

The "Mixing of implicit and explicit array subscripting" error most likely comes from the array definitions rather than from Languages1: Array English (i) declares an implicitly subscripted array with i as its index variable, while Spanish (i) = English (i) references the arrays with an explicit subscript. Declaring the arrays with (*) and looping from 1 to dim() avoids mixing the two styles.
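For comparison, here is a minimal sketch of a loop that visits every element of two arrays declared with (*) and referenced with an explicit index. The variables x1-x3 and y1-y3 are made up for illustration and are not from the survey data.

```sas
data _null_;
   /* Hypothetical values standing in for survey responses */
   x1 = 1; x2 = 2; x3 = 3;
   y1 = .; y2 = .; y3 = .;

   /* (*) lets SAS count the elements; both arrays use explicit subscripts */
   array src (*) x1-x3;
   array dst (*) y1-y3;

   /* Loop over every element instead of a single fixed index */
   do i = 1 to dim(src);
      dst(i) = src(i);
   end;

   /* Write the copied values to the log */
   put y1= y2= y3=;
run;
```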

 

Again, I recommend getting all the values into one set of variables, whether the English or the Spanish; then your Race_sum would be better coded as:

race_sum = sum(of q2___1 - q2___10);

Better in two senses. First, and more obviously, the code is much shorter and easier to follow. Second, if any of the variables in a series of + operations is missing then the result will be missing, whereas the SUM function ignores missing values.
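A small sketch of the difference, using three made-up variables a, b and c (not from the survey data), one of which is missing:

```sas
data _null_;
   /* Hypothetical values; b is deliberately missing */
   a = 1; b = .; c = 1;

   /* The + operator propagates the missing value */
   plus_total = a + b + c;

   /* The SUM function skips missing arguments */
   sum_total = sum(of a b c);

   /* Write both results to the log for comparison */
   put plus_total= sum_total=;
run;
```

SAS also writes a note to the log that missing values were generated by the + expression, which is a useful warning sign when it shows up unexpectedly.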

 

Your error with

70 proc freq label;
-----
22
202

is because LABEL is not a valid option on the PROC FREQ statement.

 

This error:

74 proc freq ;
75 table race_sum * q2___1 * q2___2 * q2___3 * q2___4 * q2___5 * q2___6 * q2___7 * q2___8 *
75 ! q2___9 * q2___10/list missing;
ERROR: Variable RACE_SUM not found.
76 run;

is because you have the Race_sum (and race_eth) assignment code commented out: it sits between /* */ in the data step, so those statements do not execute and the variables are never created.
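To illustrate, here is a sketch in which one assignment is inside a comment and one is live. The data set work.demo and the variable race_count are hypothetical names used only for this example.

```sas
data work.demo;
   /* Hypothetical checkbox values */
   q2___1 = 1; q2___2 = 0;

   /* race_sum = sum(of q2___1-q2___2); */   /* inside a comment: never runs, race_sum is not created */

   race_count = sum(of q2___1-q2___2);       /* live statement: race_count is created */
run;

/* The variable listing shows race_count but no race_sum */
proc contents data=work.demo varnum;
run;
```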

 

Hint for the long run: even though SAS will run most procedures against the most recently created data set when none is specified, you really want to get into the habit of naming the exact data set to use. For debugging purposes you may intend to run Proc Freq or another procedure against a specific set but forget that it was not the last one created, which is the one SAS uses. You can end up spending a lot of time trying to figure out where 3 records went, or why variable XXX is not recoding correctly, when you think Proc Freq is using data set AAA but data set BBB was the last created.
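A sketch of that habit, using the sample table sashelp.class so that it runs anywhere; with the data in this thread, the DATA= value would presumably be work.ParisAnalysis.

```sas
/* Name the input table explicitly instead of relying on the last data set created */
proc freq data=sashelp.class;
   tables sex / missing;
run;
```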


5 REPLIES 5
ballardw
Super User

Did the data come in one file or two?

How did you read that data into SAS? As in what code was used?

Are all of the question variables of the same type? If not you need to split numeric from character.

When the variables are not actually sequentially numbered (for example q11-q17e___9), you have to list Q11, Q12 (if any), Q13, Q14, etc. separately, or use the double hyphen, as in q11--q17e___9, which selects variables by their position (column order) in the data set rather than by numbered names.
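A minimal sketch of the two list styles, built on a tiny made-up data set (work.have and its four variables are assumptions for illustration, with names shaped like the ones in the question):

```sas
/* Hypothetical data: column order is q11, q12, q17e___1, q17e___2 */
data work.have;
   q11 = 1; q12 = 0; q17e___1 = 1; q17e___2 = 0;
run;

data _null_;
   set work.have;

   /* Single hyphen: numbered range, same prefix with a numeric suffix */
   array by_name (*) q17e___1-q17e___2;

   /* Double hyphen: every variable from q11 through q17e___2 in column order */
   array by_pos (*) q11--q17e___2;

   /* by_name covers 2 variables, by_pos covers all 4 */
   n_by_name = dim(by_name);
   n_by_pos  = dim(by_pos);
   put n_by_name= n_by_pos=;
run;
```

Because the double hyphen depends on column order, it is worth checking that order with Proc Contents and the VARNUM option before relying on it.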

 

You can use COALESCE, or COALESCEC if the variables hold character values, to select the first non-missing value into one variable.

If you don't run into problems, then code such as the following will move all the Spanish responses into the English variables. If your data is character then, as I say above, use COALESCEC, or you may need two variables.

data want; 
   set have;
   /* Single hyphen = numbered range (same prefix); double hyphen = positional range (column order). */
   /* Both arrays must list matching variables in the same order. */
   Array English (*) q2___1-q2___10 q3 q4 q6 q5 q44 q7 q8 q9 q11--q17e___9 q17f___1-q17f___7
         q18--q28___5 q29 q30___1--q32___12 q33___1-q33___13 q34--q35___9 q36___1-q36___9 q37 q38 q39;
   Array Spanish (*) q2_sp___1-q2_sp___10 q3_sp q4_sp q6_sp q5_sp q44_sp q7_sp q8_sp q9_sp
         q11_sp--q17e_sp___9 q17f_sp___1-q17f_sp___7 q18_sp--q28_sp___5 q29_sp q30_sp___1--q32_sp___13
         q33_sp___1-q33_sp___13 q34_sp--q35_sp___9 q36_sp___1-q36_sp___9 q37_sp q38_sp q39_sp;
   /* Keep the English value when present; otherwise take the Spanish value */
   do i=1 to dim(English);
      English(i) = Coalesce(English(i),Spanish(i));
   end;
   drop i;
run;
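For the character variables, the same pattern might look like the sketch below. It assumes a tiny made-up data set (work.have_c, with invented values) that borrows two of the character variable names from the question; COALESCEC returns the first non-blank argument.

```sas
/* Hypothetical character data: one answer given in English, one in Spanish */
data work.have_c;
   length otherq2 otherq2_sp q10 q10_sp $20;
   otherq2 = 'Mixed'; otherq2_sp = ' ';
   q10     = ' ';     q10_sp     = 'Otro';
run;

data work.want_c;
   set work.have_c;

   /* Character arrays are kept separate from the numeric ones */
   array EnglishC (*) otherq2 q10;
   array SpanishC (*) otherq2_sp q10_sp;

   /* Keep the English text when present; otherwise take the Spanish text */
   do i = 1 to dim(EnglishC);
      EnglishC(i) = coalescec(EnglishC(i), SpanishC(i));
   end;
   drop i;
run;
```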

Danger: Do not reuse the same data set name on the Data and Set statements. Overwriting data sets when recoding data this way may cause loss of values if there is a logic problem.

AFTER you have verified that the data step does what is needed, you could rerun the step and drop all the Spanish variables (or the English ones, if you moved the values the other way around).
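One possible way to drop the Spanish variables without typing every name is to build the list from the dictionary tables. This is only a sketch: work.want here is a small made-up stand-in, the macro variable sp_vars and the output name work.final are hypothetical, and the filter assumes that "_sp" appears only in the names of Spanish variables.

```sas
/* Hypothetical stand-in for the combined data set */
data work.want;
   q3 = 1; q3_sp = .; q2___1 = 0; q2_sp___1 = .;
run;

/* Collect the names of all variables containing _SP into a macro variable */
proc sql noprint;
   select name into :sp_vars separated by ' '
   from dictionary.columns
   where libname = 'WORK' and memname = 'WANT'
     and index(upcase(name), '_SP') > 0;
quit;

/* Write a new data set without the Spanish variables; the input is left untouched */
data work.final;
   set work.want (drop=&sp_vars);
run;
```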

 

Probably if this had been my project I would have written code that made sure the variable names were the same for English and Spanish, did not have all those extra underscore characters and possibly a few other things.

 

 

PoissonIV
Calcite | Level 5
Hi! The data is from 1 file. Here is my whole log if it helps. I used the libname statement to read it in. Thank you for your advice so far! I will try it! I'm very new at using Redcap and SAS, so to be honest I had no idea what I was doing when I was making the survey in the first place... haha hindsight is 20/20! 🙂

NOTE: Unable to open SASUSER.PROFILE. WORK.PROFILE will be opened instead.
NOTE: All profile changes will be lost at the end of the session.
1 libname hpv "G:\Shared drives\Can Prevent HPV Participant\Quant Data\SAS Data\HPV Quant -
1 ! Interns";
NOTE: Libref HPV was successfully assigned as follows:
Engine: V9
Physical Name: G:\Shared drives\Can Prevent HPV Participant\Quant Data\SAS Data\HPV Quant -
Interns
2 OPTIONS nofmterr;
3 OPTIONS FMTSEARCH = (hpv.hpvformats);
4
5 data ParisAnalysis; *THIS IS CURRENTLY A TERMPORAY FILE. IF YOU WANT TO CREATE A PERMANENT ONE
5 ! THAT HAS NEW VARIABLES,
6 DELETED VARIABLES, DELETED CASES, THEN WRITE "data hpv.NameOfNewDataSet";
7 set hpv.deidhpv;
8 /*** RECODE VARIABLES HERE ***/
9 /*** DELETE VARIABLES WE DONT NEED - ESP PERSONAL IDENTIFYING ONES ***/
10 /*** DELETER CASES WE DONT WANT - E.G., TOO YOUNG, TOO OLD ***/
11
12 if 0<=q1<18 then delete;
13 if 30<q1<100 then delete;
14 if 0<=q1_sp<18 then delete;
15 if 30<q1_sp<100 then delete;
16
17 /* Numeric Arrays */
18
19 Array English (i) q2___1-q2___10 q3 q4 q6 q5 q44 q7 q8 q9 q11-q17e___9 q17f___1-q17f___7
19 ! q18-q28___5 q29 q30___1-q32___12 q33___1-q33___13 q34-q35___9 q36___1-q36___9 q37 q38 q39;
ERROR: Alphabetic prefixes for enumerated variables (q11-q17e___9) are different.
ERROR: Alphabetic prefixes for enumerated variables (q18-q28___5) are different.
ERROR: Alphabetic prefixes for enumerated variables (q30___1-q32___12) are different.
ERROR: Alphabetic prefixes for enumerated variables (q34-q35___9) are different.
20 Array Spanish (i) q2_sp___1-q2_sp___10 q3_sp q4_sp q6_sp q5_sp q44_sp q7_sp q8_sp q9_sp
20 ! q11_sp-q17e_sp___9 q17f_sp___1-q17f_sp___7 q18_sp-q28_sp___5 q29_sp q30_sp___1-q32_sp___13
20 ! q33_sp___1-q33_sp___13 q34_sp-q35_sp___9 q36_sp___1-q36_sp___9 q37_sp q38_sp q39_sp;
ERROR: Missing numeric suffix on a numbered variable list (q11_sp-q17e_sp___9).
ERROR: Missing numeric suffix on a numbered variable list (q18_sp-q28_sp___5).
ERROR: Alphabetic prefixes for enumerated variables (q30_sp___1-q32_sp___13) are different.
ERROR: Missing numeric suffix on a numbered variable list (q34_sp-q35_sp___9).
21 do i= 148;
22 if languages1=2 then Spanish (i) = English (i);
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
23
24 end;
25
26 /* Character Arrays */
27
28 Array EnglishC (i) q1 otherq2 otherq4 otherq6 otherq7 otherq8 q10 otherq17e otherq17f otherq28
28 ! otherq29 otherq32 otherq33 otherq35 otherq36 contact;
29 Array SpanishC (i) q1_sp otherq2_sp otherq4_sp otherq6_sp otherq7_sp otherq8_sp q10_sp
29 ! otherq17e_sp otherq17f_sp otherq28_sp otherq29_sp otherq32_sp otherq33_sp otherq35_sp
29 ! otherq36_sp contact_sp;
30 do i= 148;
31
32 if languages1=2 then Spanish (i) = English (i);
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
ERROR: Mixing of implicit and explicit array subscripting is not allowed.
33
34 end;
35
36
37 /*race_sum = q2___1 + q2___2 + q2___3 + q2___4 + q2___5 + q2___6 + q2___7 + q2___8 + q2___9 +
37 ! q2___10;
38
39
40 if q2___1 = 1 then race_eth = 1; else
41 if q2___2 = 1 then race_eth = 2; else
42 if q2___3 = 1 then race_eth = 3; else
43 if q2___4 = 1 then race_eth = 4; else
44 if q2___5 = 1 then race_eth = 5; else
45 if q2___6 = 1 then race_eth = 6; else
46 if q2___7 = 1 then race_eth = 7; else
47 if q2___8 = 1 then race_eth = 8; else
48 if q2___9 = 1 then race_eth = 9; else
49 if q2___10 = 1 then race_eth = 10; else
50
51 if q2_sp___1 = 1 then race_eth = 1; else
52 if q2_sp___2 = 1 then race_eth = 2; else
53 if q2_sp___3 = 1 then race_eth = 3; else
54 if q2_sp___4= 1 then race_eth = 4; else
55 if q2_sp___5 = 1 then race_eth = 5; else
56 if q2_sp___6= 1 then race_eth = 6; else
57 if q2_sp___7 = 1 then race_eth = 7; else
58 if q2_sp___8 = 1 then race_eth = 8; else
59 if q2_sp___9 = 1 then race_eth = 9; else
60 if q2_sp___10= 1 then race_eth = 10;*\
61
62 run;
63
64 /*** DO PROCEDURES HERE ***/
65
66

NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
12:7 13:7 14:7 15:7
NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WORK.PARISANALYSIS may be incomplete. When this step was stopped there were
0 observations and 370 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.00 seconds



67 Proc contents varnum;
NOTE: Writing HTML Body file: sashtml.htm
68 run;

NOTE: PROCEDURE CONTENTS used (Total process time):
real time 0.54 seconds
cpu time 0.45 seconds


69
70 proc freq label;
-----
22
202
ERROR 22-322: Syntax error, expecting one of the following: ;, COMPRESS, DATA, FC, FORMCHAR,
NLEVELS, NOPRINT, ORDER, PAGE.
ERROR 202-322: The option or parameter is not recognized and will be ignored.
71 table race_eth q1 * q1_sp/missing;
72 run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

73


74 proc freq ;
75 table race_sum * q2___1 * q2___2 * q2___3 * q2___4 * q2___5 * q2___6 * q2___7 * q2___8 *
75 ! q2___9 * q2___10/list missing;
ERROR: Variable RACE_SUM not found.
76 run;

NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds

77


78 proc freq;
79 table language_selectionse_v_0 languages1 language_selectionse_v_1;
80 run;

NOTE: No observations in data set WORK.PARISANALYSIS.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.00 seconds
cpu time 0.01 seconds





ChrisNZ
Tourmaline | Level 20

Show us the line of code that generates

19 Array English (i) q2___1-q2___10 q3 q4 q6 q5 q44 q7 q8 q9 q11-q17e___9 q17f___1-q17f___7
19 ! q18-q28___5 q29 q30___1-q32___12 q33___1-q33___13 q34-q35___9 q36___1-q36___9 q37 q38 q39;

I suspect it might contain a macro variable?

 

In any case, the code makes assumptions about variable names, and these assumptions (that the names are suffixed with constant increments) are not (or no longer) valid.

q11-q17e___9

is not a valid list of names (as clearly indicated by the log message).

 

Bottom line: fix the variable lists used as array elements.

 

If they are contiguous in the table, it may be as simple as using a double hyphen:

q11--q17e___9

 

Also note that 
Array English (i)

is invalid syntax. So this log looks very suspicious.

 

Tom
Super User

The use of an index variable in the definition of an array is valid SAS syntax. 

Try it.

data _null_;
  set sashelp.class (obs=3);
  put _n_=;
  array vars (index) sex name ;
  do over vars;
    put index= vars= ;
  end;
run;

They seem to have removed the documentation, but the code still works the way it did in 1983 when I learned SAS.
