<It's still not clear if you want to copy two characters from the original variable,>
I want to copy 2 characters from the original variable.
Great. Then I'll stick with the solution that I posted ... no changes required.
@Astounding This code is producing the desired result, but I'm getting a warning message: "NOTE: Invalid second argument to function SUBSTR at line 180 column 18."
What am I doing wrong here?
180 new{i,j} = substr(original{i}, 2*j-1);
181 end;
182 end;
183 run;
There are 57 variables, not 44 as previously mentioned.
57 DATA want;
58 SET have;
59
60 array original {57}
61 _400201
62 _400202...;
118
119 array new {57,3} $ 2
120 _400201_part1-_400201_part3
121 _400202_part1-_400202_part3...;
177
178 do i=1 to 57;
179 do j=1 to 3;
180 new{i,j} = substr(original{i}, 2*j-1);
181 end;
182 end;
183 run;
That shouldn't happen. The only cause that comes to mind is that one of your 57 original variables is defined as having a length less than $5. In that case, it shouldn't really be on the list with the other variables.
Do you get a single error or multiple errors? If you can, post the line mentioned in the log.
This shows that you still have a problem, but isn't enough to diagnose it. It looks like you found 4 variables were not really long enough, and removed them from the array. That's a good start. Probably, there are more. Try running a PROC CONTENTS on your incoming data set, and examining the report about the variables and their lengths. It is likely there are more variables that should be removed. If that doesn't tell you enough, you'll need to post more of the log ... the complete log from that DATA step, as well as the complete set of messages that are generated for that DATA step.
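As a rough sketch of that check, the step below writes the variable metadata to a data set and prints only the short character variables. The output data set name VARINFO is made up for illustration, and the filter lists every character variable shorter than 6 in HAVE, not only the ones named in the array.

```sas
/* Write variable metadata for HAVE to a data set instead of a report. */
/* VARINFO is a hypothetical name chosen for this example.             */
proc contents data=have out=varinfo (keep=name type length) noprint;
run;

/* In the OUT= data set, TYPE=2 marks character variables.             */
/* Anything listed here is too short to hold three 2-character parts.  */
proc print data=varinfo;
  where type = 2 and length < 6;
run;
```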
Actually, I didn't exclude those 4 variables b/c of their length, but the lengths of the variables in the array are different looking at the PROC CONTENTS. Do the lengths all have to be the same? The informat lengths? Neither? Should I set the length with a LENGTH statement further up the DATA step?
Well, the length of the variables in the array has to be at least $5 for the program to stop giving you messages. And it has to be at least $6 for your program to make sense. After all, you wanted to copy the 5th and 6th characters into a new variable. For that to make sense, your variable has to be at least 6 characters long.
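To see the two cases side by side, here is a small self-contained sketch. The variable names long_var, short_var, part_long and part_short are invented for the demonstration, and the lengths are assumptions chosen to trigger the message.

```sas
data _null_;
  length long_var $ 6 short_var $ 3;
  long_var  = '010203';
  short_var = '101';

  /* Start positions 1, 3 and 5 all exist in a $6 variable. */
  do j = 1 to 3;
    part_long = substr(long_var, 2*j-1, 2);
    put j= part_long=;
  end;

  /* Position 5 lies beyond a $3 variable, so the log should show      */
  /* "NOTE: Invalid second argument to function SUBSTR" for this line. */
  part_short = substr(short_var, 5);
  put part_short=;
run;
```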
I used a LENGTH statement after the DATA statement and the WARNING was eliminated. I am seriously grateful for all your help in this forum!
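For anyone reading later, a minimal sketch of that kind of fix is below. It assumes _400201 is a character variable stored with length 3 in HAVE (the demo data set is built here only so the example runs on its own), and part3 is a hypothetical name. The key point is that the LENGTH statement comes before SET.

```sas
/* Illustrative input: _400201 is stored as $3 (an assumption for the demo). */
data have;
  length _400201 $ 3;
  _400201 = '101';
run;

data want;
  length _400201 $ 6;          /* placed before SET so this length is used  */
  set have;
  length part3 $ 2;
  part3 = substr(_400201, 5);  /* position 5 now exists and holds blanks    */
run;
```

Lengthening the variable only pads the value with trailing blanks, so a value such as 101 is still split into 10 and 1.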
@Astounding I was hoping I could follow up on this solution. While the syntax works great, I realized that there is one problem: the value for "Refused to respond" is 101. This string gets split into Var1=10, Var2=1, Var3="". This causes problems b/c the values of Var1 and Var2 are valid values. Any suggestions on how to handle this?
I thought about adding an array that changes a 101 in the parent variable to "" in the children variables, but I don't really know how to do this. Here's my best guess, but it doesn't work:
ARRAY replace_101(*) $ _400705 _400706 _400707 _400708 _400801 _400802 _400803;
ARRAY part1_3 (*) $
_400705_part1 - _400705_part3
_400706_part1 - _400706_part3
_400707_part1 - _400707_part3
_400708_part1 - _400708_part3
_400801_part1 - _400801_part3
_400802_part1 - _400802_part3
_400803_part1 - _400803_part3
DO i=1 to dim(replace_101);
IF replace_101(i)="101" THEN part1_3(i)="";
END;
DROP i;
RUN;
If you had the data staring you in the face, and you had to solve it using pencil and paper, what would the solution look like?
I have a solution that doesn't use an array, but I would need to repeat it 54 times!
IF _400803="101" THEN DO; _400803_part1=""; _400803_part2=""; END;
I don't know how to incorporate this logic into an array that already has a DO loop. A DO loop within a DO loop? How do I include more than one action after the THEN statement?
- Grasshopper
You just need to assemble the pieces properly. Here's a variation:
do i=1 to 44;
   if original{i}='101' then do;
      * Do something here?;
   end;
   else do j=1 to 3;
      new{i,j} = substr(original{i}, 2*j-1);
   end;
end;
Actually I'm not sure what you want to have happen for the "101" values. The "part1" and "part2" variables already have missing values so you don't need to set them to missing. But when you have an idea of what should happen for "101", you just need to place it at the right spot.
<The "part1" and "part2" variables already have missing values so you don't need to set them to missing.>
The array that you offered in the original post parses a character string into 3 substrings of 2 characters each, assuming there are 6 characters in the string. For example, _400803=010101 becomes _400803_part1=01, _400803_part2=01, _400803_part3=01.
If the value of the original variable is 101 (i.e. Refused to answer), then _400803_part1=10, _400803_part2=1, _400803_part3="".
<I'm not sure what you want to have happen for the "101" values. >
I’m trying to set the value of _400803_part1 - _400803_part3 to missing if the value of _400803=101. I was attempting to just create a new array that followed your original array in the DATA step. You seem to be suggesting that I can simply plug the logic into the existing array, correct?
The Do loop of the original array is currently as follows:
DO i=1 to 54;
DO j=1 to 3;
new{i,j} = substr(original{i}, 2*j-1);
END;
END;
DROP i j;
When I adapt it as follows I get a subscript error (i.e. Too few array subscripts specified for array new.).
DO i=1 to 54;
if original{i}='101' then do;
new {i}="";
end;
DO j=1 to 3;
new{i,j} = substr(original{i}, 2*j-1);
END;
END;
new {i,3}=""; and new {54,3} do not produce errors, but they do not solve the problem.
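One possible way to assemble the pieces is sketched below. It is not taken from the thread's accepted answer. It rests on two observations: NEW is a two-dimensional array, so every reference needs both subscripts, and the splitting loop has to sit in an ELSE branch so it does not overwrite the blanks. The two-variable arrays and the demo HAVE data set are illustrative stand-ins for the real 54 variables.

```sas
/* Illustrative input: one normal value and one "Refused" value. */
data have;
  length _400802 _400803 $ 6;
  _400802 = '010203';
  _400803 = '101';
run;

data want;
  set have;
  array original {2} _400802 _400803;      /* subset for the demo */
  array new {2,3} $ 2
    _400802_part1-_400802_part3
    _400803_part1-_400803_part3;

  do i = 1 to dim(original);
    if original{i} = '101' then do j = 1 to 3;
      new{i,j} = '';                       /* blank all three parts */
    end;
    else do j = 1 to 3;
      new{i,j} = substr(original{i}, 2*j-1, 2);
    end;
  end;
  drop i j;
run;
```

The iterative DO after THEN answers the earlier question about performing more than one action: everything between DO and END runs when the condition is true.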