I am using an array to extract substrings from a character string. In 4 of the 5 datasets, the string begins w/ a zero (e.g. 010101). In the other dataset, there is no leading zero (e.g. 10101). This causes problems for my array.
How can I add a zero to the beginning of the character string for multiple variables? Another array no doubt...
Thank you!
DATA want_1;
INPUT var1 $ var2 $;
DATALINES;
10101 0
10101 100000
0 10100
10101 0
100 101
;
RUN;
data temp;
set want_1;
array v{*} $ var1-var2 ;
length temp $ 6 vname $ 40 value $ 2 kname $ 40;
n+1;
do i=1 to dim(v);
temp=v{i};
if temp not in ('100' '101') then temp=translate(right(temp),'0',' ');
else if temp='100' then temp='0'||temp; /*<----*/
else if temp='101' then call missing(temp);
vname=vname(v{i});
k=0;
do j=1 to 6 by 2;
k+1;
kname=cats('Part',k);
value=substr(temp,j,2);
output;
end;
end;
drop i j ;
run;
proc transpose data=temp out=want(drop=_name_) delimiter=_;
by n ;
var value;
id vname kname;
run;
data want;
merge want_1 want;
run;
DATA want_1;
INPUT var1 $ var2 $;
DATALINES;
10101 0
10101 100000
0 10100
10101 0
;
RUN;
DATA want_2;
SET want_1;
ARRAY original {2} Var1 Var2;
ARRAY new {2,3} $ 2 var1_part1 - var1_part3 var2_part1 - var2_part3;
DO i=1 to 2;
DO j=1 to 3;
new{i,j} = substr(original{i}, 2*j-1);
END;
END;
DROP i j;
RUN;
The character string is supposed to be a maximum of 6 characters. The possible values are: 01 or 00 for each two digit pair. So, 010100 would be parsed into 3 new variables. Var1=01, Var2=01, Var3=00.
In this one dataset, there is no leading zero, so the maximum string is 5 characters long. If the array substrings the first 2 characters, it will take 10, instead of 01.
Thanks for your help @FriedEgg
length str $6;
DO i=1 to 2;
str = put(input(original{i}, best.), z6.);
DO j=1 to 3;
new{i,j} = substr(str, 2*j-1);
END;
END;
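For anyone who wants to run the fragment above end to end, here is a minimal self-contained sketch. The sample data and the `parts` array name are assumptions for illustration, and it assumes every value contains only digits (a blank or non-numeric value would not round-trip cleanly through the numeric conversion).

```sas
/* Hypothetical sample data: strings that lost their leading zeros */
data have;
    input var1 $ var2 $;
    datalines;
10101 0
10101 100000
0 10100
;
run;

data want;
    set have;
    array original {2} var1 var2;
    array parts {2,3} $ 2 var1_part1-var1_part3 var2_part1-var2_part3;
    length str $ 6;
    do i = 1 to 2;
        /* Read the digits as a number, then write it back zero-padded to width 6 */
        str = put(input(original{i}, 6.), z6.);
        do j = 1 to 3;
            /* Take two characters starting at positions 1, 3 and 5 */
            parts{i,j} = substr(str, 2*j - 1, 2);
        end;
    end;
    drop i j str;
run;
```

With this data, `10101` becomes `010101` and splits into `01`, `01`, `01`; `100000` splits into `10`, `00`, `00`.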
DATA have;
INPUT var $ @@;
DATALINES;
10101 0
10101 100000
0 10100
10101 0
;
RUN;
DATA want;
SET have;
ARRAY xfr[3] $2 var_part1 - var_part3;
CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(var) ) , LENGTHN(var) );
RUN;
@FriedEgg Thanks for the assistance. A quick clarifying question:
<INPUT var $ @@;>
I'm not familiar with this convention. What do the "@" symbols represent?
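For reference: a trailing `@@` on an INPUT statement holds the current input line across iterations of the DATA step, so each value on a line is read into its own observation instead of one observation per line. A minimal sketch (the dataset name is illustrative):

```sas
/* Each value on a data line becomes a separate observation */
data one_value_per_obs;
    input var $ @@;
    datalines;
10101 0
10101 100000
;
run;

proc print data=one_value_per_obs;
run;
```

This produces four observations (`10101`, `0`, `10101`, `100000`) rather than two.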
Also, I'm using SAS University Edition. I received the following error when I ran this code.
DATA a2_pershealth_destring_qwb_2;
SET a2_pershealth_replace_nulls;
ARRAY xfr[3] $2 _400803_part1 - _400803_part3;
CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(_400803) ) , LENGTHN(_400803) );
RUN;
1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
55
56 DATA a2_pershealth_destring_qwb_2;
57 SET a2_pershealth_replace_nulls;
58
59 ARRAY xfr[3] $2 _400803_part1-_400803_part3;
60
61 CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
________ ________
251 68
ERROR: The function POKELONG cannot be invoked when SAS is in the lockdown state.
ERROR: The function ADDRLONG cannot be invoked when SAS is in the lockdown state.
ERROR 251-185: The subroutine POKELONG is unknown, or cannot be accessed. Check your spelling.
Either it was not found in the path(s) of executable images, or there was incorrect or missing subroutine descriptor
information.
ERROR 68-185: The function ADDRLONG is unknown, or cannot be accessed.
62
63 CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(_400803) ) , LENGTHN(_400803) );
________ ________
251 68
ERROR: The function POKELONG cannot be invoked when SAS is in the lockdown state.
ERROR: The function ADDRLONG cannot be invoked when SAS is in the lockdown state.
ERROR 251-185: The subroutine POKELONG is unknown, or cannot be accessed. Check your spelling.
Either it was not found in the path(s) of executable images, or there was incorrect or missing subroutine descriptor
information.
ERROR 68-185: The function ADDRLONG is unknown, or cannot be accessed.
64 RUN;
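The log shows that POKELONG and ADDRLONG are blocked while SAS is in the lockdown state. The same idea (fill with zeros, then overlay the value at the right offset) can be sketched with ordinary functions only. This is a rough sketch, assuming each value is 1 to 6 characters long; `padded`, `n` and `k` are illustrative names, and the sample data is copied from the earlier post.

```sas
data have;
    input var $ @@;
    datalines;
10101 0
10101 100000
;
run;

data want;
    set have;
    array xfr[3] $ 2 var_part1-var_part3;
    length padded $ 6;
    n = lengthn(var);
    /* Start from six zeros */
    padded = '000000';
    /* Overlay the value so that it ends at position 6 */
    if 1 <= n <= 6 then substr(padded, 7 - n, n) = var;
    /* Split the padded string into three two-character pieces */
    do k = 1 to 3;
        xfr[k] = substr(padded, 2*k - 1, 2);
    end;
    drop n k padded;
run;
```

The guard on `n` leaves `padded` as all zeros when the value is blank, which may or may not be what you want for missing data.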
Just a guess :
data substrings;
/* ... */
set ds1 ds2 ds3(in=special) ds4 ds5;
if special then theString = cats("0", theString);
/* ... Extract substrings */
(untested)
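A runnable sketch of the same IN= idea, with two hypothetical input datasets standing in for the real ones. It assumes exactly one leading zero is missing; values such as `0` or `100` from the sample data in this thread would need padding to the full width instead.

```sas
/* Hypothetical dataset whose strings already have the leading zero */
data ds1;
    length theString $ 6;
    input theString $;
    datalines;
010101
000100
;
run;

/* Hypothetical dataset where the leading zero was dropped */
data ds3;
    length theString $ 6;
    input theString $;
    datalines;
10101
10100
;
run;

data substrings;
    /* special is 1 only for observations read from ds3 */
    set ds1 ds3(in=special);
    if special then theString = cats("0", theString);
run;
```

Here `10101` from `ds3` becomes `010101`, while the rows from `ds1` are left unchanged.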
Is this what you are looking for?
DATA want_1;
INPUT var1 $ var2 $;
DATALINES;
10101 0
10101 100000
0 10100
10101 0
;
RUN;
data want;
set want_1;
array v{*} $ var: ;
length temp $ 6;
do i=1 to dim(v);
temp=v{i};
temp=translate(right(temp),'0',' ');
v{i}=temp;
end;
drop temp i;
run;
@Ksharp Thanks for this. It does the trick, to some degree.
A value of 10100 is converted to 01010 (i.e. 01 01 0).
How could I adjust this code to produce a 6 digit string? Ideally, 10100 would be converted to 010100.
Also, a value of 10101 is converted to 01010 (i.e. 01 01 0), when it should be 01 01 01.
Thanks again.
Yes. Change the length to LENGTH temp $ 6; if you want a 6-digit string.
@Ksharp The length already set to 6, no?
The current syntax is changing 10101 to 01010, which then becomes _400802_part1=01, _400802_part2=01, _400802_part3=0. _400802_part3 should equal 01.
Here's the data before the syntax:
PatientID | _400802 | _400802_part1 | _400802_part2 | _400802_part3 |
---|---|---|---|---|
140 | 10101 | 10 | 10 | 1 |
Here's the data after the syntax:
PatientID | _400802 | _400802_part1 | _400802_part2 | _400802_part3 |
---|---|---|---|---|
140 | 01010 | 01 | 01 | 0 |
I'm sorry I don't understand your syntax well enough to troubleshoot this myself! Thanks again for your help.
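One possible explanation, assuming `_400802` is stored with a length of 5 in this dataset (PROC CONTENTS would confirm this; it is not shown in the thread): `temp` does hold the six-character value `010101`, but assigning it back with `v{i}=temp` truncates it to the variable's five bytes, giving `01010`. A sketch that keeps the padded value in a new 6-byte variable instead (`padded` and `k` are illustrative names, and the sample data is made up to match the table above):

```sas
/* Hypothetical data: the source variable is only 5 bytes wide */
data have;
    length _400802 $ 5;
    input _400802 $;
    datalines;
10101
10100
0
;
run;

data want;
    set have;
    length padded $ 6;
    array part{3} $ 2 _400802_part1-_400802_part3;
    /* Copy into the 6-byte variable first so RIGHT() works on all six positions */
    padded = _400802;
    /* Right-align, then turn the leading blanks into zeros */
    padded = translate(right(padded), '0', ' ');
    do k = 1 to 3;
        part{k} = substr(padded, 2*k - 1, 2);
    end;
    drop k;
run;
```

With this data, `10101` gives `padded` = `010101` and parts `01`, `01`, `01`; `10100` gives `010100` and parts `01`, `01`, `00`. Note that a blank value would come out as `000000`.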