_maldini_
Barite | Level 11

I am using an array to extract substrings from a character string. In 4 of the 5 datasets, the string begins with a zero (e.g. 010101). In the other dataset, there is no leading zero (e.g. 10101). This causes problems for my array.

 

How can I add a zero to the beginning of the character string for multiple variables? Another array no doubt...

 

Thank you!

Accepted Solutions
Ksharp
Super User
OK. Here it is.
DATA want_1;
   INPUT var1 $ var2 $;
   DATALINES;
   10101      0 
   10101     100000 
   0         10100 
   10101      0
   100   101
;
RUN;
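data temp;
 set want_1;
 array v{*} $ var1-var2 ;
 length temp $ 6 vname $ 40 value $ 2 kname $ 40;
 n+1;   /* row counter, used as the BY variable in PROC TRANSPOSE */
 do i=1 to dim(v);
  temp=v{i};
  /* right-justify within 6 characters, then turn the leading blanks into zeros */
  if temp not in ('100' '101') then temp=translate(right(temp),'0',' ');
   else if temp='100' then temp='0'||temp; /*<----*/
     else if temp='101' then call missing(temp);
  vname=vname(v{i});
  k=0;
  /* split the 6-character string into three 2-character parts, one row per part */
  do j=1 to 6 by 2;
    k+1;
    kname=cats('Part',k);
    value=substr(temp,j,2);
    output;
   end;
 end;
drop i j ;
run;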
proc transpose data=temp out=want(drop=_name_) delimiter=_;
 by n ;
 var value;
 id vname kname;
run;
data want;
 merge want_1 want;
run;


21 Replies
FriedEgg
SAS Employee
Please provide a clearer picture of your problem by sharing the 'array' code you are having issues with, along with a set of sample data.
_maldini_
Barite | Level 11
DATA want_1;
   INPUT var1 $ var2 $;
   DATALINES;
   10101      0 
   10101     100000 
   0         10100 
   10101      0
;
RUN;
DATA want_2;
	SET want_1;

	ARRAY original {2}
	Var1 Var2;

	ARRAY new {2,3} $ 2
	var1_part1	-  var1_part3
	var2_part1	-  var2_part3;

	DO i=1 to 2;
		DO j=1 to 3;
			new{i,j} = substr(original{i}, 2*j-1);
		END;
	END;

	DROP
	i j;
RUN;

The character string is supposed to be a maximum of 6 characters. The possible values for each two-digit pair are 01 or 00. So, 010100 would be parsed into 3 new variables: Var1=01, Var2=01, Var3=00.

 

In this one dataset, there is no leading zero, so the maximum string is 5 characters long. If the array substrings the first 2 characters, it will take 10, instead of 01. 

 

Thanks for your help @FriedEgg

PGStats
Opal | Level 21
length str $6;
DO i=1 to 2;
   str = put(input(original{i}, best.), z6.);
   DO j=1 to 3;
      new{i,j} = substr(str, 2*j-1);
   END;
END;
PG
FriedEgg
SAS Employee
DATA have;
   INPUT var $ @@;
   DATALINES;
   10101      0 
   10101     100000 
   0         10100 
   10101      0
;
RUN;

DATA want;
   SET have;
   
   ARRAY xfr[3] $2 var_part1 - var_part3;
   
   CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
   
   CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(var) ) , LENGTHN(var) );
RUN;
_maldini_
Barite | Level 11

@FriedEgg Thanks for the assistance. A quick clarifying question:

 

<INPUT var $ @@;>  

 

I'm not familiar with this convention. What do the "@" symbols represent?

 

Also, I'm using SAS University Edition. I received the following error when I ran this code.

 

DATA a2_pershealth_destring_qwb_2;
   SET a2_pershealth_replace_nulls;
   
   ARRAY xfr[3] $2 _400803_part1	-	_400803_part3;
   
   CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
   
   CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(_400803) ) , LENGTHN(_400803) );
RUN;

1 OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
55
56 DATA a2_pershealth_destring_qwb_2;
57 SET a2_pershealth_replace_nulls;
58
59 ARRAY xfr[3] $2 _400803_part1-_400803_part3;
60
61 CALL POKELONG( REPEAT('0',5) , ADDRLONG(xfr[1]) , 6 );
________ ________
251 68
ERROR: The function POKELONG cannot be invoked when SAS is in the lockdown state.
ERROR: The function ADDRLONG cannot be invoked when SAS is in the lockdown state.
ERROR 251-185: The subroutine POKELONG is unknown, or cannot be accessed. Check your spelling.
Either it was not found in the path(s) of executable images, or there was incorrect or missing subroutine descriptor
information.

ERROR 68-185: The function ADDRLONG is unknown, or cannot be accessed.

62
63 CALL POKELONG( var , PTRLONGADD( ADDRLONG(xfr[1]) , 6 - LENGTHN(_400803) ) , LENGTHN(_400803) );
________ ________
251 68
ERROR: The function POKELONG cannot be invoked when SAS is in the lockdown state.
ERROR: The function ADDRLONG cannot be invoked when SAS is in the lockdown state.
ERROR 251-185: The subroutine POKELONG is unknown, or cannot be accessed. Check your spelling.
Either it was not found in the path(s) of executable images, or there was incorrect or missing subroutine descriptor
information.

ERROR 68-185: The function ADDRLONG is unknown, or cannot be accessed.

64 RUN;

FriedEgg
SAS Employee
@_maldini_, My solution will not work with SAS University Edition. As the ERROR message states, this product ships in a 'lockdown state' which disables the usage of the APP functions I have used.
FriedEgg
SAS Employee
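The double trailing @ (@@) tells SAS to hold the current input line across iterations of the DATA step, so the INPUT statement keeps reading values from the same line until it runs out. It is stated more clearly in the INPUT statement documentation.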

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000146292.htm
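
A minimal sketch of the difference, using two of the sample lines from this thread. The dataset names one_per_line and all_values are made up for illustration.

```sas
/* Without @@: each iteration of the DATA step moves to a new line, */
/* so only the first value on each line is read (2 observations).   */
data one_per_line;
   input var $;
   datalines;
10101 0
10101 100000
;
run;

/* With @@: the line is held across iterations, so every value on  */
/* each line becomes its own observation (4 observations).         */
data all_values;
   input var $ @@;
   datalines;
10101 0
10101 100000
;
run;
```

The first step keeps 10101 and 10101; the second keeps 10101, 0, 10101 and 100000.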
PGStats
Opal | Level 21

Just a guess :

 

data substrings;
/* ... */
set ds1 ds2 ds3(in=special) ds4 ds5;
if special then theString = cats("0", theString);
/* ... Extract substrings */

(untested)

PG
Ksharp
Super User

Is this what you are looking for?

 

DATA want_1;
   INPUT var1 $ var2 $;
   DATALINES;
   10101      0 
   10101     100000 
   0         10100 
   10101      0
;
RUN;
data want;
 set want_1;
 array v{*} $ var: ;
 length temp $ 6;
 do i=1 to dim(v);
  temp=v{i};
  temp=translate(right(temp),'0',' ');
  v{i}=temp;
 end;
 drop temp i;
 run;
_maldini_
Barite | Level 11

@Ksharp Thanks for this. It does the trick, to some degree.

 

A value of 10100 is converted to 01010 (i.e. 01 01 0). 

 

How could I adjust this code to produce a 6 digit string? Ideally, 10100 would be converted to 010100.

 

Also, a value of 10101 is converted to 01010 (i.e. 01 01 0), when it should be 01 01 01. 

 

Thanks again.

_maldini_
Barite | Level 11
Also, I'm fairly new to SAS and arrays. What is the purpose of the LENGTH statement (length temp $ 6;) in the array?
FriedEgg
SAS Employee
The LENGTH statement gives temp a fixed width of 6, so that the RIGHT function can right-justify the shorter strings within that width. TRANSLATE then replaces the blanks at the front of the string with '0'.
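
A small sketch of those two steps in isolation, assuming a single hard-coded value rather than the array from the thread:

```sas
data _null_;
   length temp $ 6;
   temp = '10101';                           /* stored as '10101 ' (padded to 6) */
   temp = translate(right(temp), '0', ' ');  /* RIGHT gives ' 10101'; blanks become zeros */
   put temp=;                                /* writes temp=010101 to the log */
run;
```

Without the LENGTH statement, temp would take its length from the first value assigned to it (5 here), RIGHT would have no blanks to move to the front, and no zero would be added.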
Ksharp
Super User

Yes. Change the length to LENGTH temp $ 6; if you want a 6-digit string.

_maldini_
Barite | Level 11

@Ksharp The length is already set to 6, no?

The current syntax is changing 10101 to 01010, which then becomes _400802_part1=01, _400802_part2=01, _400802_part3=0. _400802_part3 should equal 01.

 

Here's the data before the syntax:

PatientID  _400802  _400802_part1  _400802_part2  _400802_part3
140        10101    10             10             1

 

Here's the data after the syntax:

PatientID  _400802  _400802_part1  _400802_part2  _400802_part3
140        01010    01             01             0

 

I'm sorry I don't understand your syntax well enough to troubleshoot this myself! Thanks again for your help.
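
One possible explanation for the 01010 result, offered as an assumption since the thread does not show the variable's attributes: if _400802 is stored with a length of 5, then temp does hold 010101, but assigning it back to the 5-character variable silently drops the last character. The sketch below reproduces that with a made-up dataset named have, then widens the variable before the SET statement.

```sas
/* Hypothetical input: _400802 stored as $5, as assumed above */
data have;
   length _400802 $ 5;
   input _400802 $;
   datalines;
10101
10100
0
;
run;

/* Reproduces the symptom: the padded value is cut when written back */
data truncated;
   set have;
   length temp $ 6;
   temp = _400802;
   temp = translate(right(temp), '0', ' ');  /* temp is '010101' for the first row */
   _400802 = temp;                           /* $5 target keeps only '01010' */
   drop temp;
run;

/* Declaring the longer length before SET widens the variable first */
data padded;
   length _400802 $ 6;
   set have;
   _400802 = translate(right(_400802), '0', ' ');
run;
```

In the padded dataset the three rows come out as 010101, 010100 and 000000. Running PROC CONTENTS on the real dataset would confirm whether the stored length is in fact 5.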

