BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sms1891
Quartz | Level 8

Hi all,

I have a variable (Size_Cyst) recorded with height, length and width dimensions in a single variable. I want to extract them as 3 separate variables (height, length and width). The problem with the original variable is it is recorded as 261x 143 (x245)  to give you all an example. 261 corresponds to height, 143 for length and 245 for width.

 

All I want to do is to create 3 separate variables. The numbers before first x as height,  numbers after first x as length, i.e the middle set of numbers. And finally the numbers after second x as width, i.e in the parenthesis without x. Each of these numbers are not fixed, in the sense they may be 3 digit or 2 digit. It varies from ID to ID. However they range only from 2 digit to 3 digit ONLY. And also there are spacing issues between the numbers and X's, it is not captured consistently (e.g:  261x 143 (x245)  198 x 97 (x 160)

 

 

If I use compress I can get the numbers but I may lose the actual size. For 261x 143 (x245) i want it to be 261, 143 and 245 but not 26 1143 and 143...

 

Any help with the SAS code for this extraction would be greatly appreciated.

 

Thank you in advance!

 

Here is the example of my data set.

 

ID     Size_Cyst

1      261x 143 (x245)

2      23x 14 (x27)

3      23 x 14 (x 27)

4      243 x 14 (x234)

5      33 x 233 (x 67)

6      89x45x (x148)

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
t=compress(translate(Size_Cyst,' ','x'),' ','kd');
height=scan(t,1);
length=scan(t,2);
width=scan(t,3);
run;

You don;t need input to convert to numeric as sas will do auto conversion when you compute arithmetic

View solution in original post

5 REPLIES 5
Reeza
Super User

Use SCAN with the X delimiter since that seems consistent.

Use COMPRESS to keep only digits

Use INPUT to convert to a numeric variable.

 

The last record seems to have an extra X, which may be problematic, but if you scan in a loop that should work still.

 


@sms1891 wrote:

Hi all,

I have a variable (Size_Cyst) recorded with height, length and width dimensions in a single variable. I want to extract them as 3 separate variables (height, length and width). The problem with the original variable is it is recorded as 261x 143 (x245)  to give you all an example. 261 corresponds to height, 143 for length and 245 for width.

 

All I want to do is to create 3 separate variables. The numbers before first x as height,  numbers after first x as length, i.e the middle set of numbers. And finally the numbers after second x as width, i.e in the parenthesis without x. Each of these numbers are not fixed, in the sense they may be 3 digit or 2 digit. It varies from ID to ID. However they range only from 2 digit to 3 digit ONLY. And also there are spacing issues between the numbers and X's, it is not captured consistently (e.g:  261x 143 (x245)  198 x 97 (x 160)

 

 

If I use compress I can get the numbers but I may lose the actual size. For 261x 143 (x245) i want it to be 261, 143 and 245 but not 26 1143 and 143...

 

Any help with the SAS code for this extraction would be greatly appreciated.

 

Thank you in advance!

 

Here is the example of my data set.

 

ID     Size_Cyst

1      261x 143 (x245)

2      23x 14 (x27)

3      23 x 14 (x 27)

4      243 x 14 (x234)

5      33 x 233 (x 67)

6      89x45x (x148)


 

sms1891
Quartz | Level 8

Sorry for the typo...last record does not have the extra X. It should be 89x45 (x148)

Astounding
PROC Star

As character variables:

 

height = scan(size_cyst, 1, 'Xx( )');

length = scan(size_cyst, 2, 'Xx( )');

width = scan(size_cyst, 3, 'Xx( )');

 

To get these variables as numeric takes a little more work.  For example:

 

height = input(scan(size_cyst, 1, 'Xx( )'), 3.);

 

Also notice that the space in the third parameter of SCAN is required.

novinosrin
Tourmaline | Level 20
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
t=compress(translate(Size_Cyst,' ','x'),' ','kd');
height=scan(t,1);
length=scan(t,2);
width=scan(t,3);
run;

You don;t need input to convert to numeric as sas will do auto conversion when you compute arithmetic

Ksharp
Super User
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
do i=1 to countw( Size_Cyst,,'kd');
  value=scan( Size_Cyst,i,,'kd'); output;
end;
drop i;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 641 views
  • 1 like
  • 5 in conversation