BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
sms1891
Quartz | Level 8

Hi all,

I have a variable (Size_Cyst) recorded with height, length and width dimensions in a single variable. I want to extract them as 3 separate variables (height, length and width). The problem with the original variable is it is recorded as 261x 143 (x245)  to give you all an example. 261 corresponds to height, 143 for length and 245 for width.

 

All I want to do is to create 3 separate variables. The numbers before first x as height,  numbers after first x as length, i.e the middle set of numbers. And finally the numbers after second x as width, i.e in the parenthesis without x. Each of these numbers are not fixed, in the sense they may be 3 digit or 2 digit. It varies from ID to ID. However they range only from 2 digit to 3 digit ONLY. And also there are spacing issues between the numbers and X's, it is not captured consistently (e.g:  261x 143 (x245)  198 x 97 (x 160)

 

 

If I use compress I can get the numbers but I may lose the actual size. For 261x 143 (x245) i want it to be 261, 143 and 245 but not 26 1143 and 143...

 

Any help with the SAS code for this extraction would be greatly appreciated.

 

Thank you in advance!

 

Here is the example of my data set.

 

ID     Size_Cyst

1      261x 143 (x245)

2      23x 14 (x27)

3      23 x 14 (x 27)

4      243 x 14 (x234)

5      33 x 233 (x 67)

6      89x45x (x148)

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
t=compress(translate(Size_Cyst,' ','x'),' ','kd');
height=scan(t,1);
length=scan(t,2);
width=scan(t,3);
run;

You don;t need input to convert to numeric as sas will do auto conversion when you compute arithmetic

View solution in original post

5 REPLIES 5
Reeza
Super User

Use SCAN with the X delimiter since that seems consistent.

Use COMPRESS to keep only digits

Use INPUT to convert to a numeric variable.

 

The last record seems to have an extra X, which may be problematic, but if you scan in a loop that should work still.

 


@sms1891 wrote:

Hi all,

I have a variable (Size_Cyst) recorded with height, length and width dimensions in a single variable. I want to extract them as 3 separate variables (height, length and width). The problem with the original variable is it is recorded as 261x 143 (x245)  to give you all an example. 261 corresponds to height, 143 for length and 245 for width.

 

All I want to do is to create 3 separate variables. The numbers before first x as height,  numbers after first x as length, i.e the middle set of numbers. And finally the numbers after second x as width, i.e in the parenthesis without x. Each of these numbers are not fixed, in the sense they may be 3 digit or 2 digit. It varies from ID to ID. However they range only from 2 digit to 3 digit ONLY. And also there are spacing issues between the numbers and X's, it is not captured consistently (e.g:  261x 143 (x245)  198 x 97 (x 160)

 

 

If I use compress I can get the numbers but I may lose the actual size. For 261x 143 (x245) i want it to be 261, 143 and 245 but not 26 1143 and 143...

 

Any help with the SAS code for this extraction would be greatly appreciated.

 

Thank you in advance!

 

Here is the example of my data set.

 

ID     Size_Cyst

1      261x 143 (x245)

2      23x 14 (x27)

3      23 x 14 (x 27)

4      243 x 14 (x234)

5      33 x 233 (x 67)

6      89x45x (x148)


 

sms1891
Quartz | Level 8

Sorry for the typo...last record does not have the extra X. It should be 89x45 (x148)

Astounding
PROC Star

As character variables:

 

height = scan(size_cyst, 1, 'Xx( )');

length = scan(size_cyst, 2, 'Xx( )');

width = scan(size_cyst, 3, 'Xx( )');

 

To get these variables as numeric takes a little more work.  For example:

 

height = input(scan(size_cyst, 1, 'Xx( )'), 3.);

 

Also notice that the space in the third parameter of SCAN is required.

novinosrin
Tourmaline | Level 20
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
t=compress(translate(Size_Cyst,' ','x'),' ','kd');
height=scan(t,1);
length=scan(t,2);
width=scan(t,3);
run;

You don;t need input to convert to numeric as sas will do auto conversion when you compute arithmetic

Ksharp
Super User
data have;
input ID     Size_Cyst & $20.;
datalines;
1      261x 143 (x245)
2      23x 14 (x27)
3      23 x 14 (x 27)
4      243 x 14 (x234)
5      33 x 233 (x 67)
6      89x45x (x148)
;

data want;
set have;
do i=1 to countw( Size_Cyst,,'kd');
  value=scan( Size_Cyst,i,,'kd'); output;
end;
drop i;
run;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 686 views
  • 1 like
  • 5 in conversation