das
Obsidian | Level 7

Looking for help with data step coding.

I've imported a very large (~22K rows) dataset output by another program as a tab delimited file. I have no trouble importing the file which consists of a class variable and several measurement variables. Unfortunately, the values of the class variable are compound and need to be parsed into two or three new variables. Here is the format and an example:

compound class variable name = "Label"

example: "Stack:B-r2-t5"

  1. "Stack:" is irrelevant and no need to keep
  2. "B" is shorthand for the local thresholding method used (Bernsen)
  3. "r2" is the (r)adius variable, a parameter used by the Bernsen method and in this example has a value of 2
  4. "t5" is the (t)hreshold variable, a second parameter used by the Bernsen method and in this example has a value of 5

I don't need the method variable since I'm only looking at the Bernsen method right now.

I do need to create two new variables, call them "radius" and "threshold", which respectively take on the numeric values following 'r' and 't' in the compound value of "Label".

Thank you,

Dave

Accepted Solutions
Reeza
Super User

It looks like you're trying to parse text fields out.

Some useful functions:

Scan will separate the label into its four component parts, i.e. Stack / B / r2 / t5.

Substr can extract the numeric values from r2/t5.

Untested:

radius=substr(scan(label, 3, ':-'), 2,1);

threshold=substr(scan(label, 4, ':-'), 2,1);
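To check the SCAN step on its own, here is a minimal sketch that writes each piece of the example label to the log. It assumes the label always uses ':' and '-' as separators, as in "Stack:B-r2-t5". The variable names piece and i are only for illustration.

```sas
data _null_;
    length label $ 20 piece $ 8;
    label = "Stack:B-r2-t5";
    /* Split on ':' or '-' and write each piece to the log */
    do i = 1 to countw(label, ':-');
        piece = scan(label, i, ':-');
        put i= piece=;
    end;
run;
```

With that label, the third and fourth pieces should be r2 and t5, which are the ones SUBSTR then works on.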

das
Obsidian | Level 7

Reeza,

That's great and I'm sure it'll work in the end. At the moment having a little problem with length of the numbers extracted. Here is a screen clip of a problem area where you'll see radius=20 and threshold=25 are clipped to 2 and 2:

[Screenshot: Capture.PNG]

Here is a copy of my current code:

data bernsen ;
     set bernsen_import ;
     radius=substr(scan(label, 3, ':-'), 2,1);
     threshold=substr(scan(label, 4, ':-'), 2,1);
run;

I'm looking at the SAS Help on these functions but I'm not there yet, so I thought I'd put it back out there.

Dave

das
Obsidian | Level 7

OK, think I got it. Here is a screen capture of the trouble area:

[Screenshot: Capture.PNG]

And here is the code that produces it:

data bernsen ;
     set bernsen_import ;
     radius=substr(scan(label, 3, ':-'), 2 );
     threshold=substr(scan(label, 4, ':-'), 2 );
run;

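I finally understood that the last argument of the SUBSTR function sets the length of the extracted substring and that it does not have to be specified. With it omitted, SUBSTR returns everything from the start position to the end of the string, so removing it fixed the problem.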
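One thing to note: SUBSTR returns character values, so radius and threshold as created above are character variables. If numeric variables are needed, as the original question asked, one possible approach is to wrap the result in INPUT. This is an untested sketch. It assumes every label follows the "Stack:B-r2-t5" pattern and that the numbers fit within the 8. informat width chosen here.

```sas
data bernsen;
    set bernsen_import;
    /* Drop the leading letter, then read the remaining digits as a number */
    radius    = input(substr(scan(label, 3, ':-'), 2), 8.);
    threshold = input(substr(scan(label, 4, ':-'), 2), 8.);
run;
```

If some labels do not match the pattern, INPUT will write invalid-data notes to the log and set the value to missing, so it is worth checking the log after the first run.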

Thank you so much for your amazingly fast help. I'll remember those useful functions because I do this all the time.

Dave
