BookmarkSubscribeRSS Feed
learner_sas
Quartz | Level 8

Hello, I have a very long varying character field named comment. I want break that field into 7 fields.

and I use

comment_one = substr(comment,1,240);

    comment_extraone = substr(comment,241,480);

    comment_extratwo = substr(comment,481,720);

   comment_extrathree = substr(comment,721,960);

   comment_extrafour = substr(comment,961,1200);

   comment_extrafive=substr(comment,1201,1440);

   comment_extrasix = substr(comment,1441,1680);

but extracted substring looked like "I went to football game ..............then I reached ho"

since the length of comment_one is 240 it showed "ho" as last word but it is actually "home".

I am wondering if there is a way to fix this. Index function does not work since each comment

is of different length.

Thanks

5 REPLIES 5
Fugue
Quartz | Level 8

Is there a reason why each extracted comment has to be a specified number of characters? Unless your original comment field is neatly separated by number of characters (e.g. the first sub-comment always ends at 240, the second always ends at 480, etc), the results are going to be less than desireable -- as it seems you've discovered.

If you are trying to break a large comment field into smaller fields, it might make more sense to break it up based on natural sentence endings (e.g. period, question mark, exclamation mark).

esjackso
Quartz | Level 8

So would it be fair to say you want to split the comment by punctuation? Your could do something like the following:

data want;

  length comment_1 - comment_7 $ 100;

  set have;

  comment_1 = scan(comment,1,".","RP");

  comment_2 = scan(comment,2,".","RP");

  comment_3 = scan(comment,3,".","RP");

  comment_4 = scan(comment,4,".","RP");

  comment_5 = scan(comment,5,".","RP");

  comment_6 = scan(comment,6,".","RP");

  comment_7 = scan(comment,7,".","RP");

run;

Code could be condensed with an arrary and do loop but I left it alone as to not confuse matters. The R and P are optional modifiers that trim leading and trailing spaces from the result (R) and use all punctuation as delimiters (P).

EJ

learner_sas
Quartz | Level 8

I am transferring this extracted sub strings into Crystal cell and it has 256 as maximum character length. I am trying to fit as much as possible.

Astounding
PROC Star

This reminds me of the good old days before PROC REPORT performed all the text-wrapping for you.  Here's an approach.

array extra {6} $240 comment_extra1 - comment_extra6;

dum = comment;

drop dum;

_i_=0;

if dum > ' ' then do until (dum=' ');
   do _k_=241 to 1 by -1;

      if substr(dum, _k_, 1) = ' ' then do;

         _i_ + 1;

         extra{_i_} = substr(dum, 1, _k_-1);

         dum = left(substr(dum, k));

      end;

   end;

end;

It's up to you to make sure that 6 variables is enough.  Sometimes an individual word can be quite lengthy.  In one medical dictionary that I examined, several words were each over 40 characters long.

Good luck.

TomKari
Onyx | Level 15

Here's another approach. It hasn't been exhaustively tested.

Tom

data have;
input;
LongDataRecord = _infile_;
cards;
Long text string 1
Long text string 2
Long text string 3
Long text string 4
run;

data want(keep=ShortDataRecord);
length ShortDataRecord $256;
retain prxid;

if _n_=1 then
  prxid=prxparse('/\s+\S*$/o');
set have;
length TestPart $257;

do while(^missing(LongDataRecord));
  TestPart = substr(LongDataRecord, 1, 257);
  FoundPosn = prxmatch(prxid, TestPart);

  if FoundPosn > 0 then
   do;
    LastChar = FoundPosn - 1;
    FirstChar = FoundPosn + 1;
    ShortDataRecord = substr(LongDataRecord, 1, LastChar);
    LongDataRecord = left(substrn(LongDataRecord, FirstChar));
    output;
   end;
  else
   do;
    ShortDataRecord = substr(LongDataRecord, 1, 256);
    LongDataRecord = left(substrn(257, 1));
    output;
   end;
end;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Creating Custom Steps in SAS Studio

Check out this tutorial series to learn how to build your own steps in SAS Studio.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1724 views
  • 0 likes
  • 5 in conversation