BookmarkSubscribeRSS Feed
learner_sas
Quartz | Level 8

Hello, I have a very long varying character field named comment. I want break that field into 7 fields.

and I use

comment_one = substr(comment,1,240);

    comment_extraone = substr(comment,241,480);

    comment_extratwo = substr(comment,481,720);

   comment_extrathree = substr(comment,721,960);

   comment_extrafour = substr(comment,961,1200);

   comment_extrafive=substr(comment,1201,1440);

   comment_extrasix = substr(comment,1441,1680);

but extracted substring looked like "I went to football game ..............then I reached ho"

since the length of comment_one is 240 it showed "ho" as last word but it is actually "home".

I am wondering if there is a way to fix this. Index function does not work since each comment

is of different length.

Thanks

5 REPLIES 5
Fugue
Quartz | Level 8

Is there a reason why each extracted comment has to be a specified number of characters? Unless your original comment field is neatly separated by number of characters (e.g. the first sub-comment always ends at 240, the second always ends at 480, etc), the results are going to be less than desireable -- as it seems you've discovered.

If you are trying to break a large comment field into smaller fields, it might make more sense to break it up based on natural sentence endings (e.g. period, question mark, exclamation mark).

esjackso
Quartz | Level 8

So would it be fair to say you want to split the comment by punctuation? Your could do something like the following:

data want;

  length comment_1 - comment_7 $ 100;

  set have;

  comment_1 = scan(comment,1,".","RP");

  comment_2 = scan(comment,2,".","RP");

  comment_3 = scan(comment,3,".","RP");

  comment_4 = scan(comment,4,".","RP");

  comment_5 = scan(comment,5,".","RP");

  comment_6 = scan(comment,6,".","RP");

  comment_7 = scan(comment,7,".","RP");

run;

Code could be condensed with an arrary and do loop but I left it alone as to not confuse matters. The R and P are optional modifiers that trim leading and trailing spaces from the result (R) and use all punctuation as delimiters (P).

EJ

learner_sas
Quartz | Level 8

I am transferring this extracted sub strings into Crystal cell and it has 256 as maximum character length. I am trying to fit as much as possible.

Astounding
PROC Star

This reminds me of the good old days before PROC REPORT performed all the text-wrapping for you.  Here's an approach.

array extra {6} $240 comment_extra1 - comment_extra6;

dum = comment;

drop dum;

_i_=0;

if dum > ' ' then do until (dum=' ');
   do _k_=241 to 1 by -1;

      if substr(dum, _k_, 1) = ' ' then do;

         _i_ + 1;

         extra{_i_} = substr(dum, 1, _k_-1);

         dum = left(substr(dum, k));

      end;

   end;

end;

It's up to you to make sure that 6 variables is enough.  Sometimes an individual word can be quite lengthy.  In one medical dictionary that I examined, several words were each over 40 characters long.

Good luck.

TomKari
Onyx | Level 15

Here's another approach. It hasn't been exhaustively tested.

Tom

data have;
input;
LongDataRecord = _infile_;
cards;
Long text string 1
Long text string 2
Long text string 3
Long text string 4
run;

data want(keep=ShortDataRecord);
length ShortDataRecord $256;
retain prxid;

if _n_=1 then
  prxid=prxparse('/\s+\S*$/o');
set have;
length TestPart $257;

do while(^missing(LongDataRecord));
  TestPart = substr(LongDataRecord, 1, 257);
  FoundPosn = prxmatch(prxid, TestPart);

  if FoundPosn > 0 then
   do;
    LastChar = FoundPosn - 1;
    FirstChar = FoundPosn + 1;
    ShortDataRecord = substr(LongDataRecord, 1, LastChar);
    LongDataRecord = left(substrn(LongDataRecord, FirstChar));
    output;
   end;
  else
   do;
    ShortDataRecord = substr(LongDataRecord, 1, 256);
    LongDataRecord = left(substrn(257, 1));
    output;
   end;
end;
run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 933 views
  • 0 likes
  • 5 in conversation