BookmarkSubscribeRSS Feed
sasjourney
Calcite | Level 5

Hi All,

Can anyone help me with this.

How can I split a string in a row to multiple rows, say there is variable called string='The Power to Know'. I was able to split it based on the delimiter using 'DO UNTIL' and 'SCAN' and the output would be something like this.

String

The

Power

to

Know

But am looking for the output some thing like

String

The Power

to Know

Thanks in advance

14 REPLIES 14
Cynthia_sas
SAS Super FREQ

Hi:

So, my question is WHERE do you want to split the string. Let's say you do have a variable called STRING whose value is "The Power to Know" -- where do you want to split it? A dataset has no concept of a split line. A split or a line feed or a carriage return usually belong to a cell in a report. So I could see how you might want to display, in a report, like an RTF file or an HTML file or a PDF file, the single string "The Power to Know" on 4 lines of the report or on 2 lines of the report. But I don't understand what the point is of "splitting the string" into rows. Does that mean that if your original dataset is:

obs var1   var2 string

1     xxx    yyy  The Power to Know

that you want to create a new dataset with 2 rows???

obs var1 var2 string

1     xxx  yyy  The Power

2     xxx  yyy  to Know

Or, in an output report, such as from PROC PRINT or PROC REPORT, do you want to see the string "The Power to Know" displayed on 2 lines of the output report instead of 1 line?

cynthia

sasjourney
Calcite | Level 5

Hi Cynthia,

     Thanks for your time. I was able to do it for the display purpose using Proc Report or ODS but was wondering whether it can be done in dataset and yes, I was looking for something like you said.

obs var1   var2 string

1     xxx    yyy  The Power to Know

that you want to create a new dataset with 2 rows???

obs var1 var2 string

1     xxx  yyy  The Power

2     xxx  yyy  to Know

Thanks

sasjourney

LinusH
Tourmaline | Level 20

Back to Cynthia's question: where do yo want to split the line?

You need some kind of rule. Is the rule to split after two words in the string?

If so, you can probably achieve what you want to play a bit further with scan and until...

/Linus

Data never sleeps
sasjourney
Calcite | Level 5

Hi

Yes, in this example I want to split it after two words.

Thanks

sasjourney

art297
Opal | Level 21

Are you just asking how you would do something like the following?:

data have;

  informat string $80.;

  input string &;

  cards;

The power to know

The power to know something

The power to know how to do something else

;

data want (keep=want_string);

  set have;

  length want_string $50;

  counter=0;

  nomore=0;

  do until (nomore);

    counter+1;

    if scan(string,counter) eq "" then nomore=1;

     else want_string=

      catx(' ',want_string,scan(string,counter));

    if mod(counter,2) eq 0 then do;

      output;

      call missing(want_string);

    end;

  end;

run;

sasjourney
Calcite | Level 5

There you go. It worked perfectly for me, Thank you. 

Also, is there anyway that it can de done based on the length of the column. Say it is need to be split at a length of 11 in "The Power to Know how to do something else"

the first 11 characters would be "The Power t" but i don't want that to happen because that would split the word "to". instead I want to check the length and split it till there if it not splitting the word  and split till the previous delimiter if it is splitting the word.

Thanks again

Cynthia_sas
SAS Super FREQ

Hi:

  I have to chime in again and say this sounds like a reporting issue and text wrapping on a report rather than a dataset issue. Or maybe I'm just dense and don't get the point. I don't understand how, in a DATASET, it would be useful to break a single obs into two obs so one observation has 11 characters of a text string and then a second observation in a DATASET has the rest of the text string.

  

  If you use ODS to create your report output, then ODS will take care of wrapping and won't break a word in the middle of two letters unless there are no spaces in the text string. Will ODS split after EXACTLY 11 characters??? Not automatically. Can you make ODS split in a report table after EXACTLY 11 characters, yes, you can.

 

  You would use a program similar to Art's program, but would insert a "line feed" or "carriage return" character into the text string in order to influence where ODS broke the text string.

  You said that you got the results you wanted in your report. So, what purpose is served by splitting your observations in a DATASET? Was the line breaking or line wrapping in the report not to your liking? Did you use the CELLWIDTH or WIDTH style attribute to control the wide of the cell? (Which influences how much text can wrap) What is your destination of interest? Is this really a DATASET issue or a REPORT issue?

  For example, in the report below, see the difference in wrapping based just on using the CELLWIDTH style attribute and then compare that to the use wrapping in the cell when CELLWIDTH is used with ODS ESCAPECHAR newline capability.

cynthia

data testit;

  length charvar1 charvar2 charvar3 $200 charvar4 $600;

  set sashelp.class;

  where age gt 14;

  charvar1 = 'Twas brillig and the slithy toves did gyre and gimble in the wabe.';

  charvar2 = 'All mimsy were the borogroves and the mome raths outgrabe.';

  charvar3 = 'Beware the Jabberwock, my son!~nThe jaws that bite, the claws that snatch.';

  charvar4 = catx('~n',charvar1,charvar2,charvar3);

  output;

run;

     

options orientation=landscape topmargin=.25in bottommargin=.25in

        leftmargin=.25in rightmargin=.25in;

               

ods listing close;

ods html file='c:\temp\wrap.html' style=sasweb;

ods rtf file='c:\temp\wrap.rtf';

ods pdf file='c:\temp\wrap.pdf';

ods escapechar='~';

** note that charvar3 and charvar4 have 

   explicit line feed characters

   using ODS ESCAPECHAR+n or ~n

   for RTF, PDF and HTML.

   In 9.3 can use ~{newline 1} instead of ~n;

    

proc print data=testit;

  title 'Proc Print';

  var charvar1 / style(data)={cellwidth=.5in};

  var charvar2 /style(data)={cellwidth=1.5in};

  var charvar3 / style(data)={cellwidth=3in};

  var charvar4 / style(data)={cellwidth=4in};

  var name;

run;

   

proc report data=testit nowd;

  title 'Proc Report';

  column charvar1 charvar2 charvar3 charvar4 name;

  define charvar1 / display style(column)={cellwidth=.5in};

  define charvar2/ display style(column)={cellwidth=1.5in};

  define charvar3 / display  style(column)={cellwidth=3in};

  define charvar4 / display style(column)={cellwidth=4in};

  define name/ display;

run;

ods _all_ close;

art297
Opal | Level 21

I agree with Cynthia that it would help to know what you are trying to accomplish this for.  Splitting can be done, but what would you do if a word was greater than 11 characters long?  If you take Cynthia's examples, what would you call a word?  Just any string that is separated by a space?  Or wouldn't you want to split on various characters and, if so, which ones?

If you can answer those questions, and you are doing what you ask for a reason, then you can probably figure it out yourself by just expanding the code to include those conditions.

sasjourney
Calcite | Level 5

Hi,

I agree with both of you, I do not have any problems while creating a report (Thanks to ODS).

But the reason I wanted to split the string was, we got the data in an excel sheet, in which they entered the longstrings on two or more rows for readability(instead of wrapping) and now I have the same string in a different dataset as a single line. Now, when I import the excel sheet into sas, a single string is on different rows(manually entered into different rows). so there is no way for me to match the dataset to the dataset from excel for flagging purposes without splitting the string in dataset. So, when I spoke with them(who created the excel data), they said that they are wrapping/splitting the text based on length. And yes, I asked the same question, what if the length of the word is more than the length set for splitting, they said that if that is the case there is no way other than to split the word.

Sorry, if I miss something.

Thanks

sasjourney

Tom
Super User Tom
Super User

You can write a program to do this, but I have had success in the past with just letting SAS do it for me.

Make sure you have a simple unique ID for each observation. Then use proc report to produce a text listing that will flow the long strings for you and read it back in.

For example:

data have ;

  length id 8 string $200;

  id=1; string='this is too long for 12 colums';

  output;

  id+1; string='Another long string that will wrap to multiple rows';

  output;

run;

filename tempfile temp;

proc printto print=tempfile new; run;

proc report data=have nofs spacing=1 split='FF'x nocenter ;

  title1 'DUMP';

  column id string;

  define id / 'ID' order order=internal display width=4 spacing=0;

  define string / display flow width=12;

run;

title1;

proc printto print=print; run;

data short;

  length id 8 short $12;

  retain id;

  infile tempfile truncover;

  input check $4. @5 short $12.;

  if check in ('DUMP','ID') or short=' ' then delete;

  else if check ^= ' ' then id=input(check,4.);

  output;

  drop check;

run;

data want;

  merge short have ;

  by id;

run;

proc print; run;

sasjourney
Calcite | Level 5

Thanks Tom, I will give it a try.

Ksharp
Super User

You can do it base on Art's code.

data have;
  informat string $80.;
  input string &;
  cards;
The power to know
The power to know something
The power to know how to do something else
;
run;

data want (keep=want_string);
  set have;
  length _string $ 80 want_string $ 11;
  retain _string;
  i=1;temp=scan(string,1);
  do while(not missing(temp));
   _string=catx(' ',_string,temp);
   if length(_string) gt lengthc(want_string) then do;
     want_string=substr(_string,1,length(_string)-length(scan(_string,-1))-1);
      output;_string=scan(_string,-1);end;
   i+1;temp=scan(string,i);
   if missing(temp) then do;want_string=_string;output;end;
   end;
   call missing(_string);
run;

Ksharp

sasjourney
Calcite | Level 5

Hi Everyone,

Thanks for helping me out on this.

sasjourney.

ScottBass
Rhodochrosite | Level 12

Hi,

I'm a bit late to this thread, but if you're doing clinical programming, I recommend having a look at http://www.datasavantconsulting.com/roland/Spectre/maclist2.html, specifically http://www.datasavantconsulting.com/roland/Spectre/utilmacros/splitvar.sas.  You may find this macro meets your needs.

Regards,

Scott


Please post your question as a self-contained data step in the form of "have" (source) and "want" (desired results).
I won't contribute to your post if I can't cut-and-paste your syntactically correct code into SAS.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

New Learning Events in April

 

Join us for two new fee-based courses: Administrative Healthcare Data and SAS via Live Web Monday-Thursday, April 24-27 from 1:00 to 4:30 PM ET each day. And Administrative Healthcare Data and SAS: Hands-On Programming Workshop via Live Web on Friday, April 28 from 9:00 AM to 5:00 PM ET.

LEARN MORE

Discussion stats
  • 14 replies
  • 16362 views
  • 6 likes
  • 7 in conversation