How to Split a row to multiple rows?

Reply
Contributor
Posts: 43

How to Split a row to multiple rows?

Hi All,

Can anyone help me with this.

How can I split a string in a row to multiple rows, say there is variable called string='The Power to Know'. I was able to split it based on the delimiter using 'DO UNTIL' and 'SCAN' and the output would be something like this.

String

The

Power

to

Know

But am looking for the output some thing like

String

The Power

to Know

Thanks in advance

SAS Super FREQ
Posts: 8,641

How to Split a row to multiple rows?

Hi:

So, my question is WHERE do you want to split the string. Let's say you do have a variable called STRING whose value is "The Power to Know" -- where do you want to split it? A dataset has no concept of a split line. A split or a line feed or a carriage return usually belong to a cell in a report. So I could see how you might want to display, in a report, like an RTF file or an HTML file or a PDF file, the single string "The Power to Know" on 4 lines of the report or on 2 lines of the report. But I don't understand what the point is of "splitting the string" into rows. Does that mean that if your original dataset is:

obs var1   var2 string

1     xxx    yyy  The Power to Know

that you want to create a new dataset with 2 rows???

obs var1 var2 string

1     xxx  yyy  The Power

2     xxx  yyy  to Know

Or, in an output report, such as from PROC PRINT or PROC REPORT, do you want to see the string "The Power to Know" displayed on 2 lines of the output report instead of 1 line?

cynthia

Contributor
Posts: 43

How to Split a row to multiple rows?

Hi Cynthia,

     Thanks for your time. I was able to do it for the display purpose using Proc Report or ODS but was wondering whether it can be done in dataset and yes, I was looking for something like you said.

obs var1   var2 string

1     xxx    yyy  The Power to Know

that you want to create a new dataset with 2 rows???

obs var1 var2 string

1     xxx  yyy  The Power

2     xxx  yyy  to Know

Thanks

sasjourney

Esteemed Advisor
Posts: 5,065

How to Split a row to multiple rows?

Back to Cynthia's question: where do yo want to split the line?

You need some kind of rule. Is the rule to split after two words in the string?

If so, you can probably achieve what you want to play a bit further with scan and until...

/Linus

Data never sleeps
Contributor
Posts: 43

How to Split a row to multiple rows?

Hi

Yes, in this example I want to split it after two words.

Thanks

sasjourney

Esteemed Advisor
Posts: 6,867

How to Split a row to multiple rows?

Are you just asking how you would do something like the following?:

data have;

  informat string $80.;

  input string &;

  cards;

The power to know

The power to know something

The power to know how to do something else

;

data want (keep=want_string);

  set have;

  length want_string $50;

  counter=0;

  nomore=0;

  do until (nomore);

    counter+1;

    if scan(string,counter) eq "" then nomore=1;

     else want_string=

      catx(' ',want_string,scan(string,counter));

    if mod(counter,2) eq 0 then do;

      output;

      call missing(want_string);

    end;

  end;

run;

Contributor
Posts: 43

How to Split a row to multiple rows?

There you go. It worked perfectly for me, Thank you. 

Also, is there anyway that it can de done based on the length of the column. Say it is need to be split at a length of 11 in "The Power to Know how to do something else"

the first 11 characters would be "The Power t" but i don't want that to happen because that would split the word "to". instead I want to check the length and split it till there if it not splitting the word  and split till the previous delimiter if it is splitting the word.

Thanks again

SAS Super FREQ
Posts: 8,641

Re: How to Split a row to multiple rows?

Hi:

  I have to chime in again and say this sounds like a reporting issue and text wrapping on a report rather than a dataset issue. Or maybe I'm just dense and don't get the point. I don't understand how, in a DATASET, it would be useful to break a single obs into two obs so one observation has 11 characters of a text string and then a second observation in a DATASET has the rest of the text string.

  

  If you use ODS to create your report output, then ODS will take care of wrapping and won't break a word in the middle of two letters unless there are no spaces in the text string. Will ODS split after EXACTLY 11 characters??? Not automatically. Can you make ODS split in a report table after EXACTLY 11 characters, yes, you can.

 

  You would use a program similar to Art's program, but would insert a "line feed" or "carriage return" character into the text string in order to influence where ODS broke the text string.

  You said that you got the results you wanted in your report. So, what purpose is served by splitting your observations in a DATASET? Was the line breaking or line wrapping in the report not to your liking? Did you use the CELLWIDTH or WIDTH style attribute to control the wide of the cell? (Which influences how much text can wrap) What is your destination of interest? Is this really a DATASET issue or a REPORT issue?

  For example, in the report below, see the difference in wrapping based just on using the CELLWIDTH style attribute and then compare that to the use wrapping in the cell when CELLWIDTH is used with ODS ESCAPECHAR newline capability.

cynthia

data testit;

  length charvar1 charvar2 charvar3 $200 charvar4 $600;

  set sashelp.class;

  where age gt 14;

  charvar1 = 'Twas brillig and the slithy toves did gyre and gimble in the wabe.';

  charvar2 = 'All mimsy were the borogroves and the mome raths outgrabe.';

  charvar3 = 'Beware the Jabberwock, my son!~nThe jaws that bite, the claws that snatch.';

  charvar4 = catx('~n',charvar1,charvar2,charvar3);

  output;

run;

     

options orientation=landscape topmargin=.25in bottommargin=.25in

        leftmargin=.25in rightmargin=.25in;

               

ods listing close;

ods html file='c:\temp\wrap.html' style=sasweb;

ods rtf file='c:\temp\wrap.rtf';

ods pdf file='c:\temp\wrap.pdf';

ods escapechar='~';

** note that charvar3 and charvar4 have 

   explicit line feed characters

   using ODS ESCAPECHAR+n or ~n

   for RTF, PDF and HTML.

   In 9.3 can use ~{newline 1} instead of ~n;

    

proc print data=testit;

  title 'Proc Print';

  var charvar1 / style(data)={cellwidth=.5in};

  var charvar2 /style(data)={cellwidth=1.5in};

  var charvar3 / style(data)={cellwidth=3in};

  var charvar4 / style(data)={cellwidth=4in};

  var name;

run;

   

proc report data=testit nowd;

  title 'Proc Report';

  column charvar1 charvar2 charvar3 charvar4 name;

  define charvar1 / display style(column)={cellwidth=.5in};

  define charvar2/ display style(column)={cellwidth=1.5in};

  define charvar3 / display  style(column)={cellwidth=3in};

  define charvar4 / display style(column)={cellwidth=4in};

  define name/ display;

run;

ods _all_ close;

Esteemed Advisor
Posts: 6,867

Re: How to Split a row to multiple rows?

I agree with Cynthia that it would help to know what you are trying to accomplish this for.  Splitting can be done, but what would you do if a word was greater than 11 characters long?  If you take Cynthia's examples, what would you call a word?  Just any string that is separated by a space?  Or wouldn't you want to split on various characters and, if so, which ones?

If you can answer those questions, and you are doing what you ask for a reason, then you can probably figure it out yourself by just expanding the code to include those conditions.

Contributor
Posts: 43

How to Split a row to multiple rows?

Hi,

I agree with both of you, I do not have any problems while creating a report (Thanks to ODS).

But the reason I wanted to split the string was, we got the data in an excel sheet, in which they entered the longstrings on two or more rows for readability(instead of wrapping) and now I have the same string in a different dataset as a single line. Now, when I import the excel sheet into sas, a single string is on different rows(manually entered into different rows). so there is no way for me to match the dataset to the dataset from excel for flagging purposes without splitting the string in dataset. So, when I spoke with them(who created the excel data), they said that they are wrapping/splitting the text based on length. And yes, I asked the same question, what if the length of the word is more than the length set for splitting, they said that if that is the case there is no way other than to split the word.

Sorry, if I miss something.

Thanks

sasjourney

Super User
Super User
Posts: 5,960

Re: How to Split a row to multiple rows?

You can write a program to do this, but I have had success in the past with just letting SAS do it for me.

Make sure you have a simple unique ID for each observation. Then use proc report to produce a text listing that will flow the long strings for you and read it back in.

For example:

data have ;

  length id 8 string $200;

  id=1; string='this is too long for 12 colums';

  output;

  id+1; string='Another long string that will wrap to multiple rows';

  output;

run;

filename tempfile temp;

proc printto print=tempfile new; run;

proc report data=have nofs spacing=1 split='FF'x nocenter ;

  title1 'DUMP';

  column id string;

  define id / 'ID' order order=internal display width=4 spacing=0;

  define string / display flow width=12;

run;

title1;

proc printto print=print; run;

data short;

  length id 8 short $12;

  retain id;

  infile tempfile truncover;

  input check $4. @5 short $12.;

  if check in ('DUMP','ID') or short=' ' then delete;

  else if check ^= ' ' then id=input(check,4.);

  output;

  drop check;

run;

data want;

  merge short have ;

  by id;

run;

proc print; run;

Contributor
Posts: 43

Re: How to Split a row to multiple rows?

Thanks Tom, I will give it a try.

Grand Advisor
Posts: 9,308

Re: How to Split a row to multiple rows?

You can do it base on Art's code.

data have;
  informat string $80.;
  input string &;
  cards;
The power to know
The power to know something
The power to know how to do something else
;
run;

data want (keep=want_string);
  set have;
  length _string $ 80 want_string $ 11;
  retain _string;
  i=1;temp=scan(string,1);
  do while(not missing(temp));
   _string=catx(' ',_string,temp);
   if length(_string) gt lengthc(want_string) then do;
     want_string=substr(_string,1,length(_string)-length(scan(_string,-1))-1);
      output;_string=scan(_string,-1);end;
   i+1;temp=scan(string,i);
   if missing(temp) then do;want_string=_string;output;end;
   end;
   call missing(_string);
run;

Ksharp

Contributor
Posts: 43

How to Split a row to multiple rows?

Hi Everyone,

Thanks for helping me out on this.

sasjourney.

Super Contributor
Posts: 365

How to Split a row to multiple rows?

Hi,

I'm a bit late to this thread, but if you're doing clinical programming, I recommend having a look at http://www.datasavantconsulting.com/roland/Spectre/maclist2.html, specifically http://www.datasavantconsulting.com/roland/Spectre/utilmacros/splitvar.sas.  You may find this macro meets your needs.

Regards,

Scott

Post a Question
Discussion Stats
  • 14 replies
  • 6628 views
  • 6 likes
  • 7 in conversation