SAS Programming

DATA Step, Macro, Functions and more
☑ This topic is solved.
kindbe17
Fluorite | Level 6

Could you help me cut this string into pieces and store them in separate variables?

 

string= "The project was in development for approximately three years at , during which time a that differed significantly from the novel was written.  acquired the rights to the novel after the project's  wrote a new adaptation of the novel shortly before the and sought to be faithful to the novel's storyline.  March 2008 and took 44 days completed on May 2; the film was shot in the states of "

1 ACCEPTED SOLUTION
mkeintz
PROC Star

Let's say you have a text file (c:\temp\t.txt in the code below) in which no line is longer than 1,000 characters.  Then you could:

 

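data want (drop=_:);
  infile 'c:\temp\t.txt' truncover;
  input _line_of_text $1000. ;
  array _str {5} $200 str1-str5 ;
  /* fill str1-str5 in turn until the remaining text is blank */
  do _s=1 to dim(_str) while (_line_of_text^= ' ');
    /* scan backward from position 201 to the nearest blank, so no word is split */
    do _c=201 to 1 by -1 until (char(_line_of_text,_c)=' ');
    end;
    _str{_s}=substr(_line_of_text,1,_c);
    /* drop the piece just stored and left-align what remains */
    _line_of_text=left(substr(_line_of_text,_c));
  end;
run;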
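
If the text is already in a data set variable rather than in a file, the same backward scan can be applied to a working copy of that variable. This is only a minimal sketch under some assumptions: a data set HAVE with a character variable STRING of at most 1,000 characters, and no single word longer than 200 characters. The names HAVE and _rest and the shortened sample text are illustrative and not part of the solution above.

```sas
/* Illustrative input: a shortened version of the text from the question */
data have;
  length string $1000;
  string = "The project was in development for approximately three years at , "
        || "during which time a that differed significantly from the novel was written.  "
        || "acquired the rights to the novel after the project's  wrote a new adaptation "
        || "of the novel shortly before the and sought to be faithful to the novel's storyline.";
run;

data want (drop=_:);
  set have;
  length _rest $1000;              /* working copy; 1,000 is an assumed maximum */
  array _str {5} $200 str1-str5;
  _rest = left(string);
  do _s = 1 to dim(_str) while (_rest ne ' ');
    /* scan backward from position 201 to the nearest blank */
    /* (assumes every word is shorter than 200 characters)  */
    do _c = 201 to 1 by -1 until (char(_rest, _c) = ' ');
    end;
    _str{_s} = substr(_rest, 1, _c);
    /* remove the stored piece and left-align the remainder */
    _rest = left(substr(_rest, _c));
  end;
run;
```

Working on a copy leaves STRING itself unchanged in the output data set, and drop=_: removes the helper variables just as in the file-based version.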


10 REPLIES
PaigeMiller
Diamond | Level 26

Cut the string based upon what criteria?

--
Paige Miller
kindbe17
Fluorite | Level 6

Each piece should be at most 200 characters long, and no word should be cut in the middle.

kindbe17
Fluorite | Level 6

I need to use a DO loop.

PaigeMiller
Diamond | Level 26

Certainly. May I suggest that from now on these important details be included in your original post, along with any other important details not already mentioned.

--
Paige Miller
Oligolas
Barite | Level 11

It's not what I would use in a professional environment, but it might give you an idea of the possibilities:

DATA have;
   string= "The project was in development for approximately three years at , during which time a that differed significantly from the novel was written.  acquired the rights to the novel after the project's  wrote a new adaptation of the novel shortly before the and sought to be faithful to the novel's storyline.  March 2008 and took 44 days completed on May 2; the film was shot in the states of ";
RUN;

%MACRO split(varName=,varCount=,varLength=);
   %local code;
   %do i=1 %to &varCount.;
      %if &i. eq &varCount. %then %let &&varName.&i.=%nrbquote(length &&varName.&i. $&varLength.%str(;) &&varName.&i.=substr(&varName.,%eval((&i.-1)*&varLength.+1))%str(;));
      %else %if &i. eq 1 %then %let &&varName.&i.=%nrbquote(length &&varName.&i. $&varLength.%str(;) &&varName.&i.=substr(&varName.,1,&varLength.)%str(;));
      %else %let &&varName.&i.=%nrbquote(length &&varName.&i. $&varLength.%str(;) &&varName.&i.=substr(&varName.,%eval((&i.-1)*&varLength.+1),&varLength.)%str(;));

      %let code=&code. &&&varName.&i.;
   %end;
   &code.
%MEND split;

DATA _NULL_;
   SET have;
   xvarLength=200;
   xvarCount=ceil(lengthn(string)/xvarLength);
   call execute('DATA want;set &syslast.; %nrstr(%split(varName=string,varCount='||strip(put(xvarCount,best.))||',varLength='||strip(put(xvarLength,best.))||');); run;');
   drop xvarLength xvarCount;
RUN;
________________________

- Cheers -

Tom
Super User

Why would you use macro code to deal with data that is an actual variable?

Just define the variables, then loop over the words in the string and decide whether the next word still fits in the current short string.

 

Let's split the string into variables of length 50 just to demonstrate the concept.

data have;
  length string $400;
  string= "The project was in development for approximately three years at , "
       || "during which time a that differed significantly from the novel was written."
       || "  acquired the rights to the novel after the project's  wrote a new"
       || " adaptation of the novel shortly before the and sought to be faithful"
       || " to the novel's storyline.  March 2008 and took 44 days completed"
       || " on May 2; the film was shot in the states of "
  ;
run;

data want;
  set have;
  length varnum wordnum 8 word $50 ;
  array short [10] $50 ;
  varnum=1;
  /* walk through the blank-delimited words one at a time */
  do wordnum=1 to countw(string,' ');
    word=scan(string,wordnum,' ');
    /* move to the next variable when the word no longer fits in 50 characters */
    if 50 < length(catx(' ',short[varnum],word)) then varnum+1;
    short[varnum]=catx(' ',short[varnum],word);
  end;
  drop varnum wordnum word;
run;


Oligolas
Barite | Level 11

@Tom because if you've ever coded SDTMs you wouldn't repeat your code in each of your datasteps, would you?

________________________

- Cheers -

Tom
Super User

@Oligolas wrote:

@Tom because if you've ever coded SDTMs you wouldn't repeat your code in each of your datasteps, would you?


Not sure what your point is there.  It sounds like you are making the case for creating a SAS macro.  But if you did create a macro, you would still write it to generate SAS code that manipulates the data, not try to manipulate the data with SAS macro statements.
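
As a rough illustration of that distinction, here is a sketch of a macro that only emits DATA step statements, so the word handling still happens in the DATA step. The macro name wrap_words, its parameters, and the helper names _part, _p, _w and _word are all hypothetical. The "_p < &n" guard is also my own assumption: it truncates any overflow into the last piece instead of raising a subscript error.

```sas
/* Hypothetical macro: wrap_words and its parameters are illustrative names. */
/* It generates DATA step statements; the DATA step does the splitting.      */
%macro wrap_words(var=, prefix=, n=, len=);
  length _word $&len;
  array _part [&n] $&len &prefix.1-&prefix.&n;
  _p = 1;
  do _w = 1 to countw(&var, ' ');
    _word = scan(&var, _w, ' ');
    /* move to the next piece when the word no longer fits          */
    /* (assumed guard: never step past the last piece of the array) */
    if lengthn(catx(' ', _part[_p], _word)) > &len and _p < &n then _p + 1;
    _part[_p] = catx(' ', _part[_p], _word);
  end;
  drop _p _w _word;
%mend wrap_words;

/* Illustrative input: the first part of the text from the question */
data have;
  length string $200;
  string = "The project was in development for approximately three years at , "
        || "during which time a that differed significantly from the novel was written.";
run;

/* Usage: the macro call expands to ordinary statements inside the step */
data want;
  set have;
  %wrap_words(var=string, prefix=short, n=10, len=50)
run;
```

Because the macro only generates statements, it can be called from any DATA step that needs the split, though as written only once per step because the helper names are fixed.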

Oligolas
Barite | Level 11

Correct, I would. It's okay if you're not happy with it 🙂

________________________

- Cheers -

