Hello,
Does anyone have an idea how to handle this situation? (The code I tried does not work.)
I want to capitalize only the first letter of each sentence in the variable "X".
Data exercice;
x= "i have an exercise, i have to find a solution? i am tired !" ;
y="Life is beautiful. I like my family";
RUN;
Thanks !
Data solution;
SET exercice;
length var1 $200;
length b $1;
var1= UPCASE(substr(x,1,1));
Do i=2 to length (x);
if substr (x, i, 1) in (".", "?", "!")
then b=UPCASE (substr(x, i+2 , 1));
else b=substr(x,i,1);
var1= var1 || b;
end;
run;
If it's just the first character of each observation, then:
data solution;
set exercice;
substr(x,1,1)=upcase(substr(x,1,1));
run;
With substr on the left of the equals sign, the value on the right is written into that part of the variable, here one character starting at position 1. The right-hand side upcases that same character, which is then assigned back in place.
If you want the first letter of every word capitalized, use the propcase() function.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a002598106.htm
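As a quick illustration of why propcase() is not a per-sentence fix, here is a minimal sketch. The variable names x and y and the sample text are only for illustration. With its default delimiters, propcase() capitalizes the first letter of every word and lowercases the rest.

```sas
data _null_;
  length x y $80;
  /* sample text, assumed for illustration */
  x = "i have an exercise, i have to find a solution? i am tired !";
  /* capitalizes the first letter of EVERY word, not just each sentence */
  y = propcase(x);
  put y=;
  /* log: y=I Have An Exercise, I Have To Find A Solution? I Am Tired ! */
run;
```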
Thanks RW9 for your reply!
What I want to do is make only the first letter of each sentence capital. So, I want to have this as OUTPUT for variable "X":
"I have an exercise, I have to find a solution? I am tired !"
Then you will have to loop over each "sentence" and do what I presented. For example:
data want (keep=x);
  x="i have an exercise, i have to find a solution? i am tired !";
  /* flag=1 means the next non-blank character starts a sentence */
  flag=1;
  do i=1 to lengthn(x);
    if char(x,i) in (",","?","!") then flag=1;
    else if flag=1 and char(x,i) ne " " then do;
      substr(x,i,1)=upcase(char(x,i));
      flag=0;
    end;
  end;
run;
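To run the same loop over an existing data set instead of a hard-coded string, something like the following might work. This is only a sketch: the data set name have, the array name txt, the lowercase sample value for y, and the choice to add "." to the delimiter list are assumptions, not part of the reply above.

```sas
/* sample data, assumed for illustration */
data have;
  x = "i have an exercise, i have to find a solution? i am tired !";
  y = "life is beautiful. i like my family";
run;

data want;
  set have;
  /* assumes x and y are the character variables to fix */
  array txt {*} x y;
  do j = 1 to dim(txt);
    /* flag=1 means the next non-blank character starts a sentence */
    flag = 1;
    do i = 1 to lengthn(txt{j});
      if char(txt{j}, i) in (",", ".", "?", "!") then flag = 1;
      else if flag = 1 and char(txt{j}, i) ne " " then do;
        substr(txt{j}, i, 1) = upcase(char(txt{j}, i));
        flag = 0;
      end;
    end;
  end;
  drop i j flag;
run;
```

With this sample data, x should become "I have an exercise, I have to find a solution? I am tired !" and y should become "Life is beautiful. I like my family".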
Thanks RW9 !!!!!
It works very well !
Do any of your sentences contain numeric values such as 23.4 or $123.45? URLs such as www.somesite.com? Emails? Acronyms or abbreviations (I.B.M., Mr., Dr., Mrs., Jr.)?
These are instances of a "period" not ending a sentence.
Also note that your example text contains exactly 0 periods: it has a comma, a question mark and an exclamation mark, but no period. Depending on the exact content, you can't rely on ! or ? to end sentences either.
If you can provide a rule that always identifies a sentence in your data, it can likely be programmed, but be prepared for exceptions to cause unexpected output, or to find things incorrectly capitalized because someone entered them that way.
Identifying a "sentence" in text is not a trivial exercise, which is one reason text analytics systems exist and are somewhat pricey.
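As one example of such a rule, the sketch below treats ".", "?" or "!" as a sentence end only when it is followed by a blank or is the last character. The rule and the sample text are assumptions for illustration. It leaves URLs and decimal numbers alone, but it would still capitalize the word after an abbreviation such as "Mr.", so it is not a general solution.

```sas
data _null_;
  length x $80 c $1;
  /* sample text, assumed for illustration */
  x = "see www.somesite.com for details. it costs $123.45! ok?";
  n = lengthn(x);
  /* flag=1 means the next non-blank character starts a sentence */
  flag = 1;
  do i = 1 to n;
    c = char(x, i);
    /* terminator counts only if followed by a blank or at end of text */
    if c in (".", "?", "!") and (i = n or char(x, i + 1) = " ") then flag = 1;
    else if flag = 1 and c ne " " then do;
      substr(x, i, 1) = upcase(c);
      flag = 0;
    end;
  end;
  put x=;
  /* log: x=See www.somesite.com for details. It costs $123.45! Ok? */
run;
```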