BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

I had an interview the other day and one question was  "How would you create a character variable that concatenated three strings with the second string being 2 blanks?"

 

I answered that I would use the || operator and thought that would work. Later I thought the CAT operator would have been a better choice.

 

So I tried both at home and neither work-- only a single character is retained instead of two blanks. I have been trying multiple variations using the CAT function and the result is the same -- in all cases the leading and trailing/blank spaces are removed. Note that I read the documentaion for the CAT function as stating "CAT function does not remove leading or trailing blanks".

 

Here are 3 variations that I tried. Note I have defined the length of the final string CONCATCOW.  I also try a variation where I set a character variable STRINGPAD to blanks and try using that. The log is fine and notes no errors or warnings.

 

None of these work. The output is  "COW FARM" in all 3 cases not the desired "COW     FARM".

 

 

DATA TEST ;

     LENGTH CONCATCOW1 - 3 $30 STRINGPAD $8 ;

     STRINGPAD = "         " ;

     CONCATCOW1 = "COW"||"        "||"FARM" ;

     CONCATCOW2 = CAT("COW","        ","FARM") ;

     CONCATCOW3 =CAT("COW"||stringpad||"FARM") ;

 

PROC PRINT ;

 

What is going on here? How can I keep these blanks in my string?  Why are these functions stripping the blanks when the documentation appears to state otherwise?

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

The leading, trailing and embedded spaces were all retained, but will be stripped or reduced unless you include the ASIS option on the style option in proc print.

 

e.g.,:

 

data one ;
length a2 $30;
x = 'a b' ;
y = ' c d ' ;
z = ' ef ' ;
a1 = x ||y||z ;
a2 = cat(x,y,z) ;
run;

 

proc print width=full;
var a1 a2 x y z/style(data)={asis=on};
run;

 

View solution in original post

11 REPLIES 11
ballardw
Super User

Take a look at this:

DATA TEST ;

     LENGTH CONCATCOW1  $30  ;
     length stringpad $ 2;

     CONCATCOW1 = "COW"||stringpad||"FARM" ;
     varlength = length(concatcow1);
run;

You may need to try something other than Proc print to look at the result as the Html default output and proportional fonts may make it hard to see. 

Patrick
Opal | Level 21

This appears to be one of these smarter interview questions where you could get some "extra points" by explaining that SAS doesn't have a concept of NULL values and that a blank represents a Missing value - unlike to most databases which have a concept of NULL values and where a blank is just another character and you could store two blanks in a column of type VARCHAR.

 

You then could also explain that you can't explicitely store multiple blanks in a SAS character variable and that when concatenating character variables SAS simply uses the length of each variable - and then you could talk a bit about the different options for concatenation and why you normally need to use functions like trim(), left(), strip() - or concatenation functions like cats(), catx() to concatenate variable - and also why you would want to explicitely define the length of the variable holding the result of the concatenation via a Length statement (or an Attrib statement with Lenght attribute).

 

BTW: Hope your interview went well anyway and congrats that you're not just trying to "learn the right answers" but to understand how things works. Speaks a lot for you and I'm sure interviewers will get it.

PGStats
Opal | Level 21

You should check your log. CAT() and || do keep the extra blanks with the proper syntax

 

DATA TEST ;
     LENGTH CONCATCOW1-CONCATCOW4 $30 STRINGPAD $8 ;
     STRINGPAD = "         " ;
     CONCATCOW1 = "COW"||"        "||"FARM" ;
     CONCATCOW2 = CAT("COW","        ","FARM") ;
     CONCATCOW3 = "COW"||stringpad||"FARM" ;
     CONCATCOW4 = CAT("COW", stringpad, "FARM") ;
run;
 
PROC PRINT ;
run;
 

 

Output:

 

 Obs    CONCATCOW1       CONCATCOW2       CONCATCOW3       CONCATCOW4     STRINGPAD

  1   COW        FARM  COW        FARM  COW        FARM  COW        FARM
PG
PGStats
Opal | Level 21

You could also use CATX with two blanks as the separator:

 

data test;
length concatcow $30;
concatcow = catx("  ", "COW", "FARM");
run;
PG
Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

Just want to add another example of the confusion I am having with concats, this one appears even stranger than the examples given. :

 

 

data one ;

   length a1 a2 $30 ;

   x = 'a       b' ;

   y = '  c d    ' ;

   z = '    ef    ' ;

       

   a1 = x ||y||z ;

   a2 = cat(x,y,z) ;

 

proc print  ;

 

x prints as 'a b'

y prints as 'c d'

z prints as 'ef'

 

a1 prints as 'a b c d ef' ;

a2 also prints as 'a b c d ef'

 

so all blanks external and internal to the string are being stripped out :the string   '   a            b    '  becomes 'a b' when using CAT function or || operator.

 

I will be reviewing posted answers more completely and will post back my findings.

art297
Opal | Level 21

The leading, trailing and embedded spaces were all retained, but will be stripped or reduced unless you include the ASIS option on the style option in proc print.

 

e.g.,:

 

data one ;
length a2 $30;
x = 'a b' ;
y = ' c d ' ;
z = ' ef ' ;
a1 = x ||y||z ;
a2 = cat(x,y,z) ;
run;

 

proc print width=full;
var a1 a2 x y z/style(data)={asis=on};
run;

 

Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

This is frustrating.

 

I will restate the problem in a more compreshensive and testable example:

 

Create a string of 3 blanks, character "A", 3 blanks,"SPACE",3 blanks,character "B",three blanks.

 

This should be 19 characters including blanks as characters.

 

STRING =  "   A   SPACE   B   " ;

 

Create and print this string exactly as specified.

 

None of the solutions offered so far have worked as I have tested ! I should add I am using SAS University Online to run my programs. In all cases the length of the resulting variable from CAT operations is less than 19-- and so truncated of some desired blanks. 

 

The ASIS option sounded hopeful but also failed:

 

padded_string = "   A   SPACE   B    " ;

 

proc print data = problem ;

    var padded_string /style(data)={asis=ON} ;

 

Log runs without error or warning. Final sting is  "B   SPACE   C" so it keeps/displays the internal blanks but not the external ones.

 

Seems like a simple problem with no easy answer. Anyone else?

Patrick
Opal | Level 21

@Kow

You're missing or confusing a few concepts and I honestly don't know where to start explaining things to you in writing.

 

I've always considered the concepts manual as a very valuable source to better understand how SAS works and I can only recommend to you to start reading it.

There are many perls in it. Some things won't be on your level yet and then just browse through such sections as you'll remember later on where to look up things to get started when you encounter a new problem. At least that's what happened to me over and over again.

 

SAS® 9.4 Language Reference: Concepts

http://support.sas.com/documentation/cdl/en/lrcon/69852/HTML/default/viewer.htm#titlepage.htm

http://support.sas.com/documentation/cdl/en/lrcon/69852/PDF/default/lrcon.pdf

 

Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

Ok, thanks everyone for following. I think I have the answer.

 

I refined my problem a bit and made the blank spaces extra long and eliminated any leading or trailing blanks.

 

So  print_phrase ="A"||"           "||"B"||"         "||"SPACE"||"         "||"C"||"           "||"D" ;

 

This does work in the form shown.

 

It also works as

 

blanks = "       " ;

concat1 = CAT("A",blanks,"B",etc) ;

 

So the internal representation of the string is correct and the length agrees with this.

 

BUT neither of these will print from a straight PROC PRINT  -- it appears the default behavior of the SAS University PROC PRINT at least is to truncate both leading and trailing blanks and INTERNAL blanks as well.

 

This can be corrected by using the ASIS=ON option. The internal blanks are then retained without problem.

 

proc print ;

    var v1 -v10 newstring/style = {ASIS=ON} ;

 

Leading and trailing blanks seem to ALWAYS be stripped regardless of method used, direct concatenation or use of the CAT functions. They don't appear in PUT statements in the log or when PRINT with ASIS is ON. CAT is supposed to conserve leading and trailing blanks but if so they do not show up in the outputs I used.

 

So to answer my original question I would say that you can concatenate any number of blanks to a string internally. Leading and trailing blanks are typically removed from SAS functions and outputs, so you would need to be careful that the string was being correctly processed and reported.

 

 

.

Patrick
Opal | Level 21

If you don't define things explicitely in your code then SAS does a lot for you implicitely using defaults. One of these defaults is for printing character variables to use a format which left aligns the data (="removing" leading blanks).

 

You need to be carful to not mix up what SAS stores internally and how it prints these internal values (and "print" also applies if you just look at the data by opening the table in SAS Studio).

 

If you really want to see what's stored in a variable, print the value in a hex representation.

If you want to change how SAS prints a variable, assign a format to this variable. For character variables format $CHAR. preserves leading blanks.

https://support.sas.com/documentation/cdl/en/leforinforref/64790/HTML/default/viewer.htm#n1oily9h2mw...

 

data test;
  length varA varB $6;
  format varB $char6.;
  varA='   345';
  varB='   345';

  format var2A var2B $hex12.;
  var2A=VarA;
  var2B=VarB;
run;

proc print data=test;
run;

When you concatenate character variables and you don't use a function which removes either trailing or leading blanks before the concatenation then be aware that SAS uses the LENGTH of the character variables for concatenation.

Lets say you have two variable varA and varB both with a length of 6 then when you concatenat the two variables without trimming the values of varB will start at position 7 of the resulting string no matter what's stored in variableA.

data test;
length varA varB $6 concat $12;
format concat $char12.;
varA=' ';
varB=' B';
concat=varA||varB;
lastNonBlankPosition=lengthn(concat);
run;

proc print data=test;
run;

Result:

Capture.PNG

Kow
Obsidian | Level 7 Kow
Obsidian | Level 7

Thanks this is the comprehensive view of the issue I needed.

 

I think the takeaway is that any concatenation of muliple blanks, internally and particularly externally, in a character string, are sensitive to how they are constructed and represented in your selected output format. It may be necessary to modify CAT function parameters, formats and/or style elements to achieve the desired visual representation if you want to keep strings of multiple blanks

 

In general I think SAS users are unaware of this behavior. The reasoning behind it is clear from a ease of presentation view but can lead to some headaches when retaining such blanks is essential to the input and formatted outputs.

 

Hope this thread helps others, thanks all for your responses.

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 11449 views
  • 1 like
  • 5 in conversation