I am having an issue with some code that I have used previously but is behaving strangely now. When I run the code below, the variable kvp in the kvp_a set appears to concatenate after the entry of the value for the race variable. After a few attempts to fix this, I decided to comment out the format line for the kvp variable, and I ended up with the rest of the information (beginning with the comma after the race value) on a second line. I thought maybe a carriage return got somehow embedded in my code in a way I can't see, so I tried using the compress function on the race variable in the PROC SQL, but that didn't work.
Also, the lengths of the first five variables (all character) in the SQL procedure are 6, 1, 7, 1, and 1 respectively.
proc sql;
create table part_a as
select distinct unitid,
deg,
cip,
race,
sex,
sum(awards) as awards
from part_a2
group by unitid, deg, cip, race, sex;
run;
data kvp_a;
set part_a;
format kvp $256.;
kvp='UNITID='||unitid||',SURVSECT=COM,PART=A,MAJORNUM=1,CIPCODE='||cip||',AWLEVEL='||deg||',RACE='||race||',SEX='||sex||',COUNT='||strip(put(awards,10.));
keep kvp;
run;
There could be LINEFEED character instead of a carriage return character.
And it need not come from your code but might be (some of) the records with the race variable depending upon how you brought the data into SAS and the original data source.
Without some actual text to examine it's going to be hard to be more specific.
You also may want to consider the use of the CATS or CATX function instead of the || operator to combine strings. When you use || you can end up with extra characters unless you use STRIP or older school TRIM(LEFT(var)).
There could be LINEFEED character instead of a carriage return character.
And it need not come from your code but might be (some of) the records with the race variable depending upon how you brought the data into SAS and the original data source.
Without some actual text to examine it's going to be hard to be more specific.
You also may want to consider the use of the CATS or CATX function instead of the || operator to combine strings. When you use || you can end up with extra characters unless you use STRIP or older school TRIM(LEFT(var)).
Thanks. I tried both the STRIP and CATS approaches in my DATA step, and both worked. Interestingly enough, when I tried using STRIP in the PROC SQL and left my DATA step unchanged, it did NOT work.
SAS will not add characters when you concatenate strings.
Use the notprint() function to identify where the rogue character comes from.
Use the compress() function to remove it.
Why do you have a FORMAT statement for a character variable? Perhaps you meant to use a LENGTH statement to define your new variable?
A FORMAT statement is for permanently attaching a specific format specification to a variable. In SAS a format specification is instructions to SAS on how it should display the value. SAS does not need special instructions for how to display character strings. So there is no need to attach formats to character variables.
SAS will guess at how to define a variable based on how you first use it. So if the first place you reference a variable is in a FORMAT statement then SAS will guess that you meant to define the variable with a type that matches the type of the format you are attaching and if it is a character format then it will guess that you wanted to define the variable to have a length that matches the width of the format specification that you attached.
What makes you think there is CR in the data? Did you look at the codes for the characters in the string? Or is it just that it is printing across multiple lines? Perhaps you just need to change the line length of where you are are trying to display the values?
How long are the variables that you are combining? Is 256 characters enough?
Try using a longer length. Also try using CATS() function instead of using || operator.
length kvp $500 ;
kvp=cats
('UNITID=',unitid
,',SURVSECT=COM,PART=A,MAJORNUM=1'
,',CIPCODE=',cip
,',AWLEVEL=',deg
,',RACE=',race
,',SEX=',sex
,',COUNT=',put(awards,10.)
);
I never thought of using LENGTH instead of FORMAT, as I'm used to using FORMAT to initialize variables which can occasionally be blank (and thus default to character). The reason I'm using it at all is because in this program, I need to also create a kvp_b, kvp_c, and kvp_d later that need to be appended to each other, for which I use the following:
data kvp;
set kvp_a kvp_b kvp_c kvp_d;
run;
While the kvp variable is only around 80 bytes in kvp_a, it ends up being around 200 bytes in kvp_d, so I set the format (now length 🙂 ) to be 256 to be certain the values in kvp_d aren't truncated when the sets are appended because kvp_a is so much shorter.
I suspected a CR because when I didn't initialize the kvp variable and sent the data to Excel to try and find a problem there, the information in each cell took up two lines, always beginning right after the value of the race variable.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.