Can anyone explain how CATS works in this example?
In the Programming Essentials 1 course, it uses the CATS function. It's very cool, but I don't understand how it works and the documentation doesn't explain. My understanding of concatenation is the joining of strings. However, CATS appears to drop part of the string and keep the rest without specifying what to drop and what to keep. The examples in the SAS Documentation for CAT and CATS are the same, and just show joining of strings.
CATS drops "Hem__" and keeps the EW and NS.
data storm_new;
    set pg1.storm_summary;
    drop Type Hem_EW Hem_NS MinPressure Lat Lon;
    *Add assignment statements;
    Basin=upcase(basin);
    Name=propcase(Name);
    Hemisphere=cats(Hem_NS, Hem_EW);
    Ocean=substr(Basin, 2, 1);
run;
Results in:
CATS doesn't join the variable names Hem_EW and Hem_NS. It joins the values of the variables. So for the first row of your data, Hem_NS contains the value "N" and Hem_EW contains the value "W", and these are joined to produce "NW".
@PaigeMiller Thank you Paige. Yes, I see where it joins the values NS and EW. But I am confused on this "value" concept when it relates to the string. Is not the whole string "HEM_NS"? Why doesn't it join NS and EW to produce NSEW? Again, the documentation does not explain this and I would like to know how CATS tells SAS to drop HEM_, take the first character of NS and join it with the second character of EW. This is the example the documentation gives for CATS:
data _null_;
   dcl char(25) x y z a;
   dcl char(70) result;
   method init();
      x='  The   Olym'; 
      y='pic Arts Festi';
      z='  val includes works by D  ';
      a='ale Chihuly.';
      result=cats(x,y,z,a);
      put result=;
   end;
enddata;
run;
SAS writes the following output to the log:
result=The Olympic Arts Festival includes works by Dale Chihuly.
In your original code, you DO NOT have a string Hem_NS and another string Hem_EW. You have variables with those names. How can you tell? Because in your original code, Hem_EW is not enclosed in quotes (it would then look like "Hem_EW"), so it is a variable name. CATS uses the value of the variable, not the variable name. Had it been enclosed in quotes, "Hem_EW" is a string, and CATS would then leave it unaltered.
In the first row, variable Hem_EW has value "W". In the first row, variable Hem_NS has value "N". You can confirm this via PROC PRINT or via Viewtable or any one of many other ways to view SAS data sets. Thus, CATS takes the "N" and the "W" (the values, not the variable names) and creates "NW".
This difference between variable names and their values is a fundamental and critical piece of understanding if you are going to use SAS.
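To make the name-versus-value distinction concrete, here is a minimal sketch. The two assignment statements are a hypothetical stand-in for the first row of pg1.storm_summary, and the names from_values and from_literals are made up for illustration:

```sas
data _null_;
   /* Hypothetical stand-in for the first row of pg1.storm_summary */
   Hem_NS = 'N';
   Hem_EW = 'W';

   /* Unquoted arguments are variable names: CATS joins their values */
   from_values = cats(Hem_NS, Hem_EW);

   /* Quoted arguments are literal strings: CATS joins the text itself */
   from_literals = cats("Hem_NS", "Hem_EW");

   put from_values=;     /* log: from_values=NW */
   put from_literals=;   /* log: from_literals=Hem_NSHem_EW */
run;
```

Nothing is dropped in either call. The first result is short only because the values stored in the variables are one character each.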
@PaigeMiller Thanks. I get it now. I can see the values for the variables in the data set. I appreciate you taking the time to explain that to me. My focus was on the variable name and not the value. Thanks for setting me straight!
Hi,
It is possible that CAT and CATS are returning the SAME values; it depends on whether there are leading or trailing blanks in the values. The only thing the CAT? functions ever drop from a string is blanks. They are not dropping significant characters out of your strings. The main difference among the CAT? functions is how they treat leading and trailing blanks in character arguments: CATS removes both leading and trailing blanks, CATT removes only trailing blanks, and CAT removes neither. So it is possible for CAT, CATT, and CATS to all return the same value when the arguments have no blanks to remove. Numeric arguments are handled differently: every CAT? function, including CAT, converts a numeric argument to character with the BESTw. format and then removes the leading and trailing blanks from the converted value. So the result really depends on whether you are concatenating numeric variables or character variables, and on how you need the strings joined together.
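Here is a small sketch of how the blank handling plays out for character arguments. The variables a and b and their padded values are made up for illustration; the square brackets are passed as extra arguments only so the blanks are visible in the log:

```sas
data _null_;
   length a b $ 5;
   a = '  x';   /* stored as 2 blanks, x, 2 blanks */
   b = ' y';    /* stored as 1 blank, y, 3 blanks  */

   r_cat  = cat('[', a, b, ']');    /* keeps every blank             */
   r_catt = catt('[', a, b, ']');   /* trims trailing blanks only    */
   r_cats = cats('[', a, b, ']');   /* trims leading and trailing    */
   r_catx = catx('-', a, b);        /* like CATS, plus a delimiter   */

   put r_cat=;    /* log: r_cat=[  x   y   ] */
   put r_catt=;   /* log: r_catt=[  x y]     */
   put r_cats=;   /* log: r_cats=[xy]        */
   put r_catx=;   /* log: r_catx=x-y         */
run;
```

With values that contain no leading or trailing blanks at all, the first three results would be identical.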
However, in the Programming 1 class, for the program you're asking about, if you ran PROC CONTENTS on pg1.storm_summary, you'd see that your 2 variables HEM_EW and HEM_NS are both defined as character with a length of 1.
So in this case, it wouldn't matter which concatenation function you used. You can prove that to yourself by running this modified version of your program:
proc contents data=pg1.storm_summary;
run;

data storm_new;
    set pg1.storm_summary;
    Basin=upcase(Basin);
    Name=propcase(Name);
    Ocean=substr(Basin, 2, 1);
    /* Same two arguments passed to CAT, CATT, and CATS so the results can be compared */
    use_cat=cat(Hem_NS, Hem_EW);
    use_catt=catt(Hem_NS, Hem_EW);
    Hemisphere=cats(Hem_NS, Hem_EW);
run;

proc print data=storm_new(obs=20);
    title '1) with 1 character variables, cat function not different';
    var basin name hem_NS hem_ew hemisphere use_cat use_catt;
run;
And these are the results (showing only a few obs):
You are correct that the CAT? functions allow you to join strings and they allow you to join both numeric and character variables without getting any messages about the conversion in the log. There are other concatenation operators you could use, but then you are responsible for trimming off the leading/trailing spaces on your own. The documentation has a good example of the equivalent using other operators and functions:
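As a rough sketch of what such an equivalent looks like (an illustration written for this thread, not the documentation's own example), the step below builds the same results with the || operator and STRIP. The variables a and b and their values are hypothetical:

```sas
data _null_;
   length a b $ 5;
   a = '  x';   /* padded with leading and trailing blanks */
   b = ' y';

   /* CATS versus doing the trimming yourself */
   with_cats     = cats(a, b);
   with_operator = strip(a) || strip(b);

   /* CATX versus trimming and inserting the delimiter yourself */
   with_catx      = catx('-', a, b);
   with_operator2 = strip(a) || '-' || strip(b);

   /* 1 means the two approaches produced the same value */
   same_cats = (with_cats = with_operator);
   same_catx = (with_catx = with_operator2);

   put with_cats= with_operator= same_cats=;
   put with_catx= with_operator2= same_catx=;
run;
```

For character arguments like these, both comparisons should come out as 1. The practical difference shows up with numeric arguments, where the || operator writes a conversion note to the log and the CAT? functions do not.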
Hope this helps, 
Cynthia
