Hi there,
I am struggling (and I don't know why) to extract the first 4 digits and the last two digits of a variable.
Here's the variable :
%let ym = 201910;I tried this command and this is not working. I don't know what I do wrong...
data _null_;
test1 = substr(&ym., 1, 4);
test2 = substr(&ym., 4, 2);
run;
Maybe it's because we're Friday...
Any help would be very appreciated.
Thanks,
SUBSTR() works on character strings. But SAS will happily convert your number 201,910 into a character string for you. The problem is it does that using the BEST12. format which will right align the result. So since your value only needs 6 digits the first 6 characters are all spaces.
Convert your number to a character string. That is easy to do in SAS when working with constants by just putting quotes around the digits. Examples:
test1 = substr("201910",1,4);
test1 = substr("&ym",1,4);If you really have a numeric variable you can use the PUT() to convert it to a character string in the right format. Examples:
test1=substr(put(X,6.),1,4);
test1=substr(left(put(X,20.)),1,4);
test1=substr(put(X,20.-L),1.4);
Or if you want numbers as the result then just use arithmetic. Examples:
year=int(201910/100);
month=mod(201910,100);
%let ym = 201910;
 
data want;
test1 = substr("&ym.", 1, 4);
test2 = substr("&ym.", 5, 2);
run;SUBSTR() works on character strings. But SAS will happily convert your number 201,910 into a character string for you. The problem is it does that using the BEST12. format which will right align the result. So since your value only needs 6 digits the first 6 characters are all spaces.
Convert your number to a character string. That is easy to do in SAS when working with constants by just putting quotes around the digits. Examples:
test1 = substr("201910",1,4);
test1 = substr("&ym",1,4);If you really have a numeric variable you can use the PUT() to convert it to a character string in the right format. Examples:
test1=substr(put(X,6.),1,4);
test1=substr(left(put(X,20.)),1,4);
test1=substr(put(X,20.-L),1.4);
Or if you want numbers as the result then just use arithmetic. Examples:
year=int(201910/100);
month=mod(201910,100);
Another option is to use the SUBSTRN function instead of the SUBSTR function, as I've learned from @Reeza only last week:
test1 = substrn(&ym., 1, 4); test2 = substrn(&ym., 5, 2);
(I had always thought the "N" in the name stands just for "null string," but it can also be a reminder of "numeric arguments allowed.")
The result is still a character string, though. There's also a pitfall if you haven't defined the length of the character variable receiving the result: "that variable is given the length of the first argument," says the documentation, but this means length 8 for any numeric constant and <=8 for a numeric variable as the first argument, regardless of the length specified in the third argument.
Yikes. That is going to cause confusion if used with with %SYSFUNC().
How will it know whether the input string is to be treated as a number or a string?
Example:
1811 %let str=123E4 ; 1812 %put str=|&str| substr=|%substr(&str,1)| substrn=|%sysfunc(substrn(&str,1))|; str=|123E4| substr=|123E4| substrn=|1230000|
I will probably stick with my macro version instead.
%macro substrn
/*----------------------------------------------------------------------
Subset string (simulation of SUBSTRN function)
----------------------------------------------------------------------*/
(string   /* String to subset */
,position /* Start position */
,length   /* Length of substring */
);
%local slen start end;
%*----------------------------------------------------------------------
Get length of string. Calculate start and end positions within string.
----------------------------------------------------------------------;
%let slen=%length(&string);
%let start=%sysfunc(max(&position,1));
%if ^%length(&length) %then %let end=&slen;
%else %let end=%sysfunc(min(&slen,&length+&position-1));
%*---------------------------------------------------------------------
Use SUBSTR to return part of string requested.
----------------------------------------------------------------------;
%if (&start <= &slen) and (&start <= &end) %then
  %substr(&string,&start,&end-&start+1)
;
%mend substrn;
@Tom wrote:
Yikes. That is going to cause confusion if used with with %SYSFUNC().
How will it know whether the input string is to be treated as a number or a string?
Agreed. Interesting that it seems to prioritize the numeric interpretation.
Quote from the documentation:
"If you call SUBSTRN by using the %SYSFUNC macro, then the macro processor resolves the first argument (string) to determine whether the argument is character or numeric. If you do not want the first argument to be evaluated as a macro expression, use one of the macro-quoting functions in the first argument."
However, none of the macro quoting functions (applied in the definition of STR or in the first argument of SUBSTRN or a combination of both) prevents that numeric evaluation and hence the result 1230000.
That's what comes of softening the distinction between numeric and character functions. (Same with CAT, etc.).
Hi @jpprovost
Your macro variable ym seems to contain a year-month value, and it might be a better idea to convert the value to a SAS date value and use the provided SAS functions to extract the year and month.
I always prefer to use SAS date and time values instead of numeric literals, because they can be presented with many different formats, and it is so easy to manipulate them with SAS functions working on date and time values.
Examples:
%let ym = 201910;
data _null_; 
	date = input("&ym",yymmn6.);
	year = year(date);
	month = month(date);
	put year= / month=;
run;
%let ym = 201910;
%let date = %sysfunc(inputn(&ym,yymmn6.));
data _null_; 
	year = year(&date);
	month = month(&date);
	put year= / month=;
run;Both gives the same result:
year=2019
month=10
This gives the possibility to make computations, e.g. compute the 6. month after without problems, when the result is in another year:
%let ym = 201910;
data _null_; 
	ym = input("&ym",yymmn6.);
	half_year_after = intnx('month',ym,6);
	put ym= yymmn6./ half_year_after= yymmn6.;
run;result:
ym=201910
half_year_after=202004
regards
Erik
Every solution seems to work, I will accept yours because the &ym macro is effectively a month-year variable.
I have a subquestion.
Let's say that :
ym = '201910'd
Is there a way to extract the string "201910" from this variable with this particular format?
Hi @jpprovost
You can't specify 201910 as a date constant, because a date constant must incude a value for the day as shown here:
44   data _null_;
45       ym = '201910'd;
              ---------
              77
ERROR: Invalid date/time/datetime constant '201910'd.
ERROR 77-185: Invalid number conversion on '201910'd.
46   run;
NOTE: The SAS System stopped processing this step because of errors.
...
47
48   data _null_;
49       ym = '01oct2019'd;
50   run;
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
if you have 201910 as a string or want to specify it as a constant, it is necessary to use an input function with the informat yymmn6. The format yymmn6. works both as informat and format, so you can go from the string 201910 to a SAS date value representing 01oct2019 and from the SAS date value back to the string 201910 as shown here:
data _null_;
	format ym ddmmyyd10.;
	ym = input('201910',yymmn6.);
	put ym=;
	str_ym = put(ym,yymmn6.);
	put str_ym=;
run;
The result is:
ym=01-10-2019
str_ym=201910
regards
Erik
@ErikLund_Jensen wrote:
Hi @jpprovost
You can't specify 201910 as a date constant, because a date constant must incude a value for the day as shown here:
44 data _null_; 45 ym = '201910'd; --------- 77 ERROR: Invalid date/time/datetime constant '201910'd. ERROR 77-185: Invalid number conversion on '201910'd. 46 run; NOTE: The SAS System stopped processing this step because of errors. ...
47 48 data _null_; 49 ym = '01oct2019'd; 50 run; NOTE: DATA statement used (Total process time): real time 0.00 secondsif you have 201910 as a string or want to specify it as a constant, it is necessary to use an input function with the informat yymmn6. The format yymmn6. works both as informat and format, so you can go from the string 201910 to a SAS date value representing 01oct2019 and from the SAS date value back to the string 201910 as shown here:
data _null_; format ym ddmmyyd10.; ym = input('201910',yymmn6.); put ym=; str_ym = put(ym,yymmn6.); put str_ym=; run;The result is:
ym=01-10-2019
str_ym=201910
regards
Erik
Ok, I understand.
So if I use ym = '01OCT2019'd is there a way to extract the string "OCT2019" from the variable?
Hi @jpprovost
Oh yes.
The date literal '01oct2019'd gives a SAS date value, which is the number of days since jan 1, 1960.
The number can be presented in any possible way of writing a date by applying a format. It is still a number, but it is shown or printed as specified by the format, see below.
The number can also be translated to a string value containing any of these formats with the put function, where the number is "written" to a character variable using the format. This is how we get 01OCT2019 from the number.
The character variable can be manipulated with SAS string functions. A substring from the 3. character gives OCT2019.
data _null_;
	date = '01oct2019'd;
	put 'Internal representation as number:   ' date;
	put 'Still a number, but shown formatted: ' date yymmn6. ' ' date mmddyy10. ' ' date date9.;
	cdate = put(date,date9.);
	put 'Converted to string variable:        ' date;
	cdate = substr(put(date,date9.),3);
	put 'Substring of string variable:        ' cdate;
run;
result:
Internal representation as number: 21823
Still a number, but shown formatted: 201910 10/01/2019 01OCT2019
Converted to string variable: 01OCT2019
Substring of string variable: OCT2019
regards
Erik
There is a format that will display the data in that form.
2318 %let ym = '10-OCT-2019'd; 2319 %let ym2 = %sysfunc(putn(&ym,monyy7)); 2320 %put &=ym &=ym2; YM='10-OCT-2019'd YM2=OCT2019
@jpprovost wrote:
Every solution seems to work, I will accept yours because the &ym macro is effectively a month-year variable.
I have a subquestion.
Let's say that :
ym = '201910'd
Is there a way to extract the string "201910" from this variable with this particular format?
That is a poor format to use since it kind of looks like a date literal, but it isn't. Data literals require that the quoted string can be read by the DATE informat.
You could use %SCAN() to remove the stuff between the quotes.
2315 %let ym = '201910'd; 2316 %let ym2 = %scan(&ym,1,''); 2317 %put &=ym &=ym2; YM='201910'd YM2=201910
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
