BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
cosmid
Lapis Lazuli | Level 10

Hi,

 

If I have a dataset with a CHAR variable named YR that has values in YYYYMM, like 202101 to 202112 and I wanted to modify this dataset so that the YYYY will be 2023, 202301 to 202312. Also, this dataset is indexed.

 

I used the following code:

proc sql;
  update mydata
  set substr(YR, 1, 4) = '2023';
quit;

For some reason, I keep getting an error of Expecting =. 

 

Question 1:

What's the proper way to use the substr function in PROC SQL?

 

Question 2:

What is the correct method for modify this variable in a DATA step without losing the index?

 

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One example with actual data values

data have;
  input YR $;
datalines;
199901
199902
202109
;


proc sql;
   update have
   set yr= cats('2023',substr(yr,5));
quit;
 

Why: From documentation from Proc SQL

 

Set COLUMN=   not Set <some function call>= . You need to use the variable name as the target of the SET.

 

As for changing the value of a variable without changing the index think of what an INDEX is: as short cut for finding specific values. If you want your data set to be indexed in file order add another variable and make that the index. If you want to use this variable then reindex on the new values.

 

Now for the maybe more important part: WHY is a date value in a character variable? You lose all of the date functions and have to do gyrations with odd code like this.

 

 

 

View solution in original post

3 REPLIES 3
ballardw
Super User

One example with actual data values

data have;
  input YR $;
datalines;
199901
199902
202109
;


proc sql;
   update have
   set yr= cats('2023',substr(yr,5));
quit;
 

Why: From documentation from Proc SQL

 

Set COLUMN=   not Set <some function call>= . You need to use the variable name as the target of the SET.

 

As for changing the value of a variable without changing the index think of what an INDEX is: as short cut for finding specific values. If you want your data set to be indexed in file order add another variable and make that the index. If you want to use this variable then reindex on the new values.

 

Now for the maybe more important part: WHY is a date value in a character variable? You lose all of the date functions and have to do gyrations with odd code like this.

 

 

 

cosmid
Lapis Lazuli | Level 10
Hi ballardw, thanks for the explanation!

Regarding the index question, I was asking if there's a way using DATA step to modify a variable's value without LOSING the index associated with that dataset. I understand that when updating a dataset, the associated index file will also be updated. I tried the following code it'll delete the index file afterwards:
data have;
set have;
substr(YR,1,4) = '2023';
run;
Assuming that the dataset HAVE is indexed.

For the question regarding why the YR variable is in CHAR form and not in NUM, well, I didn't create the original dataset and just need to work on it as is. I totally agree with you that it shouldn't be in CHAR.
Tom
Super User Tom
Super User

To update data in place look at the MODIFY statement.

 

To make a NEW dataset with the same index(es) then just specify the index(es) in the DATA statement.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 3 replies
  • 733 views
  • 2 likes
  • 3 in conversation