Hi,
I am new to SAS programming. Below is my code, which uses a SET statement in a new data step, but when I run it, the new data set ends up with only one record. When I combine the code from the two data steps into one, it works, but I need it done in two separate data steps.
Code:
DATA DATE_CHK;
LENGTH TEMP_DATE 8 CURRENT_DAY 3;
END_DATE = TODAY();
CURRENT_DAY = DAY(TODAY());
TEMP_DATE=INTNX('MONTH',END_DATE,-3);
START_DATE=TEMP_DATE + CURRENT_DAY - 1 ;
RUN;
data g1;
Set date_chk ;
input age;
datalines;
12
234
14
;
run;
Output (AGE is missing 234 and 14):
Obs | TEMP_DATE | CURRENT_DAY | END_DATE | START_DATE | age
1 | 21185 | 5 | 21279 | 21189 | 12
Please suggest.
Here's the culprit:
Set date_chk ;
Change it to this:
if _n_=1 then set date_chk ;
@WFC2013 Are you trying to add records to the end of a data set then?
As you have seen from @Tom's explanation, this is not the simplest concept in the world. I will try to explain, but you may need to search the documentation for some examples.
The DATA step executes by leaving the DATA statement and executing the remaining statements in the step. It does this as many times as needed, until one of the incoming data sources runs out of observations.
_N_ is an automatic variable that counts how many times the DATA step has left the DATA statement to execute the remaining statements. So the IF/THEN limits the SET statement to executing only the first time this happens. That prevents the DATA step from "thinking" that it should end because the SET statement has run out of observations to read. Variables read by SET are retained automatically, so their values stay available on every later iteration.
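Putting that together with the original post, here is a minimal sketch of the two-step version with the fix applied. It reuses the data set and variable names from the question; the expectation that G1 ends up with three observations assumes DATE_CHK holds exactly one observation, as it does here.

```sas
/* Step 1: build the one-observation data set holding the dates. */
data date_chk;
  length temp_date 8 current_day 3;
  end_date    = today();
  current_day = day(today());
  temp_date   = intnx('MONTH', end_date, -3);
  start_date  = temp_date + current_day - 1;
run;

/* Step 2: read DATE_CHK only on the first iteration.              */
/* Variables that come from SET are retained automatically, so the */
/* date values are carried onto every AGE record that INPUT reads. */
data g1;
  if _n_ = 1 then set date_chk;
  input age;
  datalines;
12
234
14
;
run;
```

The step now ends when INPUT runs past the last data line, not when SET runs out of observations.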
I don't think you're using 8.2, unless it's SAS Viya 8.2
8.2 would be older than me and I'm not that young anymore 🙂
You can check your version using:
proc product_status;run;
You should see something like:
For Base SAS Software ...
Custom version information: 9.4_M3
Personally I like to separate my steps, so I read in all the data and then combine them. This makes it easier to see where things are going wrong. In ten years, I've never merged in a data set while reading in the file 🙂
@WFC2013 wrote:
Hi I work on Mainframe SAS the version is
SAS (r) Proprietary Software Release 8.2 (TS2M0).
Thanks
Wow! That version is 16 years out of date! https://blogs.sas.com/content/iml/2013/08/02/how-old-is-your-version-of-sas-release-dates-for-sas-so...
But functionality in this thread is basic stuff that would work if you were running SAS from 40 years ago.
Just explain what you want to do. What is your existing SAS dataset? What do you want to create?
What should the data look like when you are done?
Perhaps
data g1;
LENGTH TEMP_DATE 8 CURRENT_DAY 3;
END_DATE = TODAY();
CURRENT_DAY = DAY(TODAY());
TEMP_DATE=INTNX('MONTH',END_DATE,-3);
START_DATE=TEMP_DATE + CURRENT_DAY - 1 ;
input age;
datalines;
12
234
14
;
run;
The SAS data step iterates multiple times. It is simplest to think about it as making one iteration per observation. But in reality it is essentially an infinite loop, and most SAS data steps don't end at the bottom of the code but in the middle, when a statement reads past the end of its input data.
In your case it will read past the end of the old dataset it is reading with the SET statement before it reads past the end of the in-line data in the CARDS (DATALINES) block. So you get one observation, since the first data step generates only one observation. (In that first step, SAS is smart enough to notice that the step does not read in any data, and so it stops after just one iteration.)
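To see this in the log, here is a small tracing sketch. The data set names ONE_ROW and TRACE and the variable X are made up for illustration; the PUT statements only write to the log so you can watch where the step stops.

```sas
/* A one-observation data set, standing in for DATE_CHK. */
data one_row;
  x = 1;
run;

data trace;
  put 'Top of iteration: ' _n_=;
  set one_row;          /* the second execution reads past the end and stops the step */
  input age;
  put 'After INPUT: ' _n_= age=;
  datalines;
12
234
14
;
run;
```

The log should show the second iteration starting but never reaching the INPUT statement, and TRACE should contain a single observation.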
Perhaps you really meant to just have one data step?
DATA DATE_CHK;
LENGTH TEMP_DATE 8 CURRENT_DAY 3;
END_DATE = TODAY();
CURRENT_DAY = DAY(TODAY());
TEMP_DATE=INTNX('MONTH',END_DATE,-3);
START_DATE=TEMP_DATE + CURRENT_DAY - 1 ;
input age;
datalines;
12
234
14
;
Hi,
When I include all the code in one data step, it uses more CPU. When I split the code into two data steps, it uses half the CPU it did before. That's why I am calculating the start date and end date in the first data step and using those variables in the second data step.
The resolution provided, if _N_ = 1, works.
Thanks for responding.
You are probably seeing CPU usage differences because the single step re-calculates the constant variables for every record read from the text. You can get a similar effect using RETAIN in a single data step. Here is a simplified version to show the idea. The variable TODAY is calculated just once, and its value then stays the same for each observation without having to call the TODAY() function again.
data want ;
if _n_=1 then do;
today=today();
retain today ;
end;
input age ;
cards;
1
2
3
;
Normally you wouldn't see much difference in CPU usage for such simple assignment statements. But perhaps the TODAY() function is CPU intensive on your system, or you are reading a lot of observations. Even a fraction of a millisecond adds up when multiplied by a million repetitions.
Hi Tom,
You are correct, the above code gives the same effect, but the CPU usage is still not going down.
I have an input file of 129632 records that needs to be split into two files on the basis of the start date and end date. I don't want to add those dates to the input observations; I only want to use them for the split.
Please suggest.
Code :
DATA DATE_CHK;
IF _N_ = 1 THEN DO;
END_DATE = TODAY();
RETAIN END_DATE;
CURRENT_DAY = DAY(TODAY());
TEMP_DATE=INTNX('MONTH',END_DATE,-3);
START_DATE=TEMP_DATE + CURRENT_DAY - 1;
RETAIN START_DATE ;
END;
INFILE INPUT1 ; /* FILE HAS 129632 RECORDS */
INPUT @9 CODEDT $CHARZB10.;
CODE_DATE=INPUT(CODEDT,YYMMDD10.);
FILE ACTIVE;
IF CODE_DATE >= START_DATE AND CODE_DATE <= END_DATE THEN PUT _INFILE_;
FILE INACTIVE;
IF CODE_DATE < START_DATE THEN PUT _INFILE_;
RUN;