WFC2013
Fluorite | Level 6

Hi ,

I am new to SAS programming. Below is my code, which uses a SET statement in a new data set, but when I run it, only one record is read in the new data set. When I combine the code from the two data steps into one, it works. But I need it to be done in two separate data steps.

 

Code :

DATA DATE_CHK;                           
LENGTH TEMP_DATE 8 CURRENT_DAY 3;         
END_DATE = TODAY();                      
CURRENT_DAY = DAY(TODAY());              
TEMP_DATE=INTNX('MONTH',END_DATE,-3);    
START_DATE=TEMP_DATE + CURRENT_DAY - 1 ; 
RUN; 

 

data g1;
Set date_chk ;
input age;
datalines;
12 
234 
14 
;
run;

 

Output (it is missing 234 and 14 in AGE):

Obs  TEMP_DATE  CURRENT_DAY  END_DATE  START_DATE  age

  1      21185            5     21279       21189   12

 

Please suggest.

1 ACCEPTED SOLUTION

Accepted Solutions
Astounding
PROC Star

Here's the culprit:

 

merge date_chk ;

 

Change it to this:

 

if _n_=1 then set date_chk ;

 

WFC2013
Fluorite | Level 6
Sorry, my bad: I have used SET DATE_CHK, not MERGE DATE_CHK. But it is still reading only one record.

I will try the code if _n_=1 then set date_chk .
Reeza
Super User

@WFC2013 Are you trying to add records to the end of a data set then?

WFC2013
Fluorite | Level 6
Thanks a lot, it is working. But what does _n_ = 1 do? Can you explain, please?
Astounding
PROC Star

As you have seen from @Tom's explanation, this is not the simplest concept in the world.  I will try to explain, but you may need to search the documentation for some examples.

 

The DATA step executes by leaving the DATA statement, and executing the remaining statements in the data step.  It actually does this as many times as needed, until the incoming data sources run out of observations.

 

_N_ is an automatic counter that counts how many times the DATA step has left the DATA statement to execute the remaining statements.  So the IF/THEN limits the SET statement to executing on just the first time that this happens.  The IF/THEN also prevents the DATA step from "thinking" that it should end because the SET statement has run out of observations to read in.
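
As a minimal sketch, here is that fix applied to the code from the question. It assumes DATE_CHK was already created by the first step and holds exactly one observation. Because SET executes only on the first iteration, the step ends only when INPUT runs out of data lines. Variables read with SET are retained automatically, so the date values should appear on all three observations of G1.

```sas
data g1;
  /* Read the single DATE_CHK observation on the first iteration only. */
  /* Variables read by SET are retained, so they carry to later rows.   */
  if _n_ = 1 then set date_chk;
  input age;
datalines;
12
234
14
;
run;

proc print data=g1;
run;
```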

Reeza
Super User

I don't think you're using 8.2, unless it's SAS Viya 8.2

 

8.2 would be older than me and I'm not that young anymore 🙂

 

You can check your version using:

 

proc product_status;run;

You should see something like:

For Base SAS Software ...
Custom version information: 9.4_M3

 

Personally, I like to separate my steps, so I read in all the data and then combine them. This makes it easier to see where things are going wrong. In ten years, I've never merged in a data set while reading in the file 🙂
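
A rough sketch of that "read first, combine later" approach, using the data from the question. The data set name AGES is made up for illustration, and DATE_CHK is assumed to be the one-observation data set built by the first step in the question.

```sas
/* Step 1: read the raw values into their own data set (AGES is a hypothetical name). */
data ages;
  input age;
datalines;
12
234
14
;
run;

/* Step 2: attach the one DATE_CHK row to every AGES observation. */
data g1;
  if _n_ = 1 then set date_chk;  /* executed once; its values are retained */
  set ages;                      /* the step ends when AGES runs out       */
run;
```

With the steps split like this, each intermediate data set can be printed and checked on its own before the final combine.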

WFC2013
Fluorite | Level 6
Hi, I work on mainframe SAS. The version is
SAS (r) Proprietary Software Release 8.2 (TS2M0).
Thanks
Tom
Super User

Wow!  That version is 16 years out of date!  https://blogs.sas.com/content/iml/2013/08/02/how-old-is-your-version-of-sas-release-dates-for-sas-so...

 

But the functionality in this thread is basic stuff that would work if you were running SAS from 40 years ago.

Just explain what you want to do. What is your existing SAS dataset?  What do you want to create?

ballardw
Super User

What should the data look like when you are done?

 

Perhaps

 

data g1;
   LENGTH TEMP_DATE 8 CURRENT_DAY 3;         
   END_DATE = TODAY();                      
   CURRENT_DAY = DAY(TODAY());              
   TEMP_DATE=INTNX('MONTH',END_DATE,-3);    
   START_DATE=TEMP_DATE + CURRENT_DAY - 1 ; 

   input age;
datalines;
12 
234 
14 
;
run;
WFC2013
Fluorite | Level 6
I don't want to combine the two data sets.
Tom
Super User

The SAS data step iterates multiple times. It is simplest to think about it as making one iteration per observation. But in reality it is essentially an infinite loop, and most SAS data steps don't end at the bottom of the code but in the middle, when a statement reads past the end of the input data.

In your case the step reads past the end of the old dataset (the one read with the SET statement) before it reads past the end of the in-line data in the CARDS (DATALINES) block.  So you get one observation, since the first data step generates only one observation. (In that first step, SAS is smart enough to notice that the step does not read in any data and so stops after just one iteration.)
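
One way to watch this happen is to write _N_ to the log at the top of the step. This is only a sketch built on the code from the question, with a PUT statement added for tracing; it assumes DATE_CHK has one observation.

```sas
data g1;
  put 'Starting iteration ' _n_;  /* trace line written to the SAS log */
  set date_chk;                   /* on iteration 2 there is nothing left to read, */
                                  /* so the step stops here                        */
  input age;
datalines;
12
234
14
;
run;
```

The log should show the trace line for iterations 1 and 2, while G1 ends up with a single observation: the second iteration starts, but it ends at the SET statement before INPUT can read 234.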

 

Perhaps you really meant to just have one data step?

DATA DATE_CHK;
  LENGTH TEMP_DATE 8 CURRENT_DAY 3;
  END_DATE = TODAY();
  CURRENT_DAY = DAY(TODAY());
  TEMP_DATE=INTNX('MONTH',END_DATE,-3);
  START_DATE=TEMP_DATE + CURRENT_DAY - 1 ;
  input age;
datalines;
12 
234 
14 
;
WFC2013
Fluorite | Level 6

Hi ,

 

When I include all the code in one data step, it takes more CPU. But when I split the code into two data steps, it takes half of the CPU used earlier. That's why I am calculating the start date and end date in the first data set and using those variables in the second data set.

The resolution provided, if _N_ = 1, works.

 

Thanks for Responding.

Tom
Super User

You are probably seeing the CPU usage difference because the single step recalculates the constant variables for every record read from the text.  You can get a similar effect using RETAIN in a single data step.  Here is a simplified version to show the idea: the variable TODAY is calculated just once, and its value then stays the same for each observation without calling the TODAY() function again.

data want ;
  if _n_=1 then do;
      today=today();
      retain today ;
  end;
  input age ;
cards;
1
2
3
;

Normally you wouldn't see much difference in CPU usage for such simple assignment statements.  But perhaps the TODAY() function is CPU intensive on your system?  Or you might be reading a lot of observations. Even a fraction of a millisecond adds up when multiplied by a million repetitions.

WFC2013
Fluorite | Level 6

Hi Tom,
You are correct, the above code gives the same effect, but the CPU usage is still not going down.
I have an input file of 129632 records that needs to be split into two files on the basis of the start date and end date. I don't want to add those dates to the input observations; I only want to use them for the split.

 

Please suggest.

Code :

DATA DATE_CHK;                                                        
IF _N_ = 1 THEN DO;                                                   
END_DATE = TODAY();                                                   
RETAIN END_DATE;                                                      
CURRENT_DAY = DAY(TODAY());                                           
TEMP_DATE=INTNX('MONTH',END_DATE,-3);                                 
START_DATE=TEMP_DATE + CURRENT_DAY - 1;                               
RETAIN START_DATE ;                                                   
END;                                                                  
                                                                      
INFILE INPUT1;   /* FILE HAS 129632 RECORDS */
INPUT @9 CODEDT $CHARZB10.;                                           
CODE_DATE=INPUT(CODEDT,YYMMDD10.);                                    
FILE ACTIVE;                                                          
IF CODE_DATE >= START_DATE AND CODE_DATE <= END_DATE THEN PUT _INFILE_;
FILE INACTIVE;         

IF CODE_DATE <  START_DATE THEN PUT _INFILE_;
RUN;                                                                                        
