BookmarkSubscribeRSS Feed
collegestudent2
Fluorite | Level 6

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;

1 REPLY 1
Reeza
Super User
  • How big is the data set? 10-20 million of rows is very different than processing something that's a few hundred million rows. 
  • I don't think the code you've posted is correct or works at all, does it? I suspect things like "Branch ID" are a single field, which should then be BranchID or Branch_ID depending on your preference. You can have it named as "Branch ID" if you really want but then need to list it as 'Branch ID'n in the INPUT statement. I suspect BRANCH NAME should be a character field as well? Should it be Data or Date?
  • Depending on the answer to my first question you can either just add a WHERE statement to filter it out, but it may be better to read it all once, split it by BranchID if necessary into smaller sets and go from there. 
  • If you need to eventually process all the data, instead I would recommend learning how to use the OBS= option to limit your data when testing and then removing it for actually running your process. You can also add an OBS= observation on the INFILE statement to process only a few records while you sort out the data step to read the data and then remove it once you're set. 

 

 

DATA transaction; 
INFILE 'C:/MyRawData/BankTrans.csv' DLM = ',' DSD LRECL = 1500 OBS=50; INPUT BranchName $ BranchID $ TransID $ Account $ Date MMDDYY8. StartTime Time8. EndTime Time8. AmountBalance; IF BRANCHID = '12345'; RUN;

@collegestudent2 wrote:

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;


EDIT: I changed the input code to read in fields like BRANCHID as characters which is the usual standard. 

 

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 512 views
  • 0 likes
  • 2 in conversation