BookmarkSubscribeRSS Feed
collegestudent2
Fluorite | Level 6

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;

1 REPLY 1
Reeza
Super User
  • How big is the data set? 10-20 million of rows is very different than processing something that's a few hundred million rows. 
  • I don't think the code you've posted is correct or works at all, does it? I suspect things like "Branch ID" are a single field, which should then be BranchID or Branch_ID depending on your preference. You can have it named as "Branch ID" if you really want but then need to list it as 'Branch ID'n in the INPUT statement. I suspect BRANCH NAME should be a character field as well? Should it be Data or Date?
  • Depending on the answer to my first question you can either just add a WHERE statement to filter it out, but it may be better to read it all once, split it by BranchID if necessary into smaller sets and go from there. 
  • If you need to eventually process all the data, instead I would recommend learning how to use the OBS= option to limit your data when testing and then removing it for actually running your process. You can also add an OBS= observation on the INFILE statement to process only a few records while you sort out the data step to read the data and then remove it once you're set. 

 

 

DATA transaction; 
INFILE 'C:/MyRawData/BankTrans.csv' DLM = ',' DSD LRECL = 1500 OBS=50; INPUT BranchName $ BranchID $ TransID $ Account $ Date MMDDYY8. StartTime Time8. EndTime Time8. AmountBalance; IF BRANCHID = '12345'; RUN;

@collegestudent2 wrote:

How do I modify the Data step below to decrease the processing time by specifying the records of a specific branch instead of all branches? 

 

DATA transaction; INFILE ’C:/MyRawData/BankTrans.csv’ DLM = ’,’ LRECL = 1500;
INPUT Branch Name Branch ID Trans ID Account Data MMDDYY8. Start Time Time8.
End Time Time8. Amount Balance;
RUN;


EDIT: I changed the input code to read in fields like BRANCHID as characters which is the usual standard. 

 

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 1 reply
  • 497 views
  • 0 likes
  • 2 in conversation