ACR7.957.80
MGI4.754.00
BLD112.25109.75
CFP9.659.25
MAL8.258.10
CM45.9045.30
AZC1.991.93
CMW20.0019.00
AMZ2.702.30
GAC52.0050.25
Hello All - I am a novice in SAS trying to get my hands wet with coding. Above is the data from my Notepad file, which I want to convert into a SAS dataset. The first 3 letters go into the variable Prod.code (line 6 has only CM, just 2 letters), followed by two numbers with 2 decimals each. The first number goes under the "msrp" variable and the second under "holidayprice".
Please provide some directions/leads on how to approach this problem.
Thanks
Using a regular expression can solve the task. Have a look at the functions prxparse, prxmatch and prxposn.
data work.want;
    length ProdCode $ 3 msrp holidayprice 8 _rx 8;
    format msrp holidayprice 10.2;
    retain _rx;
    drop _rx;

    /* Compile the pattern once: letters, then two numbers with two decimals each */
    if _n_ = 1 then do;
        _rx = prxparse('/([A-Z]+)(\d+\.\d\d)(\d+\.\d\d)/');
    end;

    /* Read the raw line into the _infile_ buffer */
    input;

    if prxmatch(_rx, strip(_infile_)) then do;
        ProdCode = prxposn(_rx, 1, strip(_infile_));
        msrp = input(prxposn(_rx, 2, strip(_infile_)), best32.);
        holidayprice = input(prxposn(_rx, 3, strip(_infile_)), best32.);
    end;

    datalines;
ACR7.957.80
MGI4.754.00
BLD112.25109.75
CFP9.659.25
MAL8.258.10
CM45.9045.30
AZC1.991.93
CMW20.0019.00
AMZ2.702.30
GAC52.0050.25
;
run;
Thank you for the reply. You've introduced me to new functions to deal with such a task. On the same lines, what suggestions would you give me to do well at coding in SAS? What kind of approach do I need (I don't have a computer science background)?
A simple compress() function should suffice here (updated, as I didn't see the three-variable requirement the first time round):
data want (drop=line);
    infile datalines;
    length prod_code $3 msrp 8 holidayprice 8 line $200;
    input line $;

    /* Remove blanks, periods and digits ("d"), leaving only the letters */
    prod_code = compress(line, " .", "d");

    /* Remove blanks and alphabetic characters ("a"), leaving only the two prices */
    line = compress(line, " ", "a");

    /* First price ends two positions after the first period */
    msrp = input(substr(line, 1, index(line, ".") + 2), best.);

    /* Second price starts three positions after the first period */
    holidayprice = input(substr(line, index(line, ".") + 3), best.);

    datalines;
ACR7.957.80
MGI4.754.00
BLD112.25109.75
CFP9.659.25
MAL8.258.10
CM45.9045.30
AZC1.991.93
CMW20.0019.00
AMZ2.702.30
GAC52.0050.25
;
run;
Thank you for the reply. The logic you provided is good for gaining a deeper understanding of the usual concepts. On the same lines, what suggestions would you give me to do well at coding in SAS? What kind of approach do I need (I don't have a computer science background)?
Since your data does not contain any field delimiters, you need to apply your own rules to separate the three data values from each other. The two split points are:
* Before the first nonalphabetic character (use e.g. the NOTALPHA function to find that position)
* Two positions after the first decimal period (use e.g. the INDEX function to find that position)
Then you can create your three variables by e.g. using the SUBSTR function.
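The two rules above can be sketched roughly as follows. This is only an illustration of the NOTALPHA / INDEX / SUBSTR idea, not a definitive solution: the dataset name work.want and the helper variables p_num and p_dot are made up for this example, and it assumes every line has letters first and exactly two decimals on the first price.

```sas
data work.want (drop=line p_num p_dot);
    length prod_code $3 msrp holidayprice 8 line $200;
    format msrp holidayprice 10.2;
    input line $;

    /* Rule 1: position of the first non-alphabetic character (hypothetical helper p_num) */
    p_num = notalpha(line);

    /* Rule 2: position of the first decimal period (hypothetical helper p_dot) */
    p_dot = index(line, '.');

    /* Letters before the first digit form the product code */
    prod_code = substr(line, 1, p_num - 1);

    /* First price runs from the first digit to two positions after the period */
    msrp = input(substr(line, p_num, p_dot + 2 - p_num + 1), best32.);

    /* Second price is everything after that */
    holidayprice = input(substr(line, p_dot + 3), best32.);

    datalines;
ACR7.957.80
BLD112.25109.75
CM45.9045.30
;
run;
```

Because the split points are computed per line, the two-letter code CM and the three-digit price 112.25 need no special handling.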
Thank you for the reply. All the answers provided here, including yours, have shown me how I should think. On the same lines, what suggestions would you give me to do well at coding in SAS? What kind of approach do I need (I don't have a computer science background)?
You've already got good answers on how to read your data.
If what you've posted is all you need to read into SAS, and not just a small subset of your real data, then I would manually reshape the data in Notepad into a form that is easy to read with SAS (e.g. with the values separated by commas).
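For illustration, a minimal sketch of what reading the hand-edited file could look like. The commas in the data lines below are assumed to have been added manually in Notepad; they are not in the original post, and work.want is just an example dataset name.

```sas
data work.want;
    /* DSD sets the default delimiter to a comma */
    infile datalines dsd;
    length prod_code $3 msrp holidayprice 8;
    format msrp holidayprice 10.2;
    input prod_code $ msrp holidayprice;
    datalines;
ACR,7.95,7.80
BLD,112.25,109.75
CM,45.90,45.30
;
run;
```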
Hi Patrick - I am quite comfortable writing code to read flat files with delimiters. To learn more, I wanted to see what to do when there are no delimiters in the data and how to deal with such situations.
Thanks again -