I am creating a data step to read a file dump. All variables are straightforward except for one containing space information. Examples of the data are:
121.440K
50956.360K
1439.280M
The field is 11 bytes long with a size identifier as the last character (K=kilobytes, M=megabytes, etc.). I need to convert these values to a common denominator (a numeric field representing bytes) and remove the size identifier character. My attempt so far, with what I know (the SPACE variable is the one I'm referring to in this post):
DATA FDREPORT;
  INFILE INFILE;
  INPUT @1  VOLSER  $6.
        @8  DSNAME  $44.
        @53 ARCYEAR 4.
        @58 ARCJUL  3.
        @62 SPACE   $11.
        @74 CATALOG $3.
        @78 EXPYEAR 4.
        @83 EXPJUL  3.
        @87 RUNYEAR 4.
        @92 RUNJUL  3.;
  IF _N_ = 1 THEN DELETE;
  IF INDEXC(SPACE,'K') THEN DO;
    SPACE=TRANSLATE(SPACE,'','K');
    NUMBER=INPUT(SPACE,7.3)*1024;
  END;
RUN;
This "seems" to be working from what I've run so far. So my questions are:
No suggestions on how to "better" read that somewhat odd numeric format.
By default SAS numeric variables are length 8, i.e. 8 bytes. That limits the number of significant digits that can be stored. You may want to check if the storage range needs to be considered for your application as this may be an issue with decimal portions of values.
Somewhat operating system dependent.
Length in Bytes | Largest Integer Represented Exactly | Exponential Notation | Significant Digits Retained |
---|---|---|---|
3 | 8,192 | 2^13 | 3 |
4 | 2,097,152 | 2^21 | 6 |
5 | 536,870,912 | 2^29 | 8 |
6 | 137,438,953,472 | 2^37 | 11 |
7 | 35,184,372,088,832 | 2^45 | 13 |
8 | 9,007,199,254,740,992 | 2^53 | 15 |
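To make the table concrete, here is a minimal sketch of the limit for the default length of 8. The variable names (big, next, bytes) are made up for illustration, and the sample value is simply the megabyte example from the question. It suggests that byte counts of this size sit far below 2**53, although fractional parts such as .28 are still stored as binary approximations.

```sas
data _null_;
  /* 2**53 is the largest integer a length-8 numeric holds exactly */
  big  = 2**53;
  next = big + 1;   /* 2**53 + 1 has no exact representation */
  if next = big then put 'Above 2**53, adjacent integers are no longer distinct.';

  /* Hypothetical check using a value from the question: 1439.280M in bytes */
  bytes = 1439.280 * 1024**2;
  put bytes= comma20.2;
run;
```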
Is there any specific way from the data that your "summary" is indicated in the file? For instance if the "summary" line starts with the word "Total" you can check the input buffer with something like:
input @;
If _infile_ =: 'Total' then input; /* this would be the summary line */
else do;
  input <your existing input statement>
  <other code executed for "valid" data>
  output; /* this output means only data from valid records gets output to the data set */
end;
The input @ holds the input buffer. _infile_ is an automatic variable that SAS creates to hold the entire current input line of data, so you can examine it.
You may want to provide an actual example of what the last 3 or 4 rows of your infile look like. Copy them using a plain text editor and paste the copied text into a code box opened on the forum using the </> icon to preserve the formatting of the text.
It would take some time for me to get this right, but could you do something like:
Read in the values as character;
Use REVERSE to get the alphabetic character at the front
Use SUBSTR to get the alphabetic character in one variable (var_1)
Use SUBSTR with no length argument to get the numeric characters in another variable (var_2)
Use REVERSE on var_2 to restore the original ordering
Use INPUT (var_2, best.) to get the mantissa into numeric format (var_3)
Define var_4 based on var_1 (k=1024, etc.)
Multiply var_3 by var_4 to get var_5
Add them all up.
Don't know if this would also solve the issue of what is in the last two lines, but you could probably write some easy trapping code involving var_1 and var_2 to eliminate those records.
SteveDenham
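The steps above can be sketched roughly as follows. This is only an illustration under assumptions: the data set name want, the helper variable rev, and the in-stream sample rows are made up here, the multipliers assume 1024-based units, and the final PROC MEANS is just one way to "add them all up".

```sas
data want;
  length space $11 rev $11 var_1 $1 var_2 $10;
  input space $11.;

  /* REVERSE turns trailing blanks into leading blanks, so left-align afterwards */
  rev   = left(reverse(space));
  var_1 = upcase(substr(rev, 1, 1));        /* size identifier, e.g. K or M   */
  var_2 = reverse(strip(substr(rev, 2)));   /* digits restored to their order */

  /* ?? suppresses error messages when var_2 is not a valid number */
  var_3 = input(var_2, ?? best12.);

  select (var_1);
    when ('K') var_4 = 1024;
    when ('M') var_4 = 1024**2;
    when ('G') var_4 = 1024**3;
    otherwise  var_4 = .;                   /* unknown identifier */
  end;

  /* Simple trap: drop records that do not parse, such as summary lines */
  if missing(var_3) or missing(var_4) then delete;

  var_5 = var_3 * var_4;                    /* size in bytes */
  datalines;
121.440K
50956.360K
1439.280M
;

proc means data=want sum;
  var var_5;
run;
```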
Convert the beginning part to a number and then multiply by the right power. Are the numbers using 1024 or 1000 as the units?
data have;
input space $11.;
size=inputn(space,cats('comma',length(space)-1,'.'));
select (char(space,length(space)));
when ('K') size=size*1024;
when ('M') size=size*1024**2;
when ('G') size=size*1024**3;
otherwise ;
end;
cards;
121.440K
50,956.360K
1439.280M
123.45G
1245B
;
Obs    space          size
1      121.440K       124354.56
2      50,956.360K    52179312.64
3      1439.280M      1509194465.28
4      123.45G        132553428172.80
5      1245B          1245.00
Tom, these are mainframe storage numbers so I'm pretty sure they will be multiplied by 1024 due to binary storage fields. Thank you for your input and I believe this will help me along nicely.... if I can interpret it! 😁
Tom,
How would you suggest I combine a YYYY variable with another variable holding JJJ (Julian day) into one recognized date field?
Var1 4., /*YYYY*/
Var2 3. /*JJJ*/
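One possible approach, sketched here with made-up sample values, is either to add the day-of-year offset to January 1 of the year, or to build a yyyyddd string and read it with the JULIAN7. informat. The names date1 and date2 are hypothetical, and Var1 and Var2 are assumed to be numeric as in the layout above.

```sas
data _null_;
  Var1 = 2024;   /* YYYY (sample value) */
  Var2 = 60;     /* JJJ, day of the year (sample value) */

  /* Option 1: January 1 of the year plus the day-of-year offset */
  date1 = mdy(1, 1, Var1) + Var2 - 1;

  /* Option 2: build a yyyyddd string and read it with the JULIAN informat */
  date2 = input(cats(Var1, put(Var2, z3.)), julian7.);

  format date1 date2 date9.;
  put date1= date2=;   /* day 60 of 2024 is 29 February */
run;
```

Both options should give the same SAS date value. The first avoids the character conversion, while the second reports an invalid-data note if the day number is out of range for the year.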