My data file looks like the following (sorry, I can't upload the whole file due to confidentiality).
I need to read the data into SAS starting from the 3rd line; that is, I don't want the first two lines. I tried the following code:
%let path=C:\data\output_1.ITM;
data _null_;
infile "&path." lrecl=200 pad;
input lines $100.;
if index(lines, ';ENTRY') ne 0 then do;
call symput('v1', _n_+1);stop;
end;
run;
data d1;
infile "&path" firstobs=&v1 missover;
input
@04 entry 5.
@08 measure 8.4
@17 status 3.
@20 count 8.0;
run;
Applying the code above, I got something like the following. Why are there missing values in "entry"?
Any help would be much appreciated, thanks!
I am not at all sure why you bother with that data _null_ step when you clearly state that you need to read the data starting from the 3rd line. That would be FIRSTOBS=3.
Or does the first data line vary from file to file?
Show us the log. I suspect you are getting an "invalid data" message of some type. The most likely cause is that you are hard-coding the 5. and other informats on the INPUT statement, which means that Entry is always read as 5 characters. I bet that if you look at your text, the 5th character read from column 4 is the - in the Measure values.
Look at your
@04 entry 5.
@08 measure 8.4
Start at column 4 and read 5 characters, the 5th is at column 8, which is where Measure starts.
So either your informat for entry is incorrect or the start column to read Entry is incorrect, at least occasionally.
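To make the overlap concrete, here is a minimal sketch. It assumes ENTRY sits in columns 1-6 and MEASURE in columns 7-14, which is the layout used in the column-input code elsewhere in this thread; the data set name overlap_demo and the variables entry_bad and entry_ok are made up for illustration.

```sas
/* Sketch only: assumes ENTRY occupies columns 1-6 and MEASURE columns 7-14 */
data overlap_demo;
  infile datalines truncover;
  input @4 entry_bad 5.   /* reads columns 4-8, so it runs into MEASURE   */
        @1 entry_ok  6.   /* reads columns 1-6, ENTRY only                */
        @7 measure   8.;  /* reads columns 7-14, no implied decimals      */
  datalines;
     4 -1.8433
     5  1.1928
;

proc print data=overlap_demo;
run;
```

In the first record the minus sign of MEASURE falls in column 8, so entry_bad cannot be converted to a number and the log reports invalid data. In the second record columns 7-8 are blank, so entry_bad happens to read correctly. That would explain why only some rows end up missing.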
You also want to be very careful reading values when you use an informat like 8.4.
Consider the result of this
data junk;
input @1 x 8.4;
datalines;
12345678
;
Compared with
data junk2;
input @1 x 8.;
datalines;
12345678
.3456
;
And either your code has something seriously wrong or your data picture does. Starting at Entry=6, the Measure values you claim for the data set do not match the pictured data. So perhaps your code is actually reading a different file?
Since it appears that the sensitive information is in the header, copy a few rows, including the header, paste them into a code box on the forum opened with the </> icon, and then X over the sensitive characters. Then we can see what your file looks like.
Then post the entire data step.
You are right! I do have an "invalid data" message in the log. Below is part of my data. Please show me how to read the data in correctly. Thanks a bunch!
; ITEM X:\aaaa\bbb\ccc\Winst
;ENTRY MEASURE ST COUNT SCORE REALSE IN.MSQ IN.ZST OUT.MS OUT.ZS DISPL PTMA WEIGHT OBSMA EXPMA PMA-E RMSR WMLE G M R NAME
1 .3771 2 1929.0 1369.0 .0530 1.01 .48 .98 -.59 .1168 .30 1.00 71.3 73.4 .21 .43 .3777 1 R . ABC001830_50
2 -.3008 2 1929.0 1556.0 .0660 1.09 2.20 .98 -.38 .2365 .35 1.00 80.6 84.0 .18 .38 -.2995 1 R . ABC002178_12
3 .5741 2 1929.0 1336.0 .0509 1.01 .50 1.01 .35 .0051 .21 1.00 69.4 69.8 .22 .45 .5746 1 R . ABC003680_32
4 -1.8433 2 1929.0 1830.0 .1290 1.23 2.12 1.20 1.61 .2503 .14 1.00 94.9 95.9 .10 .22 -1.8373 1 R . ABC004193_70
5 1.1928 2 1929.0 1051.0 .0476 1.02 1.29 1.02 1.32 .0552 .22 1.00 61.2 60.7 .24 .49 1.1929 1 R . ABC004243_50
6 -.3895 2 1929.0 1639.0 .0662 1.04 .89 1.12 2.17 .0085 .09 1.00 85.0 85.1 .18 .36 -.3881 1 R . ABC004449_33
7 .4117 2 1929.0 1398.0 .0523 1.00 -.06 1.00 -.06 .0036 .22 1.00 72.7 72.8 .22 .44 .4123 1 R . ABC004589_60
8 -.2457 2 1929.0 1561.0 .0654 1.11 2.70 1.13 2.57 .1636 .18 1.00 81.1 83.3 .18 .39 -.2445 1 R . ABC004761_21
9 -.5543 2 1929.0 1676.0 .0687 1.00 .00 1.00 .00 .0093 .18 1.00 87.0 87.0 .17 .33 -.5527 1 R . ABC004819_32
10 -1.5337 2 1929.0 1823.0 .1013 .97 -.28 .86 -1.38 .0128 .22 1.00 94.5 94.6 .12 .22 -1.5293 1 R . ABC004828_32
11 .7255 2 1929.0 1277.0 .0508 1.05 2.97 1.06 2.84 .0013 .13 1.00 63.9 67.1 .23 .47 .7259 1 R . ABC004984_32
12 .1623 2 1929.0 1484.0 .0554 1.00 .11 1.01 .18 .0053 .20 1.00 76.9 77.1 .20 .41 .1631 1 R . ABC005184_22
13 .7938 2 1929.0 1347.0 .0490 .93 -4.10 .92 -4.10 -.2443 .24 1.00 69.9 65.9 .23 .45 .7941 1 R . ABC005187_31
14 -.0751 2 1929.0 1503.0 .0617 1.09 2.64 1.09 2.00 .1851 .24 1.00 78.1 80.9 .19 .40 -.0741 1 R . ABC005208_33
15 -.5217 2 1929.0 1669.0 .0681 1.00 .12 1.01 .26 .0091 .17 1.00 86.6 86.6 .17 .34 -.5201 1 R . ABC005262_60
16 -.0852 2 1929.0 1559.0 .0599 1.03 .78 1.05 1.07 .0089 .15 1.00 80.7 81.0 .19 .39 -.0842 1 R . ABC005307_70
17 -2.4582 2 1929.0 1885.0 .1553 1.02 .16 1.05 .37 .0146 .06 1.00 97.7 97.8 .08 .15 -2.4470 1 R . ABC005414_70
18 1.3792 2 1929.0 1092.0 .0469 .98 -1.47 .98 -1.65 -.2220 .26 1.00 59.3 59.7 .24 .48 1.3792 1 R . ABC005475_23
19 -.7236 2 1929.0 1740.0 .0730 .87 -2.53 .80 -3.27 -.1616 .21 1.00 90.2 88.7 .16 .29 -.7216 1 R . ABC005571_40
20 -1.1321 2 1929.0 1776.0 .0855 .99 -.08 .93 -.81 .0115 .18 1.00 92.1 92.1 .14 .27 -1.1291 1 R . ABC005601_70
Since most of your columns are numbers, you can very easily write the data step:
data want;
infile "path to your file" truncover;
input @;
if indexc(_infile_,";") ne 1; /* skips all comment lines */
input
ENTRY MEASURE ST COUNT SCORE REALSE IN_MSQ IN_ZST OUT_MS OUT_ZS
DISPL PTMA WEIGHT OBSMA EXPMA PMA_E RMSR WMLE G M :$1. R NAME :$20.
;
run;
I only changed a few names to make them valid, and added the informats for the character variables. Apart from that, it's a simple copy/paste from the file.
@Kurt_Bremser @Tom Thanks much for the teaching!
using your code, the data I read in is not exactly the same as the original. The following is a comparison on the first entry and entry 48 between the original data and the data read in by SAS.
         ENTRY MEASURE ST COUNT SCORE REALSE IN.MSQ IN.ZST OUT.MS OUT.ZS DISPL PTMA WEIGHT OBSMA EXPMA PMA-E RMSR WMLE G M R NAME
original 1 0.3771 2 1929 1369 0.0530 1.01 0.48 0.98 -0.59 0.1168 0.30 1 71.3 73.4 0.21 0.43 0.3777 1 R . PTA001830_50
SAS      1 0.3771 2 1929 1369 0.0533 1.02 0.69 0.99 -0.39 0.1219 0.31 1 71.6 73.5 0.23 0.43 0.3777 1 R . PTA001830_50
….
original 48 -1.2679 1 320 298 0.2238 1 0.05 0.88 -0.5 0.0014 0.18 1 93.1 93.1 0.14 0.25 -1.2473 1 R . PTA008843_24_PTA06801
SAS      48 0 -3 1 0 0 1 0 1 0 0 0 1 100 100 0 0 0 1 R . PTA008843_24_PTA06801
For your convenience, below is what the original data look like from entry 1 to 50
; ITEM X:\aaa\bbb\ccc\Winst Aug 24 14:22 2020
;ENTRY MEASURE ST COUNT SCORE REALSE IN.MSQ IN.ZST OUT.MS OUT.ZS DISPL PTMA WEIGHT OBSMA EXPMA PMA-E RMSR WMLE G M R NAME
1 .3771 2 1929.0 1369.0 .0530 1.01 .48 .98 -.59 .1168 .30 1.00 71.3 73.4 .21 .43 .3777 1 R . PTA001830_50
2 -.3008 2 1929.0 1556.0 .0660 1.09 2.20 .98 -.38 .2365 .35 1.00 80.6 84.0 .18 .38 -.2995 1 R . PTA002178_12
3 .5741 2 1929.0 1336.0 .0509 1.01 .50 1.01 .35 .0051 .21 1.00 69.4 69.8 .22 .45 .5746 1 R . PTA003680_32
4 -1.8433 2 1929.0 1830.0 .1290 1.23 2.12 1.20 1.61 .2503 .14 1.00 94.9 95.9 .10 .22 -1.8373 1 R . PTA004193_70
5 1.1928 2 1929.0 1051.0 .0476 1.02 1.29 1.02 1.32 .0552 .22 1.00 61.2 60.7 .24 .49 1.1929 1 R . PTA004243_50
6 -.3895 2 1929.0 1639.0 .0662 1.04 .89 1.12 2.17 .0085 .09 1.00 85.0 85.1 .18 .36 -.3881 1 R . PTA004449_33
7 .4117 2 1929.0 1398.0 .0523 1.00 -.06 1.00 -.06 .0036 .22 1.00 72.7 72.8 .22 .44 .4123 1 R . PTA004589_60
8 -.2457 2 1929.0 1561.0 .0654 1.11 2.70 1.13 2.57 .1636 .18 1.00 81.1 83.3 .18 .39 -.2445 1 R . PTA004761_21
9 -.5543 2 1929.0 1676.0 .0687 1.00 .00 1.00 .00 .0093 .18 1.00 87.0 87.0 .17 .33 -.5527 1 R . PTA004819_32
10 -1.5337 2 1929.0 1823.0 .1013 .97 -.28 .86 -1.38 .0128 .22 1.00 94.5 94.6 .12 .22 -1.5293 1 R . PTA004828_32
11 .7255 2 1929.0 1277.0 .0508 1.05 2.97 1.06 2.84 .0013 .13 1.00 63.9 67.1 .23 .47 .7259 1 R . PTA004984_32
12 .1623 2 1929.0 1484.0 .0554 1.00 .11 1.01 .18 .0053 .20 1.00 76.9 77.1 .20 .41 .1631 1 R . PTA005184_22
13 .7938 2 1929.0 1347.0 .0490 .93 -4.10 .92 -4.10 -.2443 .24 1.00 69.9 65.9 .23 .45 .7941 1 R . PTA005187_31
14 -.0751 2 1929.0 1503.0 .0617 1.09 2.64 1.09 2.00 .1851 .24 1.00 78.1 80.9 .19 .40 -.0741 1 R . PTA005208_33
15 -.5217 2 1929.0 1669.0 .0681 1.00 .12 1.01 .26 .0091 .17 1.00 86.6 86.6 .17 .34 -.5201 1 R . PTA005262_60
16 -.0852 2 1929.0 1559.0 .0599 1.03 .78 1.05 1.07 .0089 .15 1.00 80.7 81.0 .19 .39 -.0842 1 R . PTA005307_70
17 -2.4582 2 1929.0 1885.0 .1553 1.02 .16 1.05 .37 .0146 .06 1.00 97.7 97.8 .08 .15 -2.4470 1 R . PTA005414_70
18 1.3792 2 1929.0 1092.0 .0469 .98 -1.47 .98 -1.65 -.2220 .26 1.00 59.3 59.7 .24 .48 1.3792 1 R . PTA005475_23
19 -.7236 2 1929.0 1740.0 .0730 .87 -2.53 .80 -3.27 -.1616 .21 1.00 90.2 88.7 .16 .29 -.7216 1 R . PTA005571_40
20 -1.1321 2 1929.0 1776.0 .0855 .99 -.08 .93 -.81 .0115 .18 1.00 92.1 92.1 .14 .27 -1.1291 1 R . PTA005601_70
21 -.7297 2 1929.0 1711.0 .0741 1.03 .50 1.08 1.21 .0101 .10 1.00 88.7 88.8 .16 .32 -.7277 1 R . PTA005603_60
22 -1.7209 2 1929.0 1840.0 .1099 .99 -.08 .91 -.78 .0132 .17 1.00 95.4 95.4 .11 .21 -1.7155 1 R . PTA005610_23
23 .8286 2 1929.0 1304.0 .0488 .94 -3.86 .93 -3.90 -.1693 .27 1.00 69.1 65.3 .23 .45 .8289 1 R . PTA005690_33
24 -.6034 2 1929.0 1677.0 .0711 1.03 .70 1.05 .81 .0541 .17 1.00 86.9 87.5 .16 .33 -.6017 1 R . PTA005920_31
25 -.6868 2 1929.0 1675.0 .0755 1.10 1.82 1.06 .90 .1475 .21 1.00 86.9 88.4 .16 .33 -.6849 1 R . PTA006330_11
26 -.0967 2 1929.0 1596.0 .0593 .98 -.50 1.06 1.28 -.1148 .07 1.00 82.5 81.2 .19 .38 -.0957 1 R . PTA006348_32
27 .2022 2 1929.0 1474.0 .0554 1.02 .77 1.03 1.01 -.0044 .16 1.00 76.4 76.5 .21 .42 .2029 1 R . PTA006627_13
28 -1.2412 2 1929.0 1780.0 .0930 1.08 1.09 1.16 1.78 .0919 .10 1.00 92.3 92.9 .13 .26 -1.2379 1 R . PTA006630_11
29 -1.5136 2 1929.0 1821.0 .1004 .98 -.21 .84 -1.63 .0128 .21 1.00 94.4 94.5 .12 .22 -1.5093 1 R . PTA006738_13
30 1.9252 2 1929.0 751.0 .0491 1.04 2.98 1.05 2.79 -.0087 .16 1.00 60.2 63.6 .24 .48 1.9250 1 R . PTA006924_31
31 -.2486 2 1929.0 1547.0 .0669 1.16 3.88 1.20 3.91 .2150 .15 1.00 80.4 83.3 .18 .40 -.2474 1 R . PTA007213_60
32 .1530 2 1929.0 1487.0 .0555 1.00 -.01 .98 -.53 .0054 .22 1.00 77.2 77.3 .20 .41 .1538 1 R . PTA007275_35
33 .7798 2 1929.0 1324.0 .0491 .87 -7.76 .85 -7.74 -.1709 .39 1.00 72.1 66.2 .23 .43 .7801 1 R . PTA007397_35
34 -1.3984 2 1929.0 1804.0 .0979 1.05 .62 1.06 .69 .0561 .12 1.00 93.5 93.8 .12 .24 -1.3945 1 R . PTA007464_29
35 .1500 2 1929.0 1488.0 .0567 1.04 1.44 1.06 1.70 .0053 .12 1.00 77.1 77.3 .20 .42 .1508 1 R . PTA007703_29
36 1.6427 2 1929.0 875.0 .0474 1.01 .95 1.02 1.53 -.0062 .22 1.00 59.9 60.5 .24 .49 1.6426 1 R . PTA007725_24
37 1.1037 2 1929.0 1164.0 .0476 1.01 .46 1.00 .06 -.1101 .20 1.00 60.3 61.6 .24 .48 1.1039 1 R . PTA007854_29
38 1.0442 2 1929.0 1067.0 .0492 1.06 4.81 1.07 4.39 .1684 .17 1.00 57.6 62.2 .24 .49 1.0444 1 R . PTA007975_24
39 -.5309 2 1929.0 1671.0 .0682 .96 -.74 .87 -2.30 .0091 .28 1.00 86.7 86.7 .17 .33 -.5293 1 R . PTA008026_23
40 -2.1927 2 1929.0 1879.0 .1361 .87 -1.03 .68 -2.43 -.1197 .17 1.00 97.4 97.1 .09 .16 -2.1841 1 R . PTA008392_11
41 .7410 2 1929.0 1274.0 .0494 .97 -1.95 .95 -2.26 -.0069 .29 1.00 67.5 66.8 .23 .45 .7414 1 R . PTA008409_32
42 1.7573 2 1929.0 756.0 .0485 1.05 3.77 1.06 3.81 .1468 .10 1.00 59.0 61.5 .24 .49 1.7571 1 R . PTA008437_35
43 .2792 2 1929.0 1423.0 .0552 1.05 1.88 1.05 1.55 .0673 .18 1.00 73.8 75.1 .21 .43 .2799 1 R . PTA008511_13
44 1.3313 2 1929.0 1127.0 .0472 1.01 .79 1.01 .86 -.2530 .20 1.00 59.9 59.9 .24 .49 1.3313 1 R . PTA008589_25
45 .0062 2 1929.0 1539.0 .0576 1.00 -.06 1.00 .10 -.0145 .18 1.00 79.4 79.6 .20 .39 .0071 1 R . PTA008618_29
46 1.0992 2 1929.0 1048.0 .0484 1.04 3.00 1.04 2.93 .1555 .21 1.00 58.7 61.6 .24 .49 1.0994 1 R . PTA008673_80
47 .5351 2 1929.0 1352.0 .0522 1.04 2.06 1.06 2.20 .0027 .14 1.00 69.7 70.5 .22 .46 .5356 1 R . PTA008822_33
48 -1.2679 1 320.0 298.0 .2238 1.00 .05 .88 -.50 .0014 .18 1.00 93.1 93.1 .14 .25 -1.2473 1 R . PTA008843_24_PTA06801
49 1.8551 1 328.0 133.0 .1155 .99 -.41 .98 -.40 .0020 .25 1.00 64.3 62.2 .23 .47 1.8539 1 R . PTA008851_24_PTA06802
50 1.4776 2 1929.0 879.0 .0469 .93 -6.70 .92 -6.57 .1495 .37 1.00 65.5 59.7 .24 .47 1.4776 1 R . PTA008879_26
Could you please teach me how to modify your code so that I can correctly read in the data?
Please post the log lines from the step you used to read the file.
@Tom Please see log below
969 missing R;
NOTE: The infile 'aaa\bbb\ccc\Drift Analysis\PTA drift analysis 2020 forms_1.ITM' is:
Filename=aaa\bbb\ccc\Drift Analysis\PTA drift analysis 2020 forms_1.ITM,
RECFM=V,LRECL=32767,File Size (bytes)=67294,
Last Modified=24Aug2020:14:02:42,
Create Time=24Aug2020:14:02:42
NOTE: 450 records were read from the infile 'aaa\bbb\ccc\Drift Analysis\PTA drift analysis 2020 forms_1.ITM'.
The minimum record length was 141.
The maximum record length was 150.
NOTE: The data set WORK.WANT has 450 observations and 22 variables.
NOTE: DATA statement used (Total process time):
real time 32.18 seconds
cpu time 1.26 seconds
970 data want;
971 infile 'X:\aaa\bbb\ccc\Drift Analysis\Winsteps
971! Output\PPP068\PTA drift analysis 2020 forms_1.ITM' firstobs=3 truncover;
972 INPUT
973 ENTRY 1-6 MEASURE 7-14 ST 15-17 COUNT 18-25 SCORE 26-34 REALSE 35-41 IN_MSQ 42-48 IN_ZST
973! 49-55 UT_MS 56-62 OUT_ZS 63-69
974 ISPL 70-76 PTMA 77-83 WEIGHT 84-90 OBSMA 91-96 EXPMA 97-102 PMA_E 103-108 RMSR 109-114 WMLE
974! 115-122 G 123-124 M 125-126 R 127-128 NAME $ 129-150;
975 ;
I don't see any issues there. Not sure why the lines seem out of order; the NOTEs usually appear after the data step itself in the log.
Try adding some logic to display the records around that line in the log and see if there is anything different about them.
data want;
infile 'X:\aaa\bbb\ccc\Drift Analysis\WinstepsOutput\PPP068\PTA drift analysis 2020 forms_1.ITM' firstobs=3 truncover;
INPUT
ENTRY 1-6 MEASURE 7-14 ST 15-17 COUNT 18-25 SCORE 26-34 REALSE 35-41 IN_MSQ 42-48
IN_ZST 49-55 UT_MS 56-62 OUT_ZS 63-69
ISPL 70-76 PTMA 77-83 WEIGHT 84-90 OBSMA 91-96 EXPMA 97-102 PMA_E 103-108 RMSR 109-114
WMLE 115-122 G 123-124 M 125-126 R 127-128 NAME $ 129-150
;
if entry in (47 48 49) then list;
run;
Most likely the values in that line are shifted somehow. If there are special characters the LIST should show the hex codes for the characters. But it doesn't look like there are any special characters or else SAS would have printed notes about not being able to convert the text in the columns you listed into numbers.
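As a complement to LIST, here is a minimal sketch of a scan that flags hidden characters record by record. It builds a small scratch file with a tab planted in the second record; the tab is purely an assumption for illustration, since nothing in this thread confirms what the real file contains. The fileref demo and the variables pos and ch are made-up names.

```sas
/* Sketch only: scratch file with a tab ('09'x) hidden in the second record */
filename demo temp;

data _null_;
  file demo;
  put "    47   .5351  2";
  put "    48" '09'x "-1.2679  1";
run;

/* Report the first non-printable character in each record, if there is one */
data _null_;
  infile demo truncover;
  input;                          /* load the record into _INFILE_ */
  pos = notprint(_infile_);       /* 0 when every character is printable */
  if pos > 0 then do;
    ch = char(_infile_, pos);     /* the offending character */
    put "Record " _n_ "column " pos "hex " ch $hex2.;
  end;
run;
```

To check the real file, point the INFILE statement of the second step at it instead of demo. If no record is reported, hidden characters are probably not the cause and the column positions deserve a closer look.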
@Tom @Kurt_Bremser @jimbarbour @Astounding @ballardw
Thanks so much to all of you!!
Very much appreciate your time and help!!
@superbug wrote:
970 data want;
971 infile 'X:\aaa\bbb\ccc\Drift Analysis\Winsteps
971! Output\PPP068\PTA drift analysis 2020 forms_1.ITM' firstobs=3 truncover;
972 INPUT
973 ENTRY 1-6 MEASURE 7-14 ST 15-17 COUNT 18-25 SCORE 26-34 REALSE 35-41 IN_MSQ 42-48 IN_ZST
973! 49-55 UT_MS 56-62 OUT_ZS 63-69
974 ISPL 70-76 PTMA 77-83 WEIGHT 84-90 OBSMA 91-96 EXPMA 97-102 PMA_E 103-108 RMSR 109-114 WMLE
974! 115-122 G 123-124 M 125-126 R 127-128 NAME $ 129-150;
975 ;
That is not my code.
I believe that the program generated by Enterprise Guide reads line 48 correctly. See my earlier reply for source code.
Jim
Check the file for strange characters that are eliminated when you copy and paste into the forum.
190  missing R ;
191  data want;
192    infile cards /* 'X:\aaaa\bbb\ccc\Winst' */ firstobs=3 truncover;
193    input
194      ENTRY MEASURE ST COUNT SCORE REALSE IN_MSQ IN_ZST OUT_MS OUT_ZS
195      DISPL PTMA WEIGHT OBSMA EXPMA PMA_E RMSR WMLE G M R
196      NAME :$20.
197    ;
198    if entry=48 then put (_all_) (=/);
199  cards4;

ENTRY=48
MEASURE=-1.2679
ST=1
COUNT=320
SCORE=298
REALSE=0.2238
IN_MSQ=1
IN_ZST=0.05
OUT_MS=0.88
OUT_ZS=-0.5
DISPL=0.0014
PTMA=0.18
WEIGHT=1
OBSMA=93.1
EXPMA=93.1
PMA_E=0.14
RMSR=0.25
WMLE=-1.2473
G=1
M=R
R=.
NAME=PTA008843_24_PTA0680
I copy/pasted your data into a DATALINES4 block and ran my code. The only change I made was to lengthen the informat for name (21):
data want;
input @;
if indexc(_infile_,";") ne 1; /* skips all comment lines */
input
ENTRY MEASURE ST COUNT SCORE REALSE IN_MSQ IN_ZST OUT_MS OUT_ZS
DISPL PTMA WEIGHT OBSMA EXPMA PMA_E RMSR WMLE G M :$1. R NAME :$21.
;
datalines4;
; ITEM X:\aaa\bbb\ccc\Winst Aug 24 14:22 2020
;ENTRY MEASURE ST COUNT SCORE REALSE IN.MSQ IN.ZST OUT.MS OUT.ZS DISPL PTMA WEIGHT OBSMA EXPMA PMA-E RMSR WMLE G M R NAME
1 .3771 2 1929.0 1369.0 .0530 1.01 .48 .98 -.59 .1168 .30 1.00 71.3 73.4 .21 .43 .3777 1 R . PTA001830_50
2 -.3008 2 1929.0 1556.0 .0660 1.09 2.20 .98 -.38 .2365 .35 1.00 80.6 84.0 .18 .38 -.2995 1 R . PTA002178_12
3 .5741 2 1929.0 1336.0 .0509 1.01 .50 1.01 .35 .0051 .21 1.00 69.4 69.8 .22 .45 .5746 1 R . PTA003680_32
4 -1.8433 2 1929.0 1830.0 .1290 1.23 2.12 1.20 1.61 .2503 .14 1.00 94.9 95.9 .10 .22 -1.8373 1 R . PTA004193_70
5 1.1928 2 1929.0 1051.0 .0476 1.02 1.29 1.02 1.32 .0552 .22 1.00 61.2 60.7 .24 .49 1.1929 1 R . PTA004243_50
6 -.3895 2 1929.0 1639.0 .0662 1.04 .89 1.12 2.17 .0085 .09 1.00 85.0 85.1 .18 .36 -.3881 1 R . PTA004449_33
7 .4117 2 1929.0 1398.0 .0523 1.00 -.06 1.00 -.06 .0036 .22 1.00 72.7 72.8 .22 .44 .4123 1 R . PTA004589_60
8 -.2457 2 1929.0 1561.0 .0654 1.11 2.70 1.13 2.57 .1636 .18 1.00 81.1 83.3 .18 .39 -.2445 1 R . PTA004761_21
9 -.5543 2 1929.0 1676.0 .0687 1.00 .00 1.00 .00 .0093 .18 1.00 87.0 87.0 .17 .33 -.5527 1 R . PTA004819_32
10 -1.5337 2 1929.0 1823.0 .1013 .97 -.28 .86 -1.38 .0128 .22 1.00 94.5 94.6 .12 .22 -1.5293 1 R . PTA004828_32
11 .7255 2 1929.0 1277.0 .0508 1.05 2.97 1.06 2.84 .0013 .13 1.00 63.9 67.1 .23 .47 .7259 1 R . PTA004984_32
12 .1623 2 1929.0 1484.0 .0554 1.00 .11 1.01 .18 .0053 .20 1.00 76.9 77.1 .20 .41 .1631 1 R . PTA005184_22
13 .7938 2 1929.0 1347.0 .0490 .93 -4.10 .92 -4.10 -.2443 .24 1.00 69.9 65.9 .23 .45 .7941 1 R . PTA005187_31
14 -.0751 2 1929.0 1503.0 .0617 1.09 2.64 1.09 2.00 .1851 .24 1.00 78.1 80.9 .19 .40 -.0741 1 R . PTA005208_33
15 -.5217 2 1929.0 1669.0 .0681 1.00 .12 1.01 .26 .0091 .17 1.00 86.6 86.6 .17 .34 -.5201 1 R . PTA005262_60
16 -.0852 2 1929.0 1559.0 .0599 1.03 .78 1.05 1.07 .0089 .15 1.00 80.7 81.0 .19 .39 -.0842 1 R . PTA005307_70
17 -2.4582 2 1929.0 1885.0 .1553 1.02 .16 1.05 .37 .0146 .06 1.00 97.7 97.8 .08 .15 -2.4470 1 R . PTA005414_70
18 1.3792 2 1929.0 1092.0 .0469 .98 -1.47 .98 -1.65 -.2220 .26 1.00 59.3 59.7 .24 .48 1.3792 1 R . PTA005475_23
19 -.7236 2 1929.0 1740.0 .0730 .87 -2.53 .80 -3.27 -.1616 .21 1.00 90.2 88.7 .16 .29 -.7216 1 R . PTA005571_40
20 -1.1321 2 1929.0 1776.0 .0855 .99 -.08 .93 -.81 .0115 .18 1.00 92.1 92.1 .14 .27 -1.1291 1 R . PTA005601_70
21 -.7297 2 1929.0 1711.0 .0741 1.03 .50 1.08 1.21 .0101 .10 1.00 88.7 88.8 .16 .32 -.7277 1 R . PTA005603_60
22 -1.7209 2 1929.0 1840.0 .1099 .99 -.08 .91 -.78 .0132 .17 1.00 95.4 95.4 .11 .21 -1.7155 1 R . PTA005610_23
23 .8286 2 1929.0 1304.0 .0488 .94 -3.86 .93 -3.90 -.1693 .27 1.00 69.1 65.3 .23 .45 .8289 1 R . PTA005690_33
24 -.6034 2 1929.0 1677.0 .0711 1.03 .70 1.05 .81 .0541 .17 1.00 86.9 87.5 .16 .33 -.6017 1 R . PTA005920_31
25 -.6868 2 1929.0 1675.0 .0755 1.10 1.82 1.06 .90 .1475 .21 1.00 86.9 88.4 .16 .33 -.6849 1 R . PTA006330_11
26 -.0967 2 1929.0 1596.0 .0593 .98 -.50 1.06 1.28 -.1148 .07 1.00 82.5 81.2 .19 .38 -.0957 1 R . PTA006348_32
27 .2022 2 1929.0 1474.0 .0554 1.02 .77 1.03 1.01 -.0044 .16 1.00 76.4 76.5 .21 .42 .2029 1 R . PTA006627_13
28 -1.2412 2 1929.0 1780.0 .0930 1.08 1.09 1.16 1.78 .0919 .10 1.00 92.3 92.9 .13 .26 -1.2379 1 R . PTA006630_11
29 -1.5136 2 1929.0 1821.0 .1004 .98 -.21 .84 -1.63 .0128 .21 1.00 94.4 94.5 .12 .22 -1.5093 1 R . PTA006738_13
30 1.9252 2 1929.0 751.0 .0491 1.04 2.98 1.05 2.79 -.0087 .16 1.00 60.2 63.6 .24 .48 1.9250 1 R . PTA006924_31
31 -.2486 2 1929.0 1547.0 .0669 1.16 3.88 1.20 3.91 .2150 .15 1.00 80.4 83.3 .18 .40 -.2474 1 R . PTA007213_60
32 .1530 2 1929.0 1487.0 .0555 1.00 -.01 .98 -.53 .0054 .22 1.00 77.2 77.3 .20 .41 .1538 1 R . PTA007275_35
33 .7798 2 1929.0 1324.0 .0491 .87 -7.76 .85 -7.74 -.1709 .39 1.00 72.1 66.2 .23 .43 .7801 1 R . PTA007397_35
34 -1.3984 2 1929.0 1804.0 .0979 1.05 .62 1.06 .69 .0561 .12 1.00 93.5 93.8 .12 .24 -1.3945 1 R . PTA007464_29
35 .1500 2 1929.0 1488.0 .0567 1.04 1.44 1.06 1.70 .0053 .12 1.00 77.1 77.3 .20 .42 .1508 1 R . PTA007703_29
36 1.6427 2 1929.0 875.0 .0474 1.01 .95 1.02 1.53 -.0062 .22 1.00 59.9 60.5 .24 .49 1.6426 1 R . PTA007725_24
37 1.1037 2 1929.0 1164.0 .0476 1.01 .46 1.00 .06 -.1101 .20 1.00 60.3 61.6 .24 .48 1.1039 1 R . PTA007854_29
38 1.0442 2 1929.0 1067.0 .0492 1.06 4.81 1.07 4.39 .1684 .17 1.00 57.6 62.2 .24 .49 1.0444 1 R . PTA007975_24
39 -.5309 2 1929.0 1671.0 .0682 .96 -.74 .87 -2.30 .0091 .28 1.00 86.7 86.7 .17 .33 -.5293 1 R . PTA008026_23
40 -2.1927 2 1929.0 1879.0 .1361 .87 -1.03 .68 -2.43 -.1197 .17 1.00 97.4 97.1 .09 .16 -2.1841 1 R . PTA008392_11
41 .7410 2 1929.0 1274.0 .0494 .97 -1.95 .95 -2.26 -.0069 .29 1.00 67.5 66.8 .23 .45 .7414 1 R . PTA008409_32
42 1.7573 2 1929.0 756.0 .0485 1.05 3.77 1.06 3.81 .1468 .10 1.00 59.0 61.5 .24 .49 1.7571 1 R . PTA008437_35
43 .2792 2 1929.0 1423.0 .0552 1.05 1.88 1.05 1.55 .0673 .18 1.00 73.8 75.1 .21 .43 .2799 1 R . PTA008511_13
44 1.3313 2 1929.0 1127.0 .0472 1.01 .79 1.01 .86 -.2530 .20 1.00 59.9 59.9 .24 .49 1.3313 1 R . PTA008589_25
45 .0062 2 1929.0 1539.0 .0576 1.00 -.06 1.00 .10 -.0145 .18 1.00 79.4 79.6 .20 .39 .0071 1 R . PTA008618_29
46 1.0992 2 1929.0 1048.0 .0484 1.04 3.00 1.04 2.93 .1555 .21 1.00 58.7 61.6 .24 .49 1.0994 1 R . PTA008673_80
47 .5351 2 1929.0 1352.0 .0522 1.04 2.06 1.06 2.20 .0027 .14 1.00 69.7 70.5 .22 .46 .5356 1 R . PTA008822_33
48 -1.2679 1 320.0 298.0 .2238 1.00 .05 .88 -.50 .0014 .18 1.00 93.1 93.1 .14 .25 -1.2473 1 R . PTA008843_24_PTA06801
49 1.8551 1 328.0 133.0 .1155 .99 -.41 .98 -.40 .0020 .25 1.00 64.3 62.2 .23 .47 1.8539 1 R . PTA008851_24_PTA06802
50 1.4776 2 1929.0 879.0 .0469 .93 -6.70 .92 -6.57 .1495 .37 1.00 65.5 59.7 .24 .47 1.4776 1 R . PTA008879_26
;;;;
data check;
set want;
if entry in (1,48);
run;
proc print data=check noobs;
run;
Result:
1 | 0.3771 | 2 | 1929 | 1369 | 0.0530 | 1.01 | 0.48 | 0.98 | -0.59 | 0.1168 | 0.30 | 1 | 71.3 | 73.4 | 0.21 | 0.43 | 0.3777 | 1 | R | . | PTA001830_50 |
48 | -1.2679 | 1 | 320 | 298 | 0.2238 | 1.00 | 0.05 | 0.88 | -0.50 | 0.0014 | 0.18 | 1 | 93.1 | 93.1 | 0.14 | 0.25 | -1.2473 | 1 | R | . | PTA008843_24_PTA06801 |
In your example every value has something, even if it is just a period. So simple list mode input will work.
Note you do NOT want to add a decimal part to an INFORMAT (like the 8.4 in your original code) unless you know that the decimal has been deliberately eliminated from the text to save a character. Otherwise values without decimal points will be divided by the power of ten to add in the implied decimal point before the last D characters.
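A minimal sketch of that effect, reading the same eight columns twice, once with an implied-decimal informat and once without. The data set name decimal_demo and the variables with_d and plain are made up for illustration; the two values imitate a COUNT written with and without a decimal point.

```sas
/* Sketch only: same text read with 8.4 and with 8. */
data decimal_demo;
  infile datalines truncover;
  input @1 with_d 8.4   /* 4 implied decimals when the text has no point */
        @1 plain  8.;   /* no implied decimals                           */
  datalines;
  1929.0
    1929
;

proc print data=decimal_demo;
run;
```

For the first record both variables come out as 1929 because the text carries its own decimal point. For the second record plain is still 1929, but with_d becomes 0.1929 because the informat divides by 10**4.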
So, assuming that the last variable is the only character variable (and that those "R" values in variable M are some type of missing indicator), add the MISSING statement to let SAS know to read the letter R as the special missing value .R.
So your program could be as simple as this:
missing R ;
data want;
infile 'X:\aaaa\bbb\ccc\Winst' firstobs=3 truncover;
input
ENTRY MEASURE ST COUNT SCORE REALSE IN_MSQ IN_ZST OUT_MS OUT_ZS
DISPL PTMA WEIGHT OBSMA EXPMA PMA_E RMSR WMLE G M R
NAME :$20.
;
run;
Results:
122  data _null_;
123    set want;
124    put (_n_ _all_) (=);
125  run;

_N_=1 ENTRY=1 MEASURE=0.3771 ST=2 COUNT=1929 SCORE=1369 REALSE=0.053 IN_MSQ=1.01 IN_ZST=0.48 OUT_MS=0.98 OUT_ZS=-0.59 DISPL=0.1168 PTMA=0.3 WEIGHT=1 OBSMA=71.3 EXPMA=73.4 PMA_E=0.21 RMSR=0.43 WMLE=0.3777 G=1 M=R R=. NAME=ABC001830_50
_N_=2 ENTRY=2 MEASURE=-0.3008 ST=2 COUNT=1929 SCORE=1556 REALSE=0.066 IN_MSQ=1.09 IN_ZST=2.2 OUT_MS=0.98 OUT_ZS=-0.38 DISPL=0.2365 PTMA=0.35 WEIGHT=1 OBSMA=80.6 EXPMA=84 PMA_E=0.18 RMSR=0.38 WMLE=-0.2995 G=1 M=R R=. NAME=ABC002178_12
_N_=3 ENTRY=3 MEASURE=0.5741 ST=2 COUNT=1929 SCORE=1336 REALSE=0.0509 IN_MSQ=1.01 IN_ZST=0.5 OUT_MS=1.01 OUT_ZS=0.35 DISPL=0.0051 PTMA=0.21 WEIGHT=1 OBSMA=69.4 EXPMA=69.8 PMA_E=0.22 RMSR=0.45 WMLE=0.5746 G=1 M=R R=. NAME=ABC003680_32
If some values are represented only by spaces, or if some values are placed right next to each other without any spaces between them, then use column mode. You will need to check the start/end columns for each variable. That is easy to do using the regular old PROGRAM EDITOR in SAS Display Manager, or by using the LIST statement in a data step.
91   data _null_;
92     infile dat firstobs=2 obs=3;
93     input;
94     list;
95   run;

NOTE: The infile DAT is:
      Filename=...
      RECFM=V,LRECL=32767,File Size (bytes)=595,
      Last Modified=29Aug2020:17:05:27,
      Create Time=29Aug2020:17:05:27

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 ;ENTRY MEASURE ST COUNT SCORE REALSE IN.MSQ IN.ZST OUT.MS OUT.ZS DISPL PTMA WEIGHT OBSMA EXP
101 MA PMA-E RMSR WMLE G M R NAME 133
3 1 .3771 2 1929.0 1369.0 .0530 1.01 .48 .98 -.59 .1168 .30 1.00 71.3 73
101 .4 .21 .43 .3777 1 R . ABC001830_50 141
NOTE: 2 records were read from the infile DAT.
      The minimum record length was 133.
      The maximum record length was 141.
So your input statement might look like:
input
ENTRY 1-6 MEASURE 7-14 ST 15-17 COUNT 18-25
...
NAME $ 130-141
;
I just pulled the data into Enterprise Guide and clicked on "Import Data". EG then generated the following program:
/* --------------------------------------------------------------------
Code generated by a SAS task
Generated on Friday, August 28, 2020 at 11:03:36 PM
By task: Import Data Wizard
Source file:
C:\Users\jbarbou3\Documents\SAS\Pgm\Training\Community\Superbug_Dat
a.txt
Server: Local File System
Output data: WORK.Superbug_Data
Server: Local
-------------------------------------------------------------------- */
DATA WORK.Superbug_Data;
LENGTH
ENTRY 8
MEASURE 8
ST 8
COUNT 8
SCORE 8
REALSE 8
IN_MSQ 8
IN_ZST 8
OUT_MS 8
OUT_ZS 8
DISPL 8
PTMA 8
WEIGHT 8
OBSMA 8
EXPMA 8
PMA_E 8
RMSR 8
WMLE 8
G 8
M $ 2
R 8
NAME $ 13 ;
LABEL
IN_MSQ = "IN.MSQ"
IN_ZST = "IN.ZST"
OUT_MS = "OUT.MS"
OUT_ZS = "OUT.ZS"
PMA_E = "PMA-E" ;
FORMAT
ENTRY BEST5.
MEASURE BEST8.
ST BEST3.
COUNT BEST8.
SCORE BEST9.
REALSE BEST7.
IN_MSQ BEST7.
IN_ZST BEST7.
OUT_MS BEST7.
OUT_ZS BEST7.
DISPL BEST7.
PTMA BEST7.
WEIGHT BEST7.
OBSMA BEST6.
EXPMA BEST6.
PMA_E BEST6.
RMSR BEST6.
WMLE BEST8.
G BEST2.
M $CHAR2.
R BEST2.
NAME $CHAR13. ;
INFORMAT
ENTRY BEST5.
MEASURE BEST8.
ST BEST3.
COUNT BEST8.
SCORE BEST9.
REALSE BEST7.
IN_MSQ BEST7.
IN_ZST BEST7.
OUT_MS BEST7.
OUT_ZS BEST7.
DISPL BEST7.
PTMA BEST7.
WEIGHT BEST7.
OBSMA BEST6.
EXPMA BEST6.
PMA_E BEST6.
RMSR BEST6.
WMLE BEST8.
G BEST2.
M $CHAR2.
R BEST2.
NAME $CHAR13. ;
INFILE 'C:\Users\jbarbou3\Documents\SAS\Pgm\Training\Community\Superbug_Data.txt'
LRECL=512
FIRSTOBS=3
ENCODING="WLATIN1"
TRUNCOVER ;
INPUT
@2 ENTRY ?? BEST5.
@7 MEASURE ?? COMMA8.
@15 ST ?? BEST3.
@18 COUNT ?? COMMA8.
@26 SCORE ?? COMMA9.
@35 REALSE ?? COMMA7.
@42 IN_MSQ ?? COMMA7.
@49 IN_ZST ?? COMMA7.
@56 OUT_MS ?? COMMA7.
@63 OUT_ZS ?? COMMA7.
@70 DISPL ?? COMMA7.
@77 PTMA ?? COMMA7.
@84 WEIGHT ?? COMMA7.
@91 OBSMA ?? COMMA6.
@97 EXPMA ?? COMMA6.
@103 PMA_E ?? COMMA6.
@109 RMSR ?? COMMA6.
@115 WMLE ?? COMMA8.
@123 G ?? BEST2.
@125 M $CHAR2.
@127 R ?? BEST2.
@129 NAME $CHAR13. ;
RUN;
Everything looks fine when I look at the data in the "Output Data" tab. Why don't you try running that?
Jim