BookmarkSubscribeRSS Feed
Ashwini_uci
Obsidian | Level 7

Thanks Tom, the second way (@98 '   ') worked.

After running the code, the LOG says

The minimum record length was 100.

The maximum record length was 100. But When I manually counted twice, they were total 101 positions.

So in -@98 '   '-, I removed one space, and ran the code again,

, so this time the LOG says,

The minimum record length was 99.

The maximum record length was 99. And WHen i counted manually (3-4 times), they were 100.

Which one should I go by? Is there any way I can get to show the numbe of positions?

Tom
Super User Tom
Super User

If it says the MIN and MAX length were both 100 then it worked.

If you are trying to count by moving the cursor you are probably moving the cursor past the end of the data on the line.

jakarman
Barite | Level 11

I use the courier monisoace font to show you the exec positioning of your data. I added I comment line showing the position. The list statement can be use analzing the input.

As it is longer as 80 columns the nocardimage option is needed. The leading zero-s are removed. They can be adjusted with a format.
Run this as exactly copy (no space removals) and you will see your desired results

options nocardimage;
data test;
input Last_Name $ 1-20 @21 First_Name $ 21-35 @36 Middle_Initial $ 36 @37 Social_Security_Number  37-45  @46Month_of_Birth  46-47 @48 Day_of_Birth 48-49 @50 Year_of_Birth 50-53 @54 Father_Surname $ 54-71  @72 Age_Unit_at_death_ 72 @73 Number_of_Age_Units_death 73-74 @75 Sex $ 75 @76 Race $ 76 @77 Marital_Status $ 77 @78 State_Residence $ 78-79 @80 State_Birth $ 80-81 @82 Control_ID_number 82-91  @92 Age_in_1991 92-97;
  put Last_Name $ 1-20 @21 First_Name $ 21-35 @36 Middle_Initial $ 36 @37 Social_Security_Number  37-45  @46Month_of_Birth  46-47 @48 Day_of_Birth 48-49 @50 Year_of_Birth 50-53 @54 Father_Surname $ 54-71  @72 Age_Unit_at_death_ 72 @73 Number_of_Age_Units_death 73-74 @75 Sex $ 75 @76 Race $ 76 @77 Marital_Status $ 77 @78 State_Residence $ 78-79 @80 State_Birth $ 80-81 @82 Control_ID_number 82-91  @92 Age_in_1991 92-97;
/* .+....1....+....2....+....3....+....4....+....5....+....6....+....7....+....8....+....9....+....0....+....1....+....2....+....3....+....4....+....5....+....6....+ */
datalines;
asdad               dasdad          12133131306181928                     M  CA  10001     62
adad                adad            12121212102231934                     M  CA  10002     56
adad                adad            02121212102231934                     M  CA  10002     56
bnvn                vnbvbn          02111121210271917                     F  CA  10003     72
rty                 lfghdh          32659854207311919                     M  CA  10008     71
sada                bnmb            32651252206261933                     M  CA  10010     57
sdfsf               ftgh            01020102007121920                     M  CA  10011     70
;
run;

proc print;
run;

---->-- ja karman --<-----
Ashwini_uci
Obsidian | Level 7

Thanks Jaap! I'll try this way as well. So the original data is in Excel.

So do you suggest that I just copy and paste it from excel sheet into the code above? I have 1500 records and I wonder how I could manually space out the data in the code as you have shown above.

jakarman
Barite | Level 11

When you import it from Excel (proc import) your variable alignment does not matter.

When you to test the data according NDI-s lay-out description it does. That is the layout (fixed columns) as shown.   

---->-- ja karman --<-----
Ashwini_uci
Obsidian | Level 7

Jaap, I tried this way too, and the SAS output file that get, is still the same as the 2nd PDF file attached to the original post. The data is not under the respective columns..

Ashwini_uci
Obsidian | Level 7

and the log shows the following notes..

NOTE: Invalid data for Social_Security_Number in line 23 37-45.

NOTE: Invalid data for Year_of_Birth in line 23 50-53.

NOTE: Invalid data for Control_ID_number in line 25 1-10.

NOTE: Invalid data for Age_in_1991 in line 26 1-6.

Brown Oliver 453406806 06 18 1928        . 1 0   .2                 . .     La         .

     .

RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+

26  CHAR   Estrada.Maria..587375041.10.27.1917....F...CA..10003.72                        

    ZONE   47776660467660033333333303303303333000040004400333330332222222222222222222222222

    NUMR   53421419D12919958737504191092791917999969993199100039720000000000000000000000000

NOTE: Invalid data errors for file CARDS occurred outside the printed range.

2                                        The SAS System           15:31 Tuesday, August 19, 2014

NOTE: Increase available buffer lines with the INFILE n= option.

Last_Name=Brown Oliver 453406 First_Name=806 06 18 1928 Middle_Initial=

Social_Security_Number=. Month_of_Birth=1 Day_of_Birth=0 Year_of_Birth=. Father_Surname=2

Age_Unit_at_death_=. Number_of_Age_Units_death=. Sex=  Race=  Marital_Status=  State_Residence=

State_Birth=La Control_ID_number=. Age_in_1991=. _ERROR_=1 _N_=1

NOTE: Invalid data for Social_Security_Number in line 27 37-45.

NOTE: Invalid data for Year_of_Birth in line 27 50-53.

NOTE: Invalid data for Control_ID_number in line 29 1-10.

NOTE: Invalid data for Age_in_1991 in line 30 1-6.

Smith Rupert 499129756 07 31 1919        . 1 0   .1                 . .     Wi         .

     .

RULE:      ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+

30  CHAR   Jones.Emily..616387121.10.07.1922....F...CA..10012.68                          

    ZONE   46667046667003333333330330330333300004000440033333033222222222222222222222222222

    NUMR   AFE5395D9C9996163871219109079192299996999319910012968000000000000000000000000000

NOTE: Invalid data errors for file CARDS occurred outside the printed range.

NOTE: Increase available buffer lines with the INFILE n= option.

Last_Name=Smith Rupert 499129 First_Name=756 07 31 1919 Middle_Initial=

Social_Security_Number=. Month_of_Birth=1 Day_of_Birth=0 Year_of_Birth=. Father_Surname=1

Age_Unit_at_death_=. Number_of_Age_Units_death=. Sex=  Race=  Marital_Status=  State_Residence=

State_Birth=Wi Control_ID_number=. Age_in_1991=. _ERROR_=1 _N_=2

NOTE: Invalid data for Social_Security_Number in line 31 37-45.

NOTE: Invalid data for Day_of_Birth in line 31 48-49.

NOTE: Invalid data for Year_of_Birth in line 31 50-53.

NOTE: Invalid data for Control_ID_number in line 33 1-10.

NOTE: Invalid data for Age_in_1991 in line 34 1-6.

Richard Stachowski 102262143 09 22        . . .   .0016 56           . .     Ki         .

     .

34  CHAR   Rivera.Aurora..564341130.11.13.1928....F...CA..10022.62                        

    ZONE   56767604776760033333333303303303333000040004400333330332222222222222222222222222

    NUMR   2965219152F219956434113091191391928999969993199100229620000000000000000000000000

NOTE: Invalid data errors for file CARDS occurred outside the printed range.

NOTE: Increase available buffer lines with the INFILE n= option.

Last_Name=Richard Stachowski First_Name=102262143 09 22 Middle_Initial=

Social_Security_Number=. Month_of_Birth=. Day_of_Birth=. Year_of_Birth=. Father_Surname=0016 56

Age_Unit_at_death_=. Number_of_Age_Units_death=. Sex=  Race=  Marital_Status=  State_Residence=

State_Birth=Ki Control_ID_number=. Age_in_1991=. _ERROR_=1 _N_=3

NOTE: Invalid data for Social_Security_Number in line 35 37-45.

NOTE: Invalid data for Year_of_Birth in line 35 50-53.

NOTE: Invalid data for Control_ID_number in line 37 1-10.

NOTE: Invalid data for Age_in_1991 in line 38 1-6.

Ogden Martha 243600388 08 09 1934        . 1 0   .1                 . .     Ra         .

     .

38

jakarman
Barite | Level 11

Ashwini you have tabs '09'x in your file. What you see is not what is in the file. Adjust the tabs to spaces so the layout in columns is exactly as I have shown in the example.

    CHAR   Rivera.Aurora..564341130.11.13.1928....F...CA..10022.62                       

    ZONE   56767604776760033333333303303303333000040004400333330332222222222222222222222222

    NUMR   2965219152F219956434113091191391928999969993199100229620000000000000000000000000

With the tabs between ddmmyyyy it looks to be a tab-delimited file. That is another way of coding the data instead of fixed  columns
Read that one as should be.

---->-- ja karman --<-----
Ashwini_uci
Obsidian | Level 7

So I used the following code and the text file was created. Now the last issue (hopefully) I am facing here is how to make sure that there are 100 positions per record.

Where can I mention that in the code about the number of positions? I counted manually the spaces in the records; they are 99. Any suggestions?

filename NDItext 'C:\Users\SBHW\SBHWtext_1.txt';

data _null_;

file nditext;

set sbhw.SBHW_SAS1;

put @1 Last_Name

      @21 First_Name

      @36 Middle_Initial

      @37 SSN

      @46 Month

      @48 Day

      @50 Year

      @54 Father_Surname

      @72 Age_at_death

      @73 Number_of_Age_Units

      @75 Sex

      @76 Race

      @77 Marital_Status

      @78 State_of_Residence

      @80 State_of_Birth

      @82 Control_ID

      @92 Age_in_1991

   @98 Blank_field;

run;

Tom
Super User Tom
Super User

What system are you using to "count" the number of characters?

I suspect that you are writing the file using a Windows machine and so the at the end of the line SAS is writing the Windows standard Carriage Return and Line Feed characters.  If you then look at that file on a Unix machine where the normal end of line is marked only be a Line Feed character the Carriage Return appears to be an extra character.

Try adding TERMSTR=LF to the FILE statement.

Ashwini_uci
Obsidian | Level 7

I am using a windows machine and I am just counting each charater/space by moving the cursor using arrow key on the keyboard.

But let me try the option you've suggested.

I also think I sould go by what the SAS LOG says after it generates the TEXT file. If it says record length is 100, then 100 it is, I guess.

Ashwini_uci
Obsidian | Level 7

I used the TERMSTR=LF option, but it placed the records next to each other instead of in a table format. So it doesn't really help.

ballardw
Super User

How long is Blank_field. If it is starting at 98 and you want 100 for line length it should contain 3 spaces to occupy columns 98,99, and 100.

Ashwini_uci
Obsidian | Level 7

Yes, it is 3 spaces.

When I used 3 spaces in the code. (@98 '   ') ,

after running the code, the SAS LOG says

The minimum record length was 100.

The maximum record length was 100. But When I manually counted the spaces in the record twice, they were total 101 positions.

So I removed one space from in @98 '   ', and ran the code again, so this time the LOG says,

The minimum record length was 99.

The maximum record length was 99. And WHen i counted manually (3-4 times), they were 100.

SO I am not sure why there is a diff of one space in what the Log says and my manually counting the spaces. I guess I should go by what the SAS log says ?

And so i wondered, Is there any way I can get to show the number of positions

jakarman
Barite | Level 11

You cannot see them with the normal tools. A tool like notepad++ has an option to "view - show symbols" and encoding. With that spaces are becoming like dots tabs arrows etc.

---->-- ja karman --<-----

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 31 replies
  • 3163 views
  • 2 likes
  • 4 in conversation