BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
laurenekerr
Calcite | Level 5

I am trying to import uncommon data that has two records per observation. SAS is reading the list vertically and not observing the delimiter. How can I fix this to import the data correctly?

 

I have attached the data I am trying to import. 

 

Here is my code:

 

 

data HypImpt.MS_Citizens;
    infile '/home/u63991182/BIOS 6680/Hypertension Study/Data/1_Source/MS Citizens.txt' N=2 dlmstr= "|" ;
    input
        SocSecNum  :$11. 
        FirstInit  :$2. 
        MiddleInit  :$2.
        LastInit :$2.
        CityState  :$35. 
        ZipCd     :$5.
        Gender  :$6.
        Eth.  :$22.          
        Racial    :$16.  
        DOB :DATE9. ; 
    
run;

proc print data=HypImpt.MS_Citizens; 

run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Not a problem with the delimiter as not matching your code to the file well enough.

The N=2 making two lines available still needs to be told on the INPUT which LINE to read.

Plus you have an invalid variable name ETH. (with a period)

Your INPUT statement should look like:

    input
        SocSecNum  :$11. 
        FirstInit  :$2. 
        MiddleInit  :$2.
        LastInit :$2.
        CityState  :$35. 
        ZipCd     :$5.
       / Gender  :$6.
        Eth  :$22.          
        Racial    :$16.  
        DOB :DATE9. 
      ; 

The / says "read on next line" . Or instead of / you could use #2, which means "read on line 2 of the available lines".

 

If it is a single character I would prefer to use DLM instead of DLMSTR but not an error just a style choice.

 

 

View solution in original post

2 REPLIES 2
ballardw
Super User

Not a problem with the delimiter as not matching your code to the file well enough.

The N=2 making two lines available still needs to be told on the INPUT which LINE to read.

Plus you have an invalid variable name ETH. (with a period)

Your INPUT statement should look like:

    input
        SocSecNum  :$11. 
        FirstInit  :$2. 
        MiddleInit  :$2.
        LastInit :$2.
        CityState  :$35. 
        ZipCd     :$5.
       / Gender  :$6.
        Eth  :$22.          
        Racial    :$16.  
        DOB :DATE9. 
      ; 

The / says "read on next line" . Or instead of / you could use #2, which means "read on line 2 of the available lines".

 

If it is a single character I would prefer to use DLM instead of DLMSTR but not an error just a style choice.

 

 

Tom
Super User Tom
Super User

You probably just need to remove the N= option and the INPUT will jump to the next line when there is no more data available on the current line.  But that could be a problem if the last item on a line could be empty.

 

So if you are positive that the split between the lines always appears in the same place then just add a / to the INPUT to tell it when to move to the next line.  You could then add the TRUNCOVER option to prevent it from jumping if the last value on the line is empty.

 

I usually find it clearer to define the variables first with a LENGTH statement.  Then the INPUT statement can by much simpler.  And if you define the variables in the order they appear in the file you can use positional variable lists to make the INPUT statement shorter.

data want;
  infile cards dsd dlm='|' truncover ;
  length
    SocSecNum $11 FirstInit MiddleInit LastInit $2
    CityState $35 ZipCd $5
    Gender $6 Eth $22 Racial $16 DOB 8
  ;
  input SocSecNum -- ZipCd / Gender -- DOB ;
  informat DOB date.;
  format DOB date9.;
cards;     
111-22-3333|S.|Y.|B.|Blue Mountain, Mississippi|38610                           
Male|Not Hispanic or Latino|African American|22JAN1974                          
444-55-6666|J.|G.|T.|Gulfport, Mississippi|39505                                
Male|Not Hispanic or Latino|Caucasian|22FEB1944                                 
777-88-9999|R.|Y.|H.|Gulfport, Mississippi|39505                                
Female|Not Hispanic or Latino|Caucasian|03FEB1985                               
999-88-7777|D.|W.|J.|Hollandale, Mississippi|38748                              
Female|Not Hispanic or Latino|Caucasian|22JAN1938 
;

Result

Tom_0-1727287832075.png

 

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 974 views
  • 0 likes
  • 3 in conversation