Help using Base SAS procedures

CSV Import Issue

Reply
Contributor
Posts: 29

CSV Import Issue

Hello.

I'm having an issue with importing a CSV file (I believe queried and given to me from a big database) into SAS when every string is in quotes as listed below

( "File","O,H,H","Yes")

However, if I open the CSV in excel and save over it, it results in the follow:

(File,"O,H,H",Yes)

and this imports just fine.

Is there a way to import without having excel involved?

Code:

Data test;

infile 'C:\Users\Daniel\Documents\Line.csv'

TERMSTR=CRLF

firstobs=4

dlm=','

dsd

missover

LRECL=6100

;

input

Case: $8.

Narrative: $4030.

Indication: $575.

Country: $4.

Report_Source: $12.

Image: $31.

Attach: $15.

;

run;

Super Contributor
Posts: 578

Re: CSV Import Issue

what is the issue in when importing directly from csv?

Contributor
Posts: 29

Re: CSV Import Issue

A dataset with 20,000 observation is only 19 observations long and fields with incorrect information in them.

Super Contributor
Posts: 578

Re: CSV Import Issue

My first thought is a different termstr value that is getting replaced when opened/saved in Excel.

Contributor
Posts: 29

Re: CSV Import Issue

Is there any other value other than: CRLF, CR, and LF?

Tried all three and none worked.  Perhaps the CSV file given to me is in a format SAS cannot read?

Super Contributor
Posts: 578

Re: CSV Import Issue

It's certainly possible to construct a file that uses something else.  At times, I've even had to open the file in a byte editor to look at what's actually there.

Contributor
Posts: 29

Re: CSV Import Issue

I see.

The only question I have left is, which is really just asking the question again:

Can SAS import a CSV where all the strings are in quotation marks.

ex.

"103943","O,H,J","File","Patient blah, blah, blah"

Super Contributor
Posts: 578

Re: CSV Import Issue

That should not be an issue.  You can prove that by creating a sample file and trying it. 

Super User
Posts: 17,784

Re: CSV Import Issue

Sorry, remove the dlm=',', the dad option should cover that.

Super User
Posts: 17,784

Re: CSV Import Issue

Also, you usually need truncover instead of missover.

Contributor
Posts: 29

Re: CSV Import Issue

Thank you for your input.

I tried all these options with no success.

I'm pretty convinced it's the file itself.  Perhaps the CSV file is coded differently?

I tried creating a sample csv, and it works fine.  However, this file doesn't seem to import.

Super User
Posts: 10,483

Re: CSV Import Issue

A dataset with 20,000 observation is only 19 observations long

This may mean that the data needs to have a fixed record length specified when reading.

PROC Star
Posts: 7,360

Re: CSV Import Issue

Can you post a "small" hexdump of the first few records from the file? Mark Terjeson posted the following macro on SAS-L some years back which makes it quite easy to see what a file "really" looks like. You may have to include a max parameter when calling the macro if your file's first 2 or 3 records (after the initial 4 records that your skipping) don't appear:

%macro hexdump(pathfile=,max=20000);

  data _null_;

    length chars $16;

    retain address 0 chars '................';

    infile "&pathfile" lrecl=1 recfm=f end=done;

    counter+1;

    if counter le &max. then do;

      input c $char1.;

      if mod(address,16) eq 0 then do;

        if _N_ eq 1 then put address hex8. 'h: ' @;

        else put ' ; ' chars / address hex8. 'h: ' @;

        chars = '................';

      end;

      put c hex2. ' ' @;

      if rank(c) ge 32 then substr(chars,mod(address,16)+1,1) = c;

      address + 1;

    end;

    if done or counter gt &max. then do;

         do i =mod(address,16)+1 to 16;

          put ' ' @;

        end;

        put ' ; ' chars / ;

      end;

  run;

%mend;

%hexdump(pathfile=C:\Users\Daniel\Documents\Line.csv)

Contributor
Posts: 29

Re: CSV Import Issue

Not sure if I did this correctly, but here it is:

00000000h: 0A 22 54 6F 74 61 6C 20 4E 75 6D 62 65 72 20 6F  ;

00000010h: 66 20 43 61 73 65 73 3A 22 2C 2C 22 32 32 2C 38  ;

00000020h: 39 31 22 0A 0A 22 43 61 73 65 20 23 22 2C 22 56  ;

00000030h: 72 73 6E 22 2C 22 46 44 41 20 49 6E 69 74 69 61  ;

00000040h: 6C 20 52 65 63 64 20 44 61 74 65 22 2C 22 46 44  ;

00000050h: 41 20 52 65 63 64 20 44 61 74 65 22 2C 22 41 6C  ;

00000060h: 6C 20 53 75 73 70 65 63 74 73 22 2C 22 50 72 69  ;

00000070h: 6D 61 72 79 20 53 75 73 70 65 63 74 20 28 50 53  ;

00000080h: 29 22 2C 22 50 53 20 44 6F 73 65 2D 52 6F 75 74  ;

00000090h: 65 2D 46 72 65 71 75 65 6E 63 79 22 2C 22 4F 75  ;

PROC Star
Posts: 7,360

Re: CSV Import Issue

You did it correctly, but didn't output enough characters for one to really know what the file structure might actually look like.

My first guess would be to try:

Data test;

infile 'C:\Users\Daniel\Documents\Line.csv'

firstobs=4

dlm='2C0A'x

dsd

missover

LRECL=6100

;

input

Case: $8.

Narrative: $4030.

Indication: $575.

Country: $4.

Report_Source: $12.

Image: $31.

Attach: $15.

;

run;

Ask a Question
Discussion stats
  • 14 replies
  • 343 views
  • 0 likes
  • 5 in conversation