DATA Step, Macro, Functions and more

CSV Import(, at the beginning of the values?)

Accepted Solution Solved
Reply
Contributor
Posts: 29
Accepted Solution

CSV Import(, at the beginning of the values?)

Hi I'm getting weird output after running the below program. Basically I've excel CSV file and importing the data using infile with defined variables. But I'm getting the comma in the data values as mentioned below.

 

data A_dsn;

infile filename delimiter = ' ' truncover DSD lrecl=32767 firstobs=2;

input Claim_Reference $ 15. Sent_From $ 50. Sent_To $ 50. Subject $ 50. Document_Name $ 50.;

run;

 

referencefromtosubjectDocument_name
454/154963,example@mail.com,example1@mail.com,sample subject,sample document name

 

Is there any chance I can improve the program ?


Accepted Solutions
Solution
‎01-04-2018 12:43 AM
Super User
Posts: 9,563

Re: CSV Import(, at the beginning of the values?)

If you use the right delimiter, bang! - it works:

data want;
infile '$HOME/sascommunity/sample.csv' dlm=',' truncover firstobs=2;
input
  claim_ref :$15.
  sent_from :$50.
  sent_to :$50.
  subject :$30.
  doc_name :$24.
;
run;

proc print data=want noobs;
run;

Result:

  claim_ref                      sent_from                              sent_to

AA/1/  /123456    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    claimsdocumentation@aaaa.co.uk
AA/1/BB/234567    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    Mashh.aaaaa@aaaaaaaaa.com     

           subject                        doc_name

AAAAAAA Claim Reference AA/1/     I12345-01.12.17-05:43:01
AAAAAAA Claim Reference AA/1/B    I23456-01.12.17-05:45:09

You might consider post-processing to get the timestamps out of doc_name, for instance.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
SAS Employee
Posts: 199

Re: CSV Import(, at the beginning of the values?)

Hi @Reddi

 

Can you share a portion of your csv file?

 

cheers,
Damo

Contributor
Posts: 29

Re: CSV Import(, at the beginning of the values?)

Hi @Damo,

 

You can use the file attached.

 

Thanks,

redid.

Super User
Super User
Posts: 9,203

Re: CSV Import(, at the beginning of the values?)

Why are you reading a Comma Speparated Variable file, and yet setting the delimiter to a space?  The delimiter in a CSV is a comma - its in the name!  The simplest method to import a CSV is via proc import, however that removes some of the control.  Better to read in via a datastep as you are doing, and specify what the input/output formats should be, lengths, labels etc.  This gives you full control over the import and makes it repeatable (as proc import is a guessing procedure).

Contributor
Posts: 29

Re: CSV Import(, at the beginning of the values?)

Hi @RW9,

That was just a trial to give space in the delimiter but no use.
Super User
Super User
Posts: 9,203

Re: CSV Import(, at the beginning of the values?)

Then what is the question?  I have answered the question as presented in your first post which stated:

"Hi I'm getting weird output after running the below program. Basically I've excel CSV file and importing the data using infile with defined variables. But I'm getting the comma in the data values as mentioned below."

 

So what would you like us to answer?

Contributor
Posts: 29

Re: CSV Import(, at the beginning of the values?)

Hi @RW9,

 

I agree but  I'm getting comma(,) along with actual values and I want to get rid of this and expecting only values.

 

Regards,

Reddi.

Super Contributor
Posts: 498

Re: CSV Import(, at the beginning of the values?)

Post the code you are actually using. In the code you have posted using the wrong delimiter is most likely causing the problem.

Super User
Super User
Posts: 9,203

Re: CSV Import(, at the beginning of the values?)

infile filename delimiter = ' ' truncover DSD lrecl=32767 firstobs=2;

                                   ^                             ^

 

I highlight the issue above, a delimeter tells SAS where to start and stop reading the data, and drops the delimiter.  If you specify space as delimiter, then SAS will start and stop reading each time it encounters a space, and will drop the space.  Therefore if you read the file in with delimiter=' ', commas will not be automatically dropped as being the delimiter, only space.  You can correct this action by presenting the correct delimiter which is a comma:

infile filename delimiter = ',' truncover DSD lrecl=32767 firstobs=2;
Solution
‎01-04-2018 12:43 AM
Super User
Posts: 9,563

Re: CSV Import(, at the beginning of the values?)

If you use the right delimiter, bang! - it works:

data want;
infile '$HOME/sascommunity/sample.csv' dlm=',' truncover firstobs=2;
input
  claim_ref :$15.
  sent_from :$50.
  sent_to :$50.
  subject :$30.
  doc_name :$24.
;
run;

proc print data=want noobs;
run;

Result:

  claim_ref                      sent_from                              sent_to

AA/1/  /123456    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    claimsdocumentation@aaaa.co.uk
AA/1/BB/234567    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    Mashh.aaaaa@aaaaaaaaa.com     

           subject                        doc_name

AAAAAAA Claim Reference AA/1/     I12345-01.12.17-05:43:01
AAAAAAA Claim Reference AA/1/B    I23456-01.12.17-05:45:09

You might consider post-processing to get the timestamps out of doc_name, for instance.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 29

Re: CSV Import(, at the beginning of the values?)

Posted in reply to KurtBremser

Right on Sir,

 

Thanks for the support everyone.

 

Regards,

Reddi.

 

Highlighted
Super User
Super User
Posts: 7,847

Re: CSV Import(, at the beginning of the values?)

[ Edited ]

Don't use formatted input with a delimited file. Use list mode input instead. You can either define your variables first (always a good habit to get into) and then just list them in the INPUT statement. 

data a_dsn;
  infile filename truncover DSD lrecl=32767 firstobs=2;
  length Claim_Reference $15 Sent_From $50 Sent_To $50 Subject $50 Document_Name $50;
  input Claim_Reference -- Document_Name ;
run;

Or add the : (colon) modifier before any informats that you have in-line in the INPUT statement. That way you are still using list mode input even though you have included informats.  And the presence of the informats at the first reference to the variable names will let SAS guess that you wanted to define the variables with a length that matches the width of the informat specification instead of using  the default length of $8 for character variables.

data a_dsn;
  infile filename truncover DSD lrecl=32767 firstobs=2;
  input Claim_Reference :$15. Sent_From :$50. Sent_To :$50. Subject :$50. Document_Name :$50;
run;

Also note that the $ is part of the informat name.  If you code it the way you have you are telling SAS to read a character variable, because of the stand alone $, and then also telling it to use a numeric format.  SAS will kindly recognize that it cannot read a character variable using the 10. informat and so will convert it to use the $10. informat instead.  But other places where you can specify an informat will not be so accepting of the extra space.

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 184 views
  • 2 likes
  • 6 in conversation