BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Reddi
Fluorite | Level 6

Hi I'm getting weird output after running the below program. Basically I've excel CSV file and importing the data using infile with defined variables. But I'm getting the comma in the data values as mentioned below.

 

data A_dsn;

infile filename delimiter = ' ' truncover DSD lrecl=32767 firstobs=2;

input Claim_Reference $ 15. Sent_From $ 50. Sent_To $ 50. Subject $ 50. Document_Name $ 50.;

run;

 

referencefromtosubjectDocument_name
454/154963,example@mail.com,example1@mail.com,sample subject,sample document name

 

Is there any chance I can improve the program ?

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

If you use the right delimiter, bang! - it works:

data want;
infile '$HOME/sascommunity/sample.csv' dlm=',' truncover firstobs=2;
input
  claim_ref :$15.
  sent_from :$50.
  sent_to :$50.
  subject :$30.
  doc_name :$24.
;
run;

proc print data=want noobs;
run;

Result:

  claim_ref                      sent_from                              sent_to

AA/1/  /123456    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    claimsdocumentation@aaaa.co.uk
AA/1/BB/234567    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    Mashh.aaaaa@aaaaaaaaa.com     

           subject                        doc_name

AAAAAAA Claim Reference AA/1/     I12345-01.12.17-05:43:01
AAAAAAA Claim Reference AA/1/B    I23456-01.12.17-05:45:09

You might consider post-processing to get the timestamps out of doc_name, for instance.

View solution in original post

11 REPLIES 11
Damo
SAS Employee

Hi @Reddi

 

Can you share a portion of your csv file?

 

cheers,
Damo

Reddi
Fluorite | Level 6

Hi @Damo,

 

You can use the file attached.

 

Thanks,

redid.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Why are you reading a Comma Speparated Variable file, and yet setting the delimiter to a space?  The delimiter in a CSV is a comma - its in the name!  The simplest method to import a CSV is via proc import, however that removes some of the control.  Better to read in via a datastep as you are doing, and specify what the input/output formats should be, lengths, labels etc.  This gives you full control over the import and makes it repeatable (as proc import is a guessing procedure).

Reddi
Fluorite | Level 6
Hi @RW9,

That was just a trial to give space in the delimiter but no use.
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Then what is the question?  I have answered the question as presented in your first post which stated:

"Hi I'm getting weird output after running the below program. Basically I've excel CSV file and importing the data using infile with defined variables. But I'm getting the comma in the data values as mentioned below."

 

So what would you like us to answer?

Reddi
Fluorite | Level 6

Hi @RW9,

 

I agree but  I'm getting comma(,) along with actual values and I want to get rid of this and expecting only values.

 

Regards,

Reddi.

andreas_lds
Jade | Level 19

Post the code you are actually using. In the code you have posted using the wrong delimiter is most likely causing the problem.

RW9
Diamond | Level 26 RW9
Diamond | Level 26
infile filename delimiter = ' ' truncover DSD lrecl=32767 firstobs=2;

                                   ^                             ^

 

I highlight the issue above, a delimeter tells SAS where to start and stop reading the data, and drops the delimiter.  If you specify space as delimiter, then SAS will start and stop reading each time it encounters a space, and will drop the space.  Therefore if you read the file in with delimiter=' ', commas will not be automatically dropped as being the delimiter, only space.  You can correct this action by presenting the correct delimiter which is a comma:

infile filename delimiter = ',' truncover DSD lrecl=32767 firstobs=2;
Kurt_Bremser
Super User

If you use the right delimiter, bang! - it works:

data want;
infile '$HOME/sascommunity/sample.csv' dlm=',' truncover firstobs=2;
input
  claim_ref :$15.
  sent_from :$50.
  sent_to :$50.
  subject :$30.
  doc_name :$24.
;
run;

proc print data=want noobs;
run;

Result:

  claim_ref                      sent_from                              sent_to

AA/1/  /123456    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    claimsdocumentation@aaaa.co.uk
AA/1/BB/234567    AAAAAAAmotorserviceclaims@AAAAAAA.co.uk    Mashh.aaaaa@aaaaaaaaa.com     

           subject                        doc_name

AAAAAAA Claim Reference AA/1/     I12345-01.12.17-05:43:01
AAAAAAA Claim Reference AA/1/B    I23456-01.12.17-05:45:09

You might consider post-processing to get the timestamps out of doc_name, for instance.

Reddi
Fluorite | Level 6

Right on Sir,

 

Thanks for the support everyone.

 

Regards,

Reddi.

 

Tom
Super User Tom
Super User

Don't use formatted input with a delimited file. Use list mode input instead. You can either define your variables first (always a good habit to get into) and then just list them in the INPUT statement. 

data a_dsn;
  infile filename truncover DSD lrecl=32767 firstobs=2;
  length Claim_Reference $15 Sent_From $50 Sent_To $50 Subject $50 Document_Name $50;
  input Claim_Reference -- Document_Name ;
run;

Or add the : (colon) modifier before any informats that you have in-line in the INPUT statement. That way you are still using list mode input even though you have included informats.  And the presence of the informats at the first reference to the variable names will let SAS guess that you wanted to define the variables with a length that matches the width of the informat specification instead of using  the default length of $8 for character variables.

data a_dsn;
  infile filename truncover DSD lrecl=32767 firstobs=2;
  input Claim_Reference :$15. Sent_From :$50. Sent_To :$50. Subject :$50. Document_Name :$50;
run;

Also note that the $ is part of the informat name.  If you code it the way you have you are telling SAS to read a character variable, because of the stand alone $, and then also telling it to use a numeric format.  SAS will kindly recognize that it cannot read a character variable using the 10. informat and so will convert it to use the $10. informat instead.  But other places where you can specify an informat will not be so accepting of the extra space.

 

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 2597 views
  • 2 likes
  • 6 in conversation