BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
DrAbhijeetSafai
Lapis Lazuli | Level 10

I have a comma separated value (CSV) file from which I am reading data in SAS. In one of the columns, the data is page numbers separated by comma. It appears something like - 23, 35, 67, 229 etc. 

 

Now the problem is that the file is comma separated and the text which I am trying to read is also comma separated. So when SAS encounters a comma, it takes the next value. (For this column it is creating a blank record). 

 

I would like to know how can I solve this problem. 

 

Thanks in advance!

 

- Dr. Abhijeet Safai 

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real
1 ACCEPTED SOLUTION

Accepted Solutions
DrAbhijeetSafai
Lapis Lazuli | Level 10

The problem is solved!

 

In short the mistake was - not using guessingrows = max option in proc import.

 

Now in detail:

As I had shared earlier my suspicion was that 'import not successful' and this issue are linked. They were! I removed the column of pages (which had numeric values separated by comma) and that solved the issue first and that gave me idea to use guessingrows = max in proc import.

 

The data was such that there were blanks at the beginning for this column named pages and then there were values. The values in earlier rows were numeric and hence SAS considered it as a numeric variable and when comma was encountered, it went ahead thinking that the value has been captured. That created issues in proc import also. 

 

Once guessingrows = max was used, SAS found out that it is a character variable and then things went ahead well - import was successful and column was read properly. 

 

Thanks for mentioning @Kurt_Bremser that "In a standard-conforming csv file, data containing the delimiter must be enclosed in quotes, which is then handled with the DSD option in SAS." That was a great insight which helped me to investigate further.

 

In the end, the code which worked was as follows:

 

filename vls "path..../sdtm_specs_variable_level_sheet.csv";

 

proc import datafile = vls out = vls1 dbms = csv replace;

     getnames = yes;

     guessingrows = max;

run;

 

As this is the answer I found working, I will mark it as the solution. 

 

Thank you.

 

- Dr. Abhijeet Safai

 

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

In a standard-conforming csv file, data containing the delimiter must be enclosed in quotes, which is then handled with the DSD option in SAS.

If you have a file which does not follow the rules, you may be able to solve it by counting the delimiters from right and left, leaving the faulty column in the middle. Please show some examples of what you have.

DrAbhijeetSafai
Lapis Lazuli | Level 10

Thanks @Kurt_Bremser  for the response. 

 

I think dsd option will solve the issue. However, I am using proc import to read the file. I am not sure how to use dsd option when using proc import. Because dsd is used for infile statement.

 

My program looks like this:

filename vls "path..../sdtm_specs_variable_level_sheet.csv";

 

proc import datafile = vls out = vls1 dbms = csv replace;

     getnames = yes;

run;

 

This sdtm_specs_variable_level_sheet.csv file is a csv file which I have created from excel file. 

 

Kindly let me know if any other information is needed.

 

Thanks once again for your response.

 

- Dr. Abhijeet Safai

 

 

 

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real
Kurt_Bremser
Super User

When you create a csv file from Excel without any special options, Excel will use semicolons as delimiter.

 

Open the csv file with an editor (not with Excel), and copy/paste a few lines into a window opened with the </> button.

 

Tom
Super User Tom
Super User

@Kurt_Bremser wrote:

When you create a csv file from Excel without any special options, Excel will use semicolons as delimiter.

 

Open the csv file with an editor (not with Excel), and copy/paste a few lines into a window opened with the </> button.

 


In the US EXCEL will use comma as the default delimiter.  EXCEL will only use semicolon as the default delimiter in locales where that is the standard.  Usually in countries that use comma as the decimal point when representing numbers.

DrAbhijeetSafai
Lapis Lazuli | Level 10

Hi @Kurt_Bremser , in the meanwhile I would like to share the the csv file is not imported successfully and there is an error message indicating that in the log. 

 

Maybe this issue is related to this unsuccessful import as well. However, the dataset is created and my work is done so I was ignoring that message but it seems that maybe I will need to fix that first and once that is fixed, maybe this problem will also be solved. 

 

I tried using termstr=crlf but that is not solving the issue. 

 

Thank you.

 

- Dr. Abhijeet Safai

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real
DrAbhijeetSafai
Lapis Lazuli | Level 10

The problem is solved!

 

In short the mistake was - not using guessingrows = max option in proc import.

 

Now in detail:

As I had shared earlier my suspicion was that 'import not successful' and this issue are linked. They were! I removed the column of pages (which had numeric values separated by comma) and that solved the issue first and that gave me idea to use guessingrows = max in proc import.

 

The data was such that there were blanks at the beginning for this column named pages and then there were values. The values in earlier rows were numeric and hence SAS considered it as a numeric variable and when comma was encountered, it went ahead thinking that the value has been captured. That created issues in proc import also. 

 

Once guessingrows = max was used, SAS found out that it is a character variable and then things went ahead well - import was successful and column was read properly. 

 

Thanks for mentioning @Kurt_Bremser that "In a standard-conforming csv file, data containing the delimiter must be enclosed in quotes, which is then handled with the DSD option in SAS." That was a great insight which helped me to investigate further.

 

In the end, the code which worked was as follows:

 

filename vls "path..../sdtm_specs_variable_level_sheet.csv";

 

proc import datafile = vls out = vls1 dbms = csv replace;

     getnames = yes;

     guessingrows = max;

run;

 

As this is the answer I found working, I will mark it as the solution. 

 

Thank you.

 

- Dr. Abhijeet Safai

 

Dr. Abhijeet Safai
Certified Base and Clinical SAS Programmer
Associate Data Analyst
Actu-Real

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 783 views
  • 4 likes
  • 3 in conversation