BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
SeaMoon_168
Obsidian | Level 7

I used function put to convert a variable from num to char and then save the data as csv. However, when I import the saved csv data I found this variable became back to numeric again. I wonder how to keep the variable as char in the csv data?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

@SeaMoon_168 wrote:

I used function put to convert a variable from num to char and then save the data as csv. However, when I import the saved csv data I found this variable became back to numeric again. I wonder how to keep the variable as char in the csv data?


Note that a CSV file does not have anything but text in it. It is a text file after all. The only metadata it has is the optional first row can contain column headers which can be used to guess names to use for the variables.

 

But PROC IMPORT does have a nasty habit you can could possibly take advantage of.  In a CSV file you need to add quotes around values that have the delimiter (comma usually) to make it so the line can be parsed properly. So quotes around values are ignored, not treated as part of the value.  But PROC IMPORT keeps track of if every value in a column had quotes around it.  In that case it will assume it should read the column into a character variable, even if the column only contains values that are valid character representations of numbers.

 

So write the CSV with those unneeded quotes around every value.  The PUT statement modifier ~ will add quotes.

 

Example:

Let's make a CSV file where every value of a numeric variable has those extra quotes.

filename csv temp;
data _null_;
  set sashelp.class (obs=10) ;
  file csv dsd ;
  if _n_=1 then put 'name,age,height';
  put name age ~ height;
run;

Result

name,age,height
Alfred,"14",69
Alice,"13",56.5
Barbara,"13",65.3
Carol,"14",62.8
Henry,"14",63.5
James,"12",57.3
Jane,"12",59.8
Janet,"15",62.5
Jeffrey,"13",62.5
John,"12",59

Now let's ask PROC IMPORT to guess how to read that file into a SAS dataset.

proc import file=csv dbms=csv out=want replace;
run;

Result:

            Variables in Creation Order

#    Variable    Type    Len    Format     Informat

1    name        Char      7    $7.        $7.
2    age         Char      4    $4.        $4.
3    height      Num       8    BEST12.    BEST32.

To see why I called it a nasty habit try exporting a file with numeric variable whose values when displayed include commas and the using PROC IMPORT to read it back.

Example:

data have;
  set sashelp.cars (obs=10);
  keep make model msrp invoice;
run;
filename csv temp; proc contents data=have varnum;run; proc export dbms=csv data=have file=csv replace; run; proc import dbms=csv file=csv out=want replace; run; proc contents data=want varnum; run; proc compare data=have compare=want; run;

result:

Listing of Common Variables with Conflicting Types

Variable  Dataset    Type  Length  Format    Informat

MSRP      WORK.HAVE  Num        8  DOLLAR8.
          WORK.WANT  Char       9  $9.       $9.
Invoice   WORK.HAVE  Num        8  DOLLAR8.
          WORK.WANT  Char       9  $9.       $9.

View solution in original post

6 REPLIES 6
Patrick
Opal | Level 21

csv files are just textual and can't store information like data types.

When using Proc Import SAS will guess the data type based on the text if finds in the .csv ....and this is why you will find many discussions here where people state that you need to use a SAS data step with Infile/Input to have full control over how textual data gets mapped into SAS variables. 

Only "lazy" way to write such a data step: Use Proc Import (or the EG/Studio import wizard) and then copy/paste the generated data step code into a code window and amend it to what you really need.

 

I never tried it but it might be possible that if you create a .csv with your numbers in double quotes then SAS will import them a strings. ...but again: Writing your own data step to read the data will lead to much more stable and predictable outcomes.

Kurt_Bremser
Super User

Define "import". Into which application, and by which method?

 

A CSV file is just text, with no information about column metadata beyond the column names in row one. So this metadata needs to be defined in the target application when a CSV file is read.

 

If you read CSV files into SAS, you define the variable types and other attributes in the DATA step that reads the file, through a LENGTH, FORMAT and INFORMAT statements (the latter can also be done through the use of informats in the INPUT statement).

ballardw
Super User

Code.

The code you used both ways might be helpful.

 

Proc IMPORT, if that is what you used to "import" data Guesses as the variable type. If all the values in a column hold nothing but digits than the guess will tend to be numeric.

If your CSV file has a column with a mix of all digits and some digit/letter combinations then the appearance of the values in the first few rows is used to set the variable type. The Proc Import option GUESSINGROWS tells the procedure how many rows to examine before setting the type (and length) of variables. It is a good idea to use GUESSINGROWS=MAX so more of the file is examined.

 

Also since we are discussing CSV, do not allow spreadsheet software to open the file if you expect to read it using SAS later. Spreadsheets are notorious for 1) applying their own rules as to what type a CELL (not variable, not column) should be and 2)if you save a CSV it may well change values.

Tom
Super User Tom
Super User

@SeaMoon_168 wrote:

I used function put to convert a variable from num to char and then save the data as csv. However, when I import the saved csv data I found this variable became back to numeric again. I wonder how to keep the variable as char in the csv data?


Note that a CSV file does not have anything but text in it. It is a text file after all. The only metadata it has is the optional first row can contain column headers which can be used to guess names to use for the variables.

 

But PROC IMPORT does have a nasty habit you can could possibly take advantage of.  In a CSV file you need to add quotes around values that have the delimiter (comma usually) to make it so the line can be parsed properly. So quotes around values are ignored, not treated as part of the value.  But PROC IMPORT keeps track of if every value in a column had quotes around it.  In that case it will assume it should read the column into a character variable, even if the column only contains values that are valid character representations of numbers.

 

So write the CSV with those unneeded quotes around every value.  The PUT statement modifier ~ will add quotes.

 

Example:

Let's make a CSV file where every value of a numeric variable has those extra quotes.

filename csv temp;
data _null_;
  set sashelp.class (obs=10) ;
  file csv dsd ;
  if _n_=1 then put 'name,age,height';
  put name age ~ height;
run;

Result

name,age,height
Alfred,"14",69
Alice,"13",56.5
Barbara,"13",65.3
Carol,"14",62.8
Henry,"14",63.5
James,"12",57.3
Jane,"12",59.8
Janet,"15",62.5
Jeffrey,"13",62.5
John,"12",59

Now let's ask PROC IMPORT to guess how to read that file into a SAS dataset.

proc import file=csv dbms=csv out=want replace;
run;

Result:

            Variables in Creation Order

#    Variable    Type    Len    Format     Informat

1    name        Char      7    $7.        $7.
2    age         Char      4    $4.        $4.
3    height      Num       8    BEST12.    BEST32.

To see why I called it a nasty habit try exporting a file with numeric variable whose values when displayed include commas and the using PROC IMPORT to read it back.

Example:

data have;
  set sashelp.cars (obs=10);
  keep make model msrp invoice;
run;
filename csv temp; proc contents data=have varnum;run; proc export dbms=csv data=have file=csv replace; run; proc import dbms=csv file=csv out=want replace; run; proc contents data=want varnum; run; proc compare data=have compare=want; run;

result:

Listing of Common Variables with Conflicting Types

Variable  Dataset    Type  Length  Format    Informat

MSRP      WORK.HAVE  Num        8  DOLLAR8.
          WORK.WANT  Char       9  $9.       $9.
Invoice   WORK.HAVE  Num        8  DOLLAR8.
          WORK.WANT  Char       9  $9.       $9.

SeaMoon_168
Obsidian | Level 7

Thank you for your help. I did use R to finish my work. But I appreciate your advice on how to work on it using SAS. 

Tom
Super User Tom
Super User

How would R help?

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 317 views
  • 1 like
  • 5 in conversation