BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
BoboTheFool
Calcite | Level 5

I have a csv file which has 6 variables - Day, HIgh, Low, Precip, Snow, and SnowDepth.  In the CSV file the date displays as 1-Jan-2018, and if I make Day a character variable it reads in the same way, but I cannot do any manipulations with it, as it isn't in a SAS defined format.  Is there a way in SAS that I can change the format, or do I have to go to the CSV file and change all of my dates (although none of the options are SAS agreeable either)?

 

1-Jan-2018 -11.9 -24 0 0 2.99
2-Jan-2018 -0.9 -23.1 0 0 2.99

 

libname Adam '/home/adamrowland0';
data Adam.Final;
   infile "/home/adamrowland0/sasuser.v94/Stat 510 Final Data Set.csv" dsd;
   input Day $
   	 High 
   	 Low
   	 Precip  $
   	 Snow $
   	 SnowDepth $;

  
run;
 
Thank you for any help you can provide.
 
Adam
1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

Change your input statement:

input
  Day :date11.
  High
  Low
  Precip
  Snow
  SnowDepth
;

so that the dates are read correctly, and the numbers as numbers. Assign a date format to the day column.

Add a proper delimiter in the infile statement, if it is really a csv file (comma separated).

View solution in original post

6 REPLIES 6
Kurt_Bremser
Super User

Change your input statement:

input
  Day :date11.
  High
  Low
  Precip
  Snow
  SnowDepth
;

so that the dates are read correctly, and the numbers as numbers. Assign a date format to the day column.

Add a proper delimiter in the infile statement, if it is really a csv file (comma separated).

Reeza
Super User

@BoboTheFool wrote:

 Is there a way in SAS that I can change the format, or do I have to go to the CSV file and change all of my dates (although none of the options are SAS agreeable either)?

Change how you read it in so it's read correctly as a SAS date instead. Try either the anydtdte or date9 INFORMAT. An Informat specifies what format the variable is in to read it in correctly. A format specifies what the display or output value will look like. 

 

I'm surprised you're reading in your other variables, snow depth and precipitation as characters as well. You may want to modify those as well. 

 

libname Adam '/home/adamrowland0';


data Adam.InputData;
   
infile "/home/adamrowland0/sasuser.v94/Stat 510 Final Data Set.csv" dsd;
informat date anydtdte.;
format date date9.; *informat date date9.; input Date High Low Precip $ Snow $ SnowDepth $; run;

Random suggestions - don't call a data set final, that's a recipe for confusion later on especially as you work through it. If it's the cleansed input data maybe name it that? 

 

I also wouldn't recommend saving files in SASUSERv9.4, it's not a great place to store files, create a library for each project. It's also often locked down in many installations so you can't do it - SAS UE for example. 

 

You can convert after the fact using an INPUT() to convert a character variable to numeric, but best practice is to read it in correctly in the first place. Note you cannot use the same variable name to convert a variable type in SAS, you must use a new name. 

 

precip_numeric = input(precip, 8.);

You probably want:

 

data Adam.InputData;
   
infile "/home/adamrowland0/sasuser.v94/Stat 510 Final Data Set.csv" dsd;
informat date anydtdte.;
format date date9.;
*informat date date9.;

   input Date
   	 High 
   	 Low
   	 Precip  
   	 Snow 
   	 SnowDepth ;

  
run;

@BoboTheFool wrote:

I have a csv file which has 6 variables - Day, HIgh, Low, Precip, Snow, and SnowDepth.  In the CSV file the date displays as 1-Jan-2018, and if I make Day a character variable it reads in the same way, but I cannot do any manipulations with it, as it isn't in a SAS defined format.  Is there a way in SAS that I can change the format, or do I have to go to the CSV file and change all of my dates (although none of the options are SAS agreeable either)?

 

1-Jan-2018 -11.9 -24 0 0 2.99
2-Jan-2018 -0.9 -23.1 0 0 2.99

 

libname Adam '/home/adamrowland0';
data Adam.Final;
   infile "/home/adamrowland0/sasuser.v94/Stat 510 Final Data Set.csv" dsd;
   input Day $
   	 High 
   	 Low
   	 Precip  $
   	 Snow $
   	 SnowDepth $;

  
run;
 
Thank you for any help you can provide.
 
Adam

 

BoboTheFool
Calcite | Level 5
Thank you for the help! The reason I am reading in the last three variables as character variables is due to some of the values being 'T' instead of a number. While it automatically changes the 'T' to a '.' when it reads it in, the assignment I have necessitates me changing the 'T' to a proper missing denotation, '.', after the dataset has been read in, so I have to read them in as character values and then change them to numeric after changing them over. A lot of extra code, to be honest.
Reeza
Super User
That makes some sense, since you can have data sets with T to indicate missing or Z to indicate not measured and Y to indicate instrument failure. Basically you can have missing values for many reasons and sometimes that's denoted by using different letters to indicate the type of missing. You may need to handle them differently so reading them in as character and converting after the fact can make sense.
ballardw
Super User

@BoboTheFool wrote:
Thank you for the help! The reason I am reading in the last three variables as character variables is due to some of the values being 'T' instead of a number. While it automatically changes the 'T' to a '.' when it reads it in, the assignment I have necessitates me changing the 'T' to a proper missing denotation, '.', after the dataset has been read in, so I have to read them in as character values and then change them to numeric after changing them over. A lot of extra code, to be honest.

You can use custom informats to handle such things. Example:

proc format library=work;
invalue trace
'T'= .T
other = [F8.]
;
run;

data example;
   informat x trace.;
   input x;
datalines;
-11.9
T
17.3
;
run;

The above uses a special missing that by default shows the letter assigned with the special missing or a custom format could be used to display a different value. The numeric value is still missing and would not be used for calculations.

Tom
Super User Tom
Super User

@BoboTheFool wrote:
Thank you for the help! The reason I am reading in the last three variables as character variables is due to some of the values being 'T' instead of a number. While it automatically changes the 'T' to a '.' when it reads it in, the assignment I have necessitates me changing the 'T' to a proper missing denotation, '.', after the dataset has been read in, so I have to read them in as character values and then change them to numeric after changing them over. A lot of extra code, to be honest.

That is what the MISSING statement is for.  You use that to tell SAS which single letters it should read as special missing values.

missing t ;
data test;
  input id $ age ;
cards;
1 10
2 20
3  T
4  .
5 30
;

proc print; run;
proc means; run;
proc freq ;
 tables age / missing ;
run;
  

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 2034 views
  • 0 likes
  • 5 in conversation