BookmarkSubscribeRSS Feed
avil1991
Calcite | Level 5

Hi

Let's say I have a dataset as below. I want to read the data but I am stuck with column names separated by space "Date Of Birth"

I can easily replace space with "_" but what if I have a large number of variables?  or a bracket or + sign in the column name.

 

Also with the informat  I can read the date variable for 1 particular date type but what about others?

 

NameMarks( secd)Age+Date of Birth
A833228-May-90
B842513-Feb-1999
C302626-10-1990
D1006316-Mar-00
5 REPLIES 5
Reeza
Super User

Is this a text file?

If so, please post a small sample. It's important to post the original file (or a very close replica) to see if there are special characters.

avil1991
Calcite | Level 5

Sorry, its just a general question which came to me while studying. There are no such files.

Let's assume its a text or a CSV file. 

Reeza
Super User

Then the best answer is send the file back and ask for a better formatted file.

 

90% of the time, even if the field is date of birth there is another delimiter, ie a comma which makes the file a CSV file

Otherwise the work around tends to be to read the data but skip the first line so you manually write your input step and bypass the first line entirely. If you absolutely need it automated, you can read the first line in a second step. 

There are posts on here that show how using this approach by Ksharp.

ballardw
Super User

And if you have reason to believe that your date column has differing formats of the original content then behaps the ANYDTDTE informat will work. Please see the documentation.

Tom
Super User Tom
Super User

When reading the file you normally skip the header line. So spaces in the variable names shouldn't matter.

data want ;
  infile 'myfile.dat' firstobs=2 truncover ;
  length name $20 Marks Age 8 DOBchar $12 ;
  input name -- DOBchar ;
run;

If you are using PROC IMPORT or some other tool that will try to guess at what your file contains then it will have algorithms for converting the invalid characters into valid variable names.

 

As to converting the gibberish in the DOB field you probably will need to read it as character and write your own algorithm to see if you can convert it to a real date.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 838 views
  • 0 likes
  • 4 in conversation