Zachary
Obsidian | Level 7

[Cross-posting from the Text Mining Community Forum]

I am fairly new to Text Mining, and I just went through the great course taught by some of the SAS folks. Just when I thought Enterprise Miner had a tremendous number of bells, whistles, and options, I now see that Text Mining has quite a lot of good stuff as well.

I am experimenting with importing some data. I will be reading in hundreds of thousands of cases, each with about ten fairly standard fields and two large text fields.

I THINK the best way to create this file is within Enterprise Miner instead of Enterprise Guide, but I am not 100% sure. My text file will be delimited by pipe characters (|).

But I wish to have a little control over the length of my text fields. Each of them will have up to 4,000 characters. I realize I can use the Guessing Rows property to try to make this work, but I am looking for a finer level of control. Is there a way to specify the field length somewhere?

Also, while I am at it, is there a way to specify where I would like the SAS data file (.sas7bdat) to be placed and named?

Thank you.


3 REPLIES
M_Maldonado
Barite | Level 11

Hey Zachary,

Did you get what you needed from the Text Mining community?

There are a couple of ways to do this. Since your file is already delimited by pipes, you can read it in a DATA step with dlm='|' on the INFILE statement.

E.g. (replace C:\Users\test.txt with the path and name of your data file)

infile 'C:\Users\test.txt' dsd dlm='|';

I grabbed this example from Indiana University's Knowledge base:

Data test;

  Infile "C:\Users\test.txt" DLM='|' DSD LRECL=400 FIRSTOBS=2;

  Input ID Age Sex $ Income Race $ Hight Weight;

  run;

Source: https://kb.iu.edu/d/bcjf

More info about reading delimited text files here: http://support.sas.com/techsup/technote/ts673.pdf
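On the field-length part of the question, one common approach is to declare the lengths yourself with a LENGTH statement before the INPUT statement, so nothing is left to guessing. Here is a minimal sketch, assuming a trimmed-down layout: the dataset name work.survey_text and the variables Category, Comment1, and Comment2 are made up for illustration, and the path is the same placeholder as above.

```sas
/* Sketch only: dataset name, variable names, and path are placeholders. */
data work.survey_text;
  /* Declare lengths before INPUT; with list input, character variables */
  /* otherwise default to 8 bytes and long text would be truncated.     */
  length ID 8 Category $ 20 Comment1 $ 4000 Comment2 $ 4000;

  /* LRECL must be at least as long as the longest record in the file.  */
  /* 32767 leaves room for two 4,000-character fields plus the rest.    */
  /* TRUNCOVER keeps a short last field from pulling in the next line.  */
  infile "C:\Users\test.txt" dlm='|' dsd lrecl=32767 firstobs=2 truncover;

  input ID Category $ Comment1 $ Comment2 $;
run;
```

Note that the LRECL=400 in the borrowed example would be too small for records that contain 4,000-character text fields, so that value should be raised along with the variable lengths.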

I hope this helps,

Miguel

Zachary
Obsidian | Level 7

Thank you, Miguel. You are the first to respond to my query, and it helped.

On to the last-minute question I tacked on at the end...

"is there a way to specify where I would like the SAS data file (.sas7bdat) to be placed and named?"

Thanks.

rayIII
SAS Employee

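Yes, for sure. Use a LIBNAME statement to specify where you want to write the SAS dataset. You can give the dataset any name you wish as long as it conforms to SAS naming conventions (at most 32 characters, starting with a letter or underscore, and containing only letters, digits, and underscores).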

For example, this would write a dataset (you specify the name) to the path associated with the myplace library.

libname myplace 'your_path_goes_here';

Data myplace.dataset_name_you_choose_goes_here;

  Infile "C:\Users\test.txt" DLM='|' DSD LRECL=400 FIRSTOBS=2;

  Input ID Age Sex $ Income Race $ Hight Weight;

  run;
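To confirm that the file landed where you expected and that the variables got the lengths you wanted, a quick check with PROC CONTENTS can help. This is a small sketch, assuming the same placeholder path as above; the dataset name survey_text is hypothetical, so substitute whatever name you chose.

```sas
/* Sketch only: the library path and dataset name are placeholders. */
libname myplace 'your_path_goes_here';

/* Lists the physical file name plus each variable's type and length. */
/* VARNUM prints the variables in creation order rather than A-Z.     */
proc contents data=myplace.survey_text varnum;
run;
```

The output should show the .sas7bdat file under the folder assigned to the library, and the Len column shows the stored length of each variable.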
