Cross-Post From Text Mining Community Forum - Newbie Question

Frequent Contributor
Posts: 115

[Cross-posting from the Text Mining Community Forum]

I am fairly new to Text Mining and just went through the great course taught by some of the SAS folks. Just when I thought Enterprise Miner had a tremendous number of bells, whistles, and options, I now see that Text Mining has quite a lot of good stuff as well.

I am experimenting with importing some data. I will be reading in hundreds of thousands of cases, each with about ten fairly standard fields and two large text fields.

I THINK the best way to create this file is within Enterprise Miner instead of Enterprise Guide, but I am not 100% sure. My text file will be pipe-delimited (|).

But I wish to have a little control over the length of my text fields. Each of them will have up to 4,000 characters. I realize I can use the Guessing Rows property to try to make this work, but I am looking for a slightly finer level of control. Is there a way to specify the field length somewhere?

Also, while I am at it, is there a way to specify where I would like the SAS data file (.sas7bdat) to be placed and named?

Thank you.


All Replies
Super Contributor
Posts: 337

Re: Cross-Post From Text Mining Community Forum - Newbie Question

Hey Zachary,

Did you get what you needed from the Text Mining community?

There are a couple of ways to do this. Since your file is already delimited by pipes, you can read it in a DATA step with the dlm='|' option on the INFILE statement.

E.g. (replace C:\Users\test.txt with the path and name of your data):

infile 'C:\Users\test.txt' dsd dlm='|';

I grabbed this example from Indiana University's Knowledge base:

Data test;
  Infile "C:\Users\test.txt" DLM='|' DSD LRECL=400 FIRSTOBS=2;
  Input ID Age Sex $ Income Race $ Height Weight;
run;

Source: https://kb.iu.edu/d/bcjf

More info about reading delimited text files here: http://support.sas.com/techsup/technote/ts673.pdf
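
If you would rather let SAS work out the columns, PROC IMPORT is another route. This is only a minimal sketch: the output name work.test and the GUESSINGROWS value are assumptions for illustration, and the lengths of character columns still depend on the rows that get scanned, so it gives less control than a DATA step.

```sas
/* Sketch only: work.test and the guessingrows value are illustrative choices */
proc import datafile="C:\Users\test.txt"
            out=work.test
            dbms=dlm
            replace;
  delimiter='|';        /* pipe-delimited input */
  getnames=yes;         /* take variable names from the first row */
  guessingrows=10000;   /* rows scanned to decide types and lengths */
run;
```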

I hope this helps,

Miguel

Frequent Contributor
Posts: 115

Re: Cross-Post From Text Mining Community Forum - Newbie Question

Posted in reply to M_Maldonado

Thank you, Miguel. You are the first to respond to my query, and it helped.

On to the last-minute question I tacked on at the end...

"is there a way to specify where I would like the SAS data file (.sas7bdat) to be placed and named?"

Thanks.

Solution
‎03-13-2015 02:52 PM
SAS Employee
Posts: 106

Re: Cross-Post From Text Mining Community Forum - Newbie Question

Yes, for sure. Use a LIBNAME statement to specify where you want to write the SAS dataset. You can give the dataset any name you wish as long as it conforms to SAS naming conventions.

For example, this would write a dataset (you specify the name) to the path associated with the myplace library.

libname myplace 'your_path_goes_here';

Data myplace.dataset_name_you_choose_goes_here;
  Infile "C:\Users\test.txt" DLM='|' DSD LRECL=400 FIRSTOBS=2;
  Input ID Age Sex $ Income Race $ Height Weight;
run;
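
On the field-length part of the original question, a LENGTH statement placed before INPUT sets the length of each character variable, which is finer control than relying on Guessing Rows. A minimal sketch, assuming a file with just four fields: the dataset name text_data and the variable names Comment1 and Comment2 are hypothetical stand-ins for your own. Without the LENGTH statement, list input would give the character variables a length of 8. LRECL is raised here because a record holding two 4,000-character fields would be cut off at LRECL=400.

```sas
libname myplace 'your_path_goes_here';

/* Sketch only: text_data, Comment1 and Comment2 are hypothetical names */
data myplace.text_data;
  /* lrecl must be at least as long as the longest record in the file */
  infile "C:\Users\test.txt" dlm='|' dsd truncover lrecl=32767 firstobs=2;
  /* declare lengths before INPUT so the long text is not cut to 8 characters */
  length Comment1 $4000 Comment2 $4000;
  input ID Age Comment1 $ Comment2 $;
run;
```

TRUNCOVER keeps SAS from jumping to the next record when a line ends before all variables are read.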

🔒 This topic is solved and locked.
