BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rbivens
Fluorite | Level 6

My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.

 

I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.

 

 

1    Exxon Mobil                  United States     $433.5B   $41.1B   $331.1B   $407.4B
2    JPMorgan Chase               United States     $110.8B     $19B $2,265.8B   $170.1B
3    General Electric             United States     $147.3B   $14.2B   $717.2B   $213.7B
4    Royal Dutch Shell            Netherlands       $470.2B   $30.9B   $340.5B   $227.6B
5    ICBC                         China              $82.6B   $25.1B $2,039.1B   $237.4B
6    HSBC Holdings                United Kingdom      $102B   $16.2B   $2,550B   $164.3B
7    PetroChina                   China             $310.1B   $20.6B   $304.7B   $294.7B
8    Berkshire Hathaway           United States     $143.7B   $10.3B   $392.6B   $202.2B
9    Wells Fargo                  United States      $87.6B   $15.9B $1,313.9B   $178.7B
10   Petrobras-Petróleo Brasil    Brazil            $145.9B   $20.1B   $319.4B     $180B

As you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.

 

 

/* Generated Code (IMPORT) */
/* Source File: BigCompanies.txt */
/* Source Path: /folders/myfolders */
/* Code generated on: 2/18/20, 12:47 PM */

%web_drop_table(WORK.Comp);


FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1';

PROC IMPORT DATAFILE=REFFILE
	DBMS=DLM
	OUT=WORK.Comp
	REPLACE;
	GETNAMES=NO;
	GUESSINGROWS=max;
	Delimiter='20'x;
RUN;

PROC CONTENTS DATA=WORK.Comp; RUN;

PROC PRINT DATA= Work.COMP (Obs=12);
RUN;

%web_open_table(WORK.Comp);

I'm still a beginner in SAS but I don't understand why it produces this

 

 

 

If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks. image.png

 

 

 

 

 

 

All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.

 

Any help would be nice. I just have a sinking feeling it's really simple.

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

It looks like you have what's called a fixed width file. So each field starts at a specific location and you need to read from that location. Unfortunately you have to specify the locations or have a file that does. Usually these files come with documentation that indicates the layout. 

 

The second example here is one method of reading in the data:

https://stats.idre.ucla.edu/sas/modules/inputting-data-into-sas/

 

You cannot use PROC IMPORT, you need to write a data step to import this data set. Are you sure it's not tab delimited? Can you attach the actual file?

 


@rbivens wrote:

My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.

 

I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.

 

 

1    Exxon Mobil                  United States     $433.5B   $41.1B   $331.1B   $407.4B
2    JPMorgan Chase               United States     $110.8B     $19B $2,265.8B   $170.1B
3    General Electric             United States     $147.3B   $14.2B   $717.2B   $213.7B
4    Royal Dutch Shell            Netherlands       $470.2B   $30.9B   $340.5B   $227.6B
5    ICBC                         China              $82.6B   $25.1B $2,039.1B   $237.4B
6    HSBC Holdings                United Kingdom      $102B   $16.2B   $2,550B   $164.3B
7    PetroChina                   China             $310.1B   $20.6B   $304.7B   $294.7B
8    Berkshire Hathaway           United States     $143.7B   $10.3B   $392.6B   $202.2B
9    Wells Fargo                  United States      $87.6B   $15.9B $1,313.9B   $178.7B
10   Petrobras-Petróleo Brasil    Brazil            $145.9B   $20.1B   $319.4B     $180B

As you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.

 

 

/* Generated Code (IMPORT) */
/* Source File: BigCompanies.txt */
/* Source Path: /folders/myfolders */
/* Code generated on: 2/18/20, 12:47 PM */

%web_drop_table(WORK.Comp);


FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1';

PROC IMPORT DATAFILE=REFFILE
	DBMS=DLM
	OUT=WORK.Comp
	REPLACE;
	GETNAMES=NO;
	GUESSINGROWS=max;
	Delimiter='20'x;
RUN;

PROC CONTENTS DATA=WORK.Comp; RUN;

PROC PRINT DATA= Work.COMP (Obs=12);
RUN;

%web_open_table(WORK.Comp);

I'm still a beginner in SAS but I don't understand why it produces this

 

 

 

If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks. image.png

 

 

 

 

 

 

All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.

 

Any help would be nice. I just have a sinking feeling it's really simple.


 

View solution in original post

9 REPLIES 9
Reeza
Super User

It looks like you have what's called a fixed width file. So each field starts at a specific location and you need to read from that location. Unfortunately you have to specify the locations or have a file that does. Usually these files come with documentation that indicates the layout. 

 

The second example here is one method of reading in the data:

https://stats.idre.ucla.edu/sas/modules/inputting-data-into-sas/

 

You cannot use PROC IMPORT, you need to write a data step to import this data set. Are you sure it's not tab delimited? Can you attach the actual file?

 


@rbivens wrote:

My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.

 

I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.

 

 

1    Exxon Mobil                  United States     $433.5B   $41.1B   $331.1B   $407.4B
2    JPMorgan Chase               United States     $110.8B     $19B $2,265.8B   $170.1B
3    General Electric             United States     $147.3B   $14.2B   $717.2B   $213.7B
4    Royal Dutch Shell            Netherlands       $470.2B   $30.9B   $340.5B   $227.6B
5    ICBC                         China              $82.6B   $25.1B $2,039.1B   $237.4B
6    HSBC Holdings                United Kingdom      $102B   $16.2B   $2,550B   $164.3B
7    PetroChina                   China             $310.1B   $20.6B   $304.7B   $294.7B
8    Berkshire Hathaway           United States     $143.7B   $10.3B   $392.6B   $202.2B
9    Wells Fargo                  United States      $87.6B   $15.9B $1,313.9B   $178.7B
10   Petrobras-Petróleo Brasil    Brazil            $145.9B   $20.1B   $319.4B     $180B

As you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.

 

 

/* Generated Code (IMPORT) */
/* Source File: BigCompanies.txt */
/* Source Path: /folders/myfolders */
/* Code generated on: 2/18/20, 12:47 PM */

%web_drop_table(WORK.Comp);


FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1';

PROC IMPORT DATAFILE=REFFILE
	DBMS=DLM
	OUT=WORK.Comp
	REPLACE;
	GETNAMES=NO;
	GUESSINGROWS=max;
	Delimiter='20'x;
RUN;

PROC CONTENTS DATA=WORK.Comp; RUN;

PROC PRINT DATA= Work.COMP (Obs=12);
RUN;

%web_open_table(WORK.Comp);

I'm still a beginner in SAS but I don't understand why it produces this

 

 

 

If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks. image.png

 

 

 

 

 

 

All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.

 

Any help would be nice. I just have a sinking feeling it's really simple.


 

rbivens
Fluorite | Level 6

Sure I can attach it. And yes I have made sure.

 

 

When it is read as a TABWhen it is read as a TAB

Is not using Proc Import just SAS U edition only or SAS in general?

Reeza
Super User
Yeah, you'll have to write the code for that. If you look at your first line you'll see you have fields like 'Exxon Mobil" so you cannot use space as your delimiter since there are spaces in the field as well. Note that you'll want to read in your amounts as characters and convert them due to the B at the end or you'll have to write your own informat.
rbivens
Fluorite | Level 6

This might be one of those assignment examples where it is supposed to fail and we are to explain the differences. I did import it using the INFILE method and it came out fine I was just worried that I was doing something horribly wrong as I couldn't find any example on going over this in my notes or online.

Reeza
Super User
I'd recommend asking your instructor then.
rbivens
Fluorite | Level 6

So I found an email from the teacher talking about the problem in question. He got it to work via the import Wizard but that it would be time consuming but I have enough trouble using the Import Wizard for SAS UE he also said he wouldn't take points off if we can still describe our process even if we failed. 

 

Though now I have to go back and figure out how to read the $14B as a numeric via informat. 

 

Thank you for your help so far!

Reeza
Super User
This would be the same regardless if it's SAS UE or SAS Base or SAS EG - you'd have to do the same thing. I think the GUI let's you point and click to generate the code in Base/EG but I'm not sure Studio has that feature. It's like trying to read it into Excel, you'd have to open Excel, import data and then specify where each field started/ended manually by clicking.
Ksharp
Super User

Try using double blanks as delimiter by DLMSTR= option

 

FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1'  DLMSTR='  ' ;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1108 views
  • 3 likes
  • 3 in conversation