My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.
I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.
1 Exxon Mobil United States $433.5B $41.1B $331.1B $407.4B 2 JPMorgan Chase United States $110.8B $19B $2,265.8B $170.1B 3 General Electric United States $147.3B $14.2B $717.2B $213.7B 4 Royal Dutch Shell Netherlands $470.2B $30.9B $340.5B $227.6B 5 ICBC China $82.6B $25.1B $2,039.1B $237.4B 6 HSBC Holdings United Kingdom $102B $16.2B $2,550B $164.3B 7 PetroChina China $310.1B $20.6B $304.7B $294.7B 8 Berkshire Hathaway United States $143.7B $10.3B $392.6B $202.2B 9 Wells Fargo United States $87.6B $15.9B $1,313.9B $178.7B 10 Petrobras-Petróleo Brasil Brazil $145.9B $20.1B $319.4B $180B
As you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.
/* Generated Code (IMPORT) */
/* Source File: BigCompanies.txt */
/* Source Path: /folders/myfolders */
/* Code generated on: 2/18/20, 12:47 PM */
%web_drop_table(WORK.Comp);
FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1';
PROC IMPORT DATAFILE=REFFILE
DBMS=DLM
OUT=WORK.Comp
REPLACE;
GETNAMES=NO;
GUESSINGROWS=max;
Delimiter='20'x;
RUN;
PROC CONTENTS DATA=WORK.Comp; RUN;
PROC PRINT DATA= Work.COMP (Obs=12);
RUN;
%web_open_table(WORK.Comp);
I'm still a beginner in SAS but I don't understand why it produces this
If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks.
All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.
Any help would be nice. I just have a sinking feeling it's really simple.
It looks like you have what's called a fixed width file. So each field starts at a specific location and you need to read from that location. Unfortunately you have to specify the locations or have a file that does. Usually these files come with documentation that indicates the layout.
The second example here is one method of reading in the data:
https://stats.idre.ucla.edu/sas/modules/inputting-data-into-sas/
You cannot use PROC IMPORT, you need to write a data step to import this data set. Are you sure it's not tab delimited? Can you attach the actual file?
@rbivens wrote:
My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.
I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.
1 Exxon Mobil United States $433.5B $41.1B $331.1B $407.4B 2 JPMorgan Chase United States $110.8B $19B $2,265.8B $170.1B 3 General Electric United States $147.3B $14.2B $717.2B $213.7B 4 Royal Dutch Shell Netherlands $470.2B $30.9B $340.5B $227.6B 5 ICBC China $82.6B $25.1B $2,039.1B $237.4B 6 HSBC Holdings United Kingdom $102B $16.2B $2,550B $164.3B 7 PetroChina China $310.1B $20.6B $304.7B $294.7B 8 Berkshire Hathaway United States $143.7B $10.3B $392.6B $202.2B 9 Wells Fargo United States $87.6B $15.9B $1,313.9B $178.7B 10 Petrobras-Petróleo Brasil Brazil $145.9B $20.1B $319.4B $180BAs you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.
/* Generated Code (IMPORT) */ /* Source File: BigCompanies.txt */ /* Source Path: /folders/myfolders */ /* Code generated on: 2/18/20, 12:47 PM */ %web_drop_table(WORK.Comp); FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1'; PROC IMPORT DATAFILE=REFFILE DBMS=DLM OUT=WORK.Comp REPLACE; GETNAMES=NO; GUESSINGROWS=max; Delimiter='20'x; RUN; PROC CONTENTS DATA=WORK.Comp; RUN; PROC PRINT DATA= Work.COMP (Obs=12); RUN; %web_open_table(WORK.Comp);
I'm still a beginner in SAS but I don't understand why it produces this
If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks.
All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.
Any help would be nice. I just have a sinking feeling it's really simple.
It looks like you have what's called a fixed width file. So each field starts at a specific location and you need to read from that location. Unfortunately you have to specify the locations or have a file that does. Usually these files come with documentation that indicates the layout.
The second example here is one method of reading in the data:
https://stats.idre.ucla.edu/sas/modules/inputting-data-into-sas/
You cannot use PROC IMPORT, you need to write a data step to import this data set. Are you sure it's not tab delimited? Can you attach the actual file?
@rbivens wrote:
My problem involves a non-tab delimited txt file that causes errors when using PROC IMPORT.
I've scoured the community page and StackOverflow and can't seem to find what I need. This is for an assignment so I have to use Proc Import on this data below. (It's the BigCompanies.txt dataset if that information helps) I am also using SAS University Edition.
1 Exxon Mobil United States $433.5B $41.1B $331.1B $407.4B 2 JPMorgan Chase United States $110.8B $19B $2,265.8B $170.1B 3 General Electric United States $147.3B $14.2B $717.2B $213.7B 4 Royal Dutch Shell Netherlands $470.2B $30.9B $340.5B $227.6B 5 ICBC China $82.6B $25.1B $2,039.1B $237.4B 6 HSBC Holdings United Kingdom $102B $16.2B $2,550B $164.3B 7 PetroChina China $310.1B $20.6B $304.7B $294.7B 8 Berkshire Hathaway United States $143.7B $10.3B $392.6B $202.2B 9 Wells Fargo United States $87.6B $15.9B $1,313.9B $178.7B 10 Petrobras-Petróleo Brasil Brazil $145.9B $20.1B $319.4B $180BAs you can see there is a large space between all of the variables. So I've tried manually inserting the proc import and using the import wizard to help.
/* Generated Code (IMPORT) */ /* Source File: BigCompanies.txt */ /* Source Path: /folders/myfolders */ /* Code generated on: 2/18/20, 12:47 PM */ %web_drop_table(WORK.Comp); FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1'; PROC IMPORT DATAFILE=REFFILE DBMS=DLM OUT=WORK.Comp REPLACE; GETNAMES=NO; GUESSINGROWS=max; Delimiter='20'x; RUN; PROC CONTENTS DATA=WORK.Comp; RUN; PROC PRINT DATA= Work.COMP (Obs=12); RUN; %web_open_table(WORK.Comp);
I'm still a beginner in SAS but I don't understand why it produces this
If I don't have Max for guessingrows it leaves out all of the rest of the variables but it has 55 variables and most of them are blanks.
All of the code I have tried is a simple modification of what is above. Trying different things. If I try having it read as a Tab instead of spaces all of the information is displayed in one line.
Any help would be nice. I just have a sinking feeling it's really simple.
Sure I can attach it. And yes I have made sure.
Is not using Proc Import just SAS U edition only or SAS in general?
This might be one of those assignment examples where it is supposed to fail and we are to explain the differences. I did import it using the INFILE method and it came out fine I was just worried that I was doing something horribly wrong as I couldn't find any example on going over this in my notes or online.
So I found an email from the teacher talking about the problem in question. He got it to work via the import Wizard but that it would be time consuming but I have enough trouble using the Import Wizard for SAS UE he also said he wouldn't take points off if we can still describe our process even if we failed.
Though now I have to go back and figure out how to read the $14B as a numeric via informat.
Thank you for your help so far!
Try using double blanks as delimiter by DLMSTR= option
FILENAME REFFILE '/folders/myfolders/BigCompanies.txt' ENCODING='wlatin1' DLMSTR=' ' ;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.