BookmarkSubscribeRSS Feed
dunga
Obsidian | Level 7

Hi SAS Forum,
I am having the attached “.txt” file that should be converted to a SAS data set.

It has following 8 variables. And has 5 observations..


Category
SurName 
First_Name 
salary_1 
salary_2 
company 
Profession 
Year

 

Q:

When I imported using "Import Data" functionality in SAS, first 3 records get correctly imported.

Howerver, a part of values for "profession" variable for last 3 records get running into the next variable, which is "Income".

I have attahced an Excel table to show the wrong and correct answers.

 

Could you help me to correctly import the ".txt" file.

Thanks

 

Mirisa

8 REPLIES 8
dunga
Obsidian | Level 7

Hi Forum again,

This is the txt file needs to be converted to a SAS data set.

Mirisa

Jagadishkatam
Amethyst | Level 16

With the below code i was able to generate the dataset want. The delimiter need to be used is tab like delimiter='09'x

 

the only exemption i observed is that the last three observations do not have a year, the people held multiple professions. So had to slightly update the txt file as the multiple professions were separated by tab instead i updated to a single space. due to which now i get year blank. Hope this is what you are expecting.

 

proc import datafile='~path\post_2017_Jul_15.txt' out=want
dbms=dlm replace;
delimiter='09'x;
getnames=yes;
run;

 

Thanks,
Jag
LinusH
Tourmaline | Level 20
Since there are tons of ways to import data in SAS, please clarify how you are doing the import, attaching the code and log if possible.
Data never sleeps
ballardw
Super User

What options did you use when importing the text?

It appears that you text may be tab delimited text. And that you have tabs beford "Member (Part-time) / Membre (à temps partiel)" and "Director of Clinical Training" which why those appear under the column heading "Year".

 

It helps to talk about the column labels as appearing in the data NOT a concept like "Income" which does not appear in the data

 

If you used the wizard to import the data the log should contain data step code used to read the file. Post that along with any messages from from log. Please use a code box opened with the forum {i} menu icon to preserve formatting of the code as the message windows will do some text reformatting.

 

 

dunga
Obsidian | Level 7

 

/* --------------------------------------------------------------------
   Code generated by a SAS task
   
   Generated on Sunday, July 16, 2017 at 10:37:09 AM
   By task:     Import Data Wizard
   
   Source file: xxx:\post_2017_Jul_15.txt
   Server:      Local File System
   
   Output data: WORK.post_2017_Jul_151
   Server:      
   
   Note: In preparation for running the following code, the Import
   Data wizard has used internal routines to transfer the source data
   file from the local file system to xxxxxxx. There is no SAS code
   available to represent this action.
   -------------------------------------------------------------------- */

DATA WORK.post_2017_Jul_151;
    LENGTH
        Category         $ 14
        SurName          $ 6
        First_Name       $ 7
        salary_1           8
        salary_2           8
        company          $ 25
        Profession       $ 63
        Year             $ 45 ;
    FORMAT
        Category         $CHAR14.
        SurName          $CHAR6.
        First_Name       $CHAR7.
        salary_1         DOLLAR12.2
        salary_2         DOLLAR7.2
        company          $CHAR25.
        Profession       $CHAR63.
        Year             $CHAR45. ;
    INFORMAT
        Category         $CHAR14.
        SurName          $CHAR6.
        First_Name       $CHAR7.
        salary_1         DOLLAR12.
        salary_2         DOLLAR7.
        company          $CHAR25.
        Profession       $CHAR63.
        Year             $CHAR45. ;
    INFILE 'T:\SASWork\_xxxxxxxxxxxx_\#LN00038'
        LRECL=181
        ENCODING="WLATIN1"
        TERMSTR=CRLF
        DLM='7F'x
        MISSOVER
        DSD ;
    INPUT
        Category         : $CHAR14.
        SurName          : $CHAR6.
        First_Name       : $CHAR7.
        salary_1         : ?? DOLLAR12.
        salary_2         : ?? DOLLAR7.
        company          : $CHAR25.
        Profession       : $CHAR63.
        Year             : $CHAR45. ;
RUN;

Hi Jag, LinusH, Ballrdw,

 

 

Thanks to each one of you. Please see the log attahced. Also attahced screenshots to show how I have imported.

 

Hi Jag,

Actually last three observations should not have a year, which is what I want.

I think you have manually updated the txt file to a single space to get the correct answer (as the multiple professions were separated by tab instead you updated to a single space).

 

However, my original file is having over million records, so I cannot update it to a single space manually.

 

Could you help me.

 

 

 

 

 

art297
Opal | Level 21

Not sure if you want to cleanup the non-ascii characters in Profession, or what the maximum length of that field is, but the following will let you read all of the fields and let you adjust as necessary:

data want (drop=csalary_1 position length);
  infile '/folders/myfolders/post_2017_Jul_15.txt' dlm='09'x lrecl=32767 firstobs=2;
  input @;
  if not anyalpha(scan(_infile_,-1,'09'x)) then do;
    year=input(scan(_infile_,-1,'09'x),4.);
    CALL SCAN(_infile_, -1, position, length, '09'x);
    _infile_=substr(_infile_,1,position-2);
  end;
  informat
    Category         $14.
    SurName          $20.
    First_Name       $20.
    csalary_1         $14.
    salary_2         DOLLAR7.
    company          $25.
  ;
  length profession $512;
  input Category SurName First_Name csalary_1 salary_2 company@;
  salary_1=input(compress(csalary_1,'"'),dollar12.);
  CALL SCAN(_infile_, 6, position, length, '09'x);
  _infile_=translate(substr(_infile_,position+length+1),'/','09'x);
  profession=_infile_;
  input;
run;

Art, CEO, AnalystFinder.com

 

dunga
Obsidian | Level 7

Thanks Art for your code.

It works correctly for the sample dataset I have attahced.

However, I was not able to adjust it to my big dataset (but it is my problem).

Thanks

Mirisa

 

art297
Opal | Level 21

In order for anyone to be able to help you would have to post a couple of the lines that didn't import correctly.

 

Art, CEO, AnalystFinder.com

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2534 views
  • 2 likes
  • 5 in conversation