DATA Step, Macro, Functions and more

Reading raw data from an external files

Reply
Occasional Contributor
Posts: 6

Reading raw data from an external files

Hi,

 

I have a space delimeted data from external file where 1 record has a missing value but there is no '.' sign there only space is given there So when i am trying using dlm option=' ' or dsd option or both it doesn't provide me correct data.

 

here is the data:

M 50 68 155
F   60 101
M 65 72 220
F 35 65 133
M 15 71 166

 

 

 

 

Here is the code:

Age(2nd column) is missing in the 2nd row of data.

 

data demo;
infile "demo.txt" dlm=' ';
input gender $ age height weight;
run;

 

or

 

data demo;
infile "demo.txt" dlm=' ' dsd;
input gender $ age height weight;
run;

 

or

data demo;
infile "demo.txt" dsd;
input gender $ age height weight;
run;

 

Super User
Posts: 19,878

Re: Reading raw data from an external files

Posted in reply to komalbatra

I would have expected the second to work. 

Can you post the log from that run? 

Occasional Contributor
Posts: 6

Re: Reading raw data from an external files

here is the log:

 


NOTE: The infile "demo.txt" is:
File Name=C:\Users\Acer\Desktop\demo.txt,
RECFM=V,LRECL=256

NOTE: Invalid data for age in line 2 1-10.
NOTE: Invalid data for height in line 3 1-11.
NOTE: Invalid data for weight in line 4 1-11.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9--
4 F 35 65 133 11
NOTE: Invalid data errors for file '"demo.txt"' occurred outside the printed range.
NOTE: Increase available buffer lines with the INFILE n= option.
gender=M 50 68 age=. height=. weight=. _ERROR_=1 _N_=1
NOTE: LOST CARD.
gender=M 15 71 age=. height=. weight=. _ERROR_=1 _N_=2
NOTE: 5 records were read from the infile "demo.txt".
The minimum record length was 10.
The maximum record length was 11.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.DEMO has 1 observations and 4 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds

Occasional Contributor
Posts: 6

Re: Reading raw data from an external files

Sorry.. the log i pasted that was for 3rd syntax. while running second there is no error but data is coming incorrect. see attached screenshot.

Occasional Contributor
Posts: 6

Re: Reading raw data from an external files

Posted in reply to komalbatra
i think there is 3 spaces between next code that was reason for 2 missing value coming in the output .if i try with 2 spaces its coming fine. Thank you.
Valued Guide
Posts: 505

Re: Reading raw data from an external files

Posted in reply to komalbatra
Related post

If you are using the 'old text editor' the command 'cuth' will change mutiple blanks to one blank.

/* T000800 Old text editor macro to compress mutiple blanks to one blank in editor (like compbl)


Old text editor macro to compress mutiple blanks to one blank in editor (like compbl)

HAVE
====

INPUT ID        DATE              ABREV       COUNT        REGION    SALES
297945203711 2009.08.31             MO          1            KS    60
297945203711 2009.08.31             MO          2            KU    50
297945303711 2009.08.31             SS          1            MA    60
297945303711 2009.08.31             SS          1            KS    60
297945403711 2009.08.31             JO          1            HB    NA

WANT
====

INPUT ID DATE ABREV COUNT REGION SALES
297945203711 2009.08.31 MO 1 KS 60
297945203711 2009.08.31 MO 2 KU 50
297945303711 2009.08.31 SS 1 MA 60
297945303711 2009.08.31 SS 1 KS 60
297945403711 2009.08.31 JO 1 HB NA


WORKING CODE (may work in enhanced editor?)
===========================================

  %do i=1 %to 20;
    c '  ' ' ' all;
  %end;

DETAILS
=======

Highlight the code where you want to change mutiple blanks to one blank
Type cuth on the command line

ie
Command ==> cuth

00001 INPUT ID        DATE              ABREV       COUNT        REGION    SALES
00002 297945203711 2009.08.31             MO          1            KS    60
00003 297945203711 2009.08.31             MO          2            KU    50
00004 297945303711 2009.08.31             SS          1            MA    60
00005 297945303711 2009.08.31             SS          1            KS    60
00006 297945403711 2009.08.31             JO          1            HB    NA


FULL SOLUTION
=============

PUT THE MACRO BELOW IN YOUR AUTOCALL FOLDER.
Make sure CMDMAC is turned in.
options cmdmac;

Highlight the lines you want to compress and type 'cuth' on the commandline.

%macro cuth/cmd;
  %do i=1 %to 20;
    c '  ' ' ' all;
  %end;
%mend cutout;
Super User
Posts: 11,343

Re: Reading raw data from an external files

Posted in reply to komalbatra

komalbatra wrote:
i think there is 3 spaces between next code that was reason for 2 missing value coming in the output .if i try with 2 spaces its coming fine. Thank you.

DSD treats 2 consecutive delimiters as one missing, when you have 3 as the example data shows then the result is 2 missing variables.

The blanks at column 2 and 3 are the first pair and then at columns 3 and 4 are a second pair.

 

If possible determine why the data source is placing a space for a missing value and see if it may be corrected there, especially if this is going to be a process repeated on files going forward such as every week or

 

If the problem is regular enough this edit to the input line before inputting the variables may fix the problem:

data demo;
infile datalines dlm=' ' dsd;
input @;
_infile_= tranwrd(_infile_,'   ','  ');
input gender $   age  height weight;
datalines;
M 50 68 155
F   60 101
M 65 72 220
F 35 65 133
M 15 71 166
;
run;
Ask a Question
Discussion stats
  • 6 replies
  • 135 views
  • 2 likes
  • 4 in conversation