SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Reading a Text file with a header record format and separate data record format

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 6
Accepted Solution

Reading a Text file with a header record format and separate data record format

Hi,

 

I am having trouble reading a file into sas because it has two different data record formats, a header format and data format. 

 

I have attached the format of the data below, I am fairly new to sas and I am not sure how to make the data read with header(columns) the Date (Hour month day year) as the first column and then the variables of the record format.  

 

 

I am not even quite sure how to start, I have uploaded files to sas before however, usually they are arranged in a neater format, I attacthed the data file if that helps. 

 

Any help or even a starting point could help.

 

Header: 

 

Variable   Columns  Type

---------------------------------

HEADREC       1-  1  Character

ID            2- 12  Character

YEAR         14- 17  Integer

MONTH        19- 20  Integer

DAY          22- 23  Integer

HOUR         25- 26  Integer

RELTIME      28- 31  Integer

NUMLEV       33- 36  Integer

P_SRC        38- 45  Character

NP_SRC       47- 54  Character

LAT          56- 62  Integer

LON          64- 71  Integer

 

 

 

Data Record Format:    

LVLTYP1         1-  1   Integer

LVLTYP2         2-  2   Integer

ETIME           4-  8   Integer

PRESS          10- 15   Integer

PFLAG          16- 16   Character

GPH            17- 21   Intege

rZFLAG          22- 22   Character

TEMP           23- 27   Integer

TFLAG          28- 28   Character

RH             29- 33   Integer

DPDP           35- 39   Integer

WDIR           41- 45   Integer

WSPD           47- 51   Integer

 

 

 

example of the data

 

#CAM00071109 1952 10 31 03 9999 14 ncar-ccd 506850 -1273758
21 -9999 101400B 19 72B 933 -9999 -9999 -9999
10 -9999 100000 134B 72B 933 -9999 -9999 -9999
10 -9999 85000 1439B -8B 863 -9999 -9999 -9999
10 -9999 70000 2978B -102B 748 -9999 -9999 -9999
20 -9999 57600 -9999 -208B 838 -9999 -9999 -9999
20 -9999 55000 -9999 -235B 614 -9999 -9999 -9999
10 -9999 50000 5480B -278B 596 -9999 -9999 -9999
10 -9999 40000 7041B -405B 671 -9999 -9999 -9999
10 -9999 30000 8955B -505B-9999 -9999 -9999 -9999
20 -9999 27300 -9999 -530A-9999 -9999 -9999 -9999
10 -9999 20000 11595B -482B-9999 -9999 -9999 -9999
10 -9999 15000 13481B -502A-9999 -9999 -9999 -9999
10 -9999 10000 16142B -505A-9999 -9999 -9999 -9999
20 -9999 6200 -9999 -530A-9999 -9999 -9999 -9999


Accepted Solutions
Solution
‎04-24-2017 03:02 PM
Super User
Posts: 10,500

Re: Reading a Text file with a header record format and separate data record format

You may want to repost your example data in a code box opened by using the forum {i} menu icon.

as posted when I copy the data to a text file your column 23 has the letter B for the first data record which is not valid for the TEMP variable as numeric.

 

Sometimes the forum main window "helps" by reformatting text and may have removed sequential spaces or possibly tabs.

 

This may be a starting example. I reformatted the data in the editor to better match the input descriptions. The numerics are defaulting to best. formats which look like integers.

data read;
input @@;
if substr(_infile_,1,1)="#" then
input 
   HEADREC   $   1 
   ID        $   2- 12  
   YEAR         14- 17  
   MONTH        19- 20  
   DAY          22- 23  
   HOUR         25- 26  
   RELTIME      28- 31  
   NUMLEV       33- 36  
   P_SRC    $   38- 45  
   NP_SRC   $   47- 54  
   LAT          56- 62  
   LON          64- 71  
   ;
else input 
LVLTYP1         1   
LVLTYP2         2   
ETIME           4-  8   
PRESS          10- 15   
PFLAG     $    16- 16  
GPH            17   
rZFLAG    $    22- 22  
TEMP           23- 27   
TFLAG     $    28- 28  
RH             29- 33   
DPDP           35- 39   
WDIR           41- 45   
WSPD           47- 51   
; 
datetime=dhms(mdy(month,day,year),hour,0,0);
format datetime datetime16.;
datalines;
#CAM00071109 1952 10 31 03 9999 14  ncar-ccd 506850 -1273758
21 -9999 101400B 19      72B 933 -9999 -9999 -9999
10 -9999 100000 134B     72B 933 -9999 -9999 -9999
10 -9999 85000 1439B     -8B 863 -9999 -9999 -9999
10 -9999 70000 2978B   -102B 748 -9999 -9999 -9999
20 -9999 57600 -9999   -208B 838 -9999 -9999 -9999
20 -9999 55000 -9999   -235B 614 -9999 -9999 -9999
10 -9999 50000 5480B   -278B 596 -9999 -9999 -9999
10 -9999 40000 7041B   -405B 671 -9999 -9999 -9999
10 -9999 30000 8955B   -505B-9999 -9999 -9999 -9999
20 -9999 27300 -9999   -530A-9999 -9999 -9999 -9999
10 -9999 20000 11595B  -482B-9999 -9999 -9999 -9999
10 -9999 15000 13481B  -502A-9999 -9999 -9999 -9999
10 -9999 10000 16142B  -505A-9999 -9999 -9999 -9999
20 -9999 6200  -9999   -530A-9999 -9999 -9999 -9999
;
run;

If you want the values from the header record to be one the data records add a RETAIN statement listing the header variables of interest. I also created a datetime variable as putting date/time information together is often better than the separate fields. Note though that if any of the incoming values are not valid dates, ie month=2 and day=31, then the datetime is going to through an error about invalid date.

 

View solution in original post


All Replies
Super User
Posts: 17,823

Re: Reading a Text file with a header record format and separate data record format

If the file is formatted except header, write a data step to import the remaining records and ignore the first record entirely. 

Occasional Contributor
Posts: 6

Re: Reading a Text file with a header record format and separate data record format

It has a changing Header such as

 

The data goes from 1952 to Today, The header contains the date/time, and below it is data associated with that

 

#CAM00071109 1951 02 01 03 9999 14 ncar-ccd 506850 -1273758
21 -9999 102000B 19 -5B 743 -9999 -9999 -9999
10 -9999 100000 180B -18B 718 -9999 -9999 -9999
10 -9999 85000 1451B -98B 852 -9999 -9999 -9999
20 -9999 81400 -9999 -92B 924 -9999 -9999 -9999
20 -9999 75700 -9999 -95B 946 -9999 -9999 -9999
10 -9999 70000 2950B -122B 922 -9999 -9999 -9999
20 -9999 55500 -9999 -228B 882 -9999 -9999 -9999
20 -9999 52700 -9999 -245B 556 -9999 -9999 -9999
10 -9999 50000 5447B -272B 604 -9999 -9999 -9999
10 -9999 30000 8946B -500B-9999 -9999 -9999 -9999
20 -9999 21700 -9999 -525A-9999 -9999 -9999 -9999
10 -9999 20000 11576B -518B-9999 -9999 -9999 -9999
20 -9999 16400 -9999 -502B-9999 -9999 -9999 -9999
20 -9999 13200 -9999 -525B-9999 -9999 -9999 -9999
#CAM00071109 1951 02 03 03 9999 7 ncar-ccd 506850 -1273758
21 -9999 99900B 19 35B 952 -9999 -9999 -9999
20 -9999 93500 -9999 22B 866 -9999 -9999 -9999
10 -9999 85000 1317B -28B 928 -9999 -9999 -9999
20 -9999 74500 -9999 -102B 785 -9999 -9999 -9999
10 -9999 30000 8769B -518B-9999 -9999 -9999 -9999
20 -9999 26200 -9999 -538A-9999 -9999 -9999 -9999
20 -9999 21700 -9999 -535A-9999 -9999 -9999 -9999
#CAM00071109 1951 02 04 03 9999 11 ncar-ccd 506850 -1273758
21 -9999 100900B 19 25B 951 -9999 -9999 -9999
10 -9999 100000 91B 25B 911 -9999 -9999 -9999
20 -9999 95000 -9999 18B 804 -9999 -9999 -9999
10 -9999 85000 1393B -42B 820 -9999 -9999 -9999
10 -9999 70000 2902B -122B 782 -9999 -9999 -9999
20 -9999 53000 -9999 -232B 615 -9999 -9999 -9999
10 -9999 50000 5404B -265B 606 -9999 -9999 -9999
10 -9999 30000 8882B -535B-9999 -9999 -9999 -9999
20 -9999 24000 -9999 -608A-9999 -9999 -9999 -9999
10 -9999 20000 11445B -595B-9999 -9999 -9999 -9999
20 -9999 16400 -9999 -585A-9999 -9999 -9999 -9999

Super User
Posts: 17,823

Re: Reading a Text file with a header record format and separate data record format

[ Edited ]

The # symbol is only in front of your 'header' lines?

 

If so, create three input statements. 

The first reads the first character, if it's a # use the input from header record line. 

The second reads the record otherwise. You can use two trailing @@ to keep the pointer on the same line.

 

This isn't a full solution but gives you the idea. Everything is read in as character here but you should customize that.

 

data have;
infile cards truncover;
informat var1-var10 vart1-vart9 $20.;
input @1 char1 $1. @@;
if char1='#' then 
   input var1-var10 $;
else input varT1-varT9 $ ;
retain var1-var10;
cards;
#CAM00071109 1951 02 01 03 9999 14 ncar-ccd 506850 -1273758
21 -9999 102000B 19 -5B 743 -9999 -9999 -9999
10 -9999 100000 180B -18B 718 -9999 -9999 -9999
10 -9999 85000 1451B -98B 852 -9999 -9999 -9999
20 -9999 81400 -9999 -92B 924 -9999 -9999 -9999
20 -9999 75700 -9999 -95B 946 -9999 -9999 -9999
10 -9999 70000 2950B -122B 922 -9999 -9999 -9999
20 -9999 55500 -9999 -228B 882 -9999 -9999 -9999
20 -9999 52700 -9999 -245B 556 -9999 -9999 -9999
10 -9999 50000 5447B -272B 604 -9999 -9999 -9999
10 -9999 30000 8946B -500B-9999 -9999 -9999 -9999
20 -9999 21700 -9999 -525A-9999 -9999 -9999 -9999
10 -9999 20000 11576B -518B-9999 -9999 -9999 -9999
20 -9999 16400 -9999 -502B-9999 -9999 -9999 -9999
20 -9999 13200 -9999 -525B-9999 -9999 -9999 -9999
#CAM00071109 1951 02 03 03 9999 7 ncar-ccd 506850 -1273758
21 -9999 99900B 19 35B 952 -9999 -9999 -9999
20 -9999 93500 -9999 22B 866 -9999 -9999 -9999
10 -9999 85000 1317B -28B 928 -9999 -9999 -9999
20 -9999 74500 -9999 -102B 785 -9999 -9999 -9999
10 -9999 30000 8769B -518B-9999 -9999 -9999 -9999
20 -9999 26200 -9999 -538A-9999 -9999 -9999 -9999
20 -9999 21700 -9999 -535A-9999 -9999 -9999 -9999
;;;;
run;
Solution
‎04-24-2017 03:02 PM
Super User
Posts: 10,500

Re: Reading a Text file with a header record format and separate data record format

You may want to repost your example data in a code box opened by using the forum {i} menu icon.

as posted when I copy the data to a text file your column 23 has the letter B for the first data record which is not valid for the TEMP variable as numeric.

 

Sometimes the forum main window "helps" by reformatting text and may have removed sequential spaces or possibly tabs.

 

This may be a starting example. I reformatted the data in the editor to better match the input descriptions. The numerics are defaulting to best. formats which look like integers.

data read;
input @@;
if substr(_infile_,1,1)="#" then
input 
   HEADREC   $   1 
   ID        $   2- 12  
   YEAR         14- 17  
   MONTH        19- 20  
   DAY          22- 23  
   HOUR         25- 26  
   RELTIME      28- 31  
   NUMLEV       33- 36  
   P_SRC    $   38- 45  
   NP_SRC   $   47- 54  
   LAT          56- 62  
   LON          64- 71  
   ;
else input 
LVLTYP1         1   
LVLTYP2         2   
ETIME           4-  8   
PRESS          10- 15   
PFLAG     $    16- 16  
GPH            17   
rZFLAG    $    22- 22  
TEMP           23- 27   
TFLAG     $    28- 28  
RH             29- 33   
DPDP           35- 39   
WDIR           41- 45   
WSPD           47- 51   
; 
datetime=dhms(mdy(month,day,year),hour,0,0);
format datetime datetime16.;
datalines;
#CAM00071109 1952 10 31 03 9999 14  ncar-ccd 506850 -1273758
21 -9999 101400B 19      72B 933 -9999 -9999 -9999
10 -9999 100000 134B     72B 933 -9999 -9999 -9999
10 -9999 85000 1439B     -8B 863 -9999 -9999 -9999
10 -9999 70000 2978B   -102B 748 -9999 -9999 -9999
20 -9999 57600 -9999   -208B 838 -9999 -9999 -9999
20 -9999 55000 -9999   -235B 614 -9999 -9999 -9999
10 -9999 50000 5480B   -278B 596 -9999 -9999 -9999
10 -9999 40000 7041B   -405B 671 -9999 -9999 -9999
10 -9999 30000 8955B   -505B-9999 -9999 -9999 -9999
20 -9999 27300 -9999   -530A-9999 -9999 -9999 -9999
10 -9999 20000 11595B  -482B-9999 -9999 -9999 -9999
10 -9999 15000 13481B  -502A-9999 -9999 -9999 -9999
10 -9999 10000 16142B  -505A-9999 -9999 -9999 -9999
20 -9999 6200  -9999   -530A-9999 -9999 -9999 -9999
;
run;

If you want the values from the header record to be one the data records add a RETAIN statement listing the header variables of interest. I also created a datetime variable as putting date/time information together is often better than the separate fields. Note though that if any of the incoming values are not valid dates, ie month=2 and day=31, then the datetime is going to through an error about invalid date.

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 224 views
  • 0 likes
  • 3 in conversation