I'm trying to read in this set of data and I need some help:
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
I need the output to look like this:
I have come up with this solution which works fine:
Data Yourdata (drop=temp);
Infile datalines dsd;
retain var1-var3;
input temp : $10. @;
if length(temp) ^= 1 then do;
Input @1 var1 $10.
@12 var2 $
var3 $
var4
var5 $
var6 : date9.
var7 $
var8 $
var9
var10 $
var11 ;
end; else do;
Input @1 var4
var5 $
var6 : date9.
var7 $
var8 $
var9
var10 $
var11 ;
end;
format var6 date9.;
datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;However, I'm partially using formatted input instead of list input and I'm ignoring the delimiter for some of the columns. I am wondering if there is a better solution for this.
Can anyone help? Thanks 🙂
Here is one possible alternative:
Data Yourdata;
infile datalines dsd delimiter=',';
informat var1 $10.;
informat var2 var3 $1.;
informat var4 8.;
informat var5 $1.;
informat var6 date9.;
informat var7 var8 $1.;
informat var9 8.;
informat var10 $2.;
informat var11 8.;
retain var1-var3;
input @;
if length(scan(_infile_,1,',')) ^= 1 then do;
Input var1
var2
var3;
end;
else do;
input var4
var5
var6
var7
var8
var9
var10
var11 ;
output;
end;
format var6 date9.;
datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;
Art, CEO, AnalystFinder.com
Assuming your input data is in a consistent format, that is - always 3 lines and all variables exist (even if some of them have missing value) then you can read the 3 lines with one INPUT staement, in a format of:
input
<first line variables> /
<second line variables> /
<third line variables>
;
The slash (/) tells sas to continue reading from next line.
No unfortunately I need the first three columns to "retain" to the next line for some of the rows:
I don't think the slashes would work. Also, the DSD option already changes the delimiter to a comma. I don't think the DLM="," option is needed...
OK, I see - you have a prefix line to concatenate with next one or more lines,
where each of them - except the prefix line - creates an output observation.
Then your code is perfct. I wouldn't change it.
No it goes like this:
Raw data:
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
Desired output:
A123456789 A 3 1 Y 15Feb1960 M M 3 FT 55000
A123456789 A 3 2 N 3Jun1965 F M 3 UE 0
The first three columns are "header" columns and they should retain to the next line for some of the rows.
That's correct 🙂
I don't like the fact that some columns are read in using formatted input when the data are separated by a comma:
Input @1 var1 $10.
@12 var2 $
var3 $
var4
The VAR1 column had to be re-read" in since I had the TEMP column there. Re-reading data cannot be done using list input and the only solution I can think of is to use the formatted input.
Just curious if there is a better way around...
Here is one possible alternative:
Data Yourdata;
infile datalines dsd delimiter=',';
informat var1 $10.;
informat var2 var3 $1.;
informat var4 8.;
informat var5 $1.;
informat var6 date9.;
informat var7 var8 $1.;
informat var9 8.;
informat var10 $2.;
informat var11 8.;
retain var1-var3;
input @;
if length(scan(_infile_,1,',')) ^= 1 then do;
Input var1
var2
var3;
end;
else do;
input var4
var5
var6
var7
var8
var9
var10
var11 ;
output;
end;
format var6 date9.;
datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;
Art, CEO, AnalystFinder.com
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.