I'm trying to read in this set of data and I need some help:
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
I need the output to look like this:
I have come up with this solution which works fine:
Data Yourdata (drop=temp);
Infile datalines dsd;
retain var1-var3;
input temp : $10. @;
if length(temp) ^= 1 then do;
Input @1 var1 $10.
@12 var2 $
var3 $
var4
var5 $
var6 : date9.
var7 $
var8 $
var9
var10 $
var11 ;
end; else do;
Input @1 var4
var5 $
var6 : date9.
var7 $
var8 $
var9
var10 $
var11 ;
end;
format var6 date9.;
datalines;
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
A135790234,B,1
1,Y,19Oct1944,F,D,0,NA,3000
B234523456,A,2
1,Y,30Jun1978,F,M,1,PT,4000
2,N,21May1975,M,M,2,FT,30000
;
run;
However, I'm partially using formatted input instead of list input and I'm ignoring the delimiter for some of the columns. I am wondering if there is a better solution for this.
Can anyone help? Thanks 🙂
Here is one possible alternative:
Data Yourdata; infile datalines dsd delimiter=','; informat var1 $10.; informat var2 var3 $1.; informat var4 8.; informat var5 $1.; informat var6 date9.; informat var7 var8 $1.; informat var9 8.; informat var10 $2.; informat var11 8.; retain var1-var3; input @; if length(scan(_infile_,1,',')) ^= 1 then do; Input var1 var2 var3; end; else do; input var4 var5 var6 var7 var8 var9 var10 var11 ; output; end; format var6 date9.; datalines; A123456789,A,3 1,Y,15Feb1960,M,M,3,FT,55000 2,N,3Jun1965,F,M,3,UE,0 A135790234,B,1 1,Y,19Oct1944,F,D,0,NA,3000 B234523456,A,2 1,Y,30Jun1978,F,M,1,PT,4000 2,N,21May1975,M,M,2,FT,30000 ; run;
Art, CEO, AnalystFinder.com
Assuming your input data is in a consistent format, that is - always 3 lines and all variables exist (even if some of them have missing value) then you can read the 3 lines with one INPUT staement, in a format of:
input
<first line variables> /
<second line variables> /
<third line variables>
;
The slash (/) tells sas to continue reading from next line.
No unfortunately I need the first three columns to "retain" to the next line for some of the rows:
I don't think the slashes would work. Also, the DSD option already changes the delimiter to a comma. I don't think the DLM="," option is needed...
OK, I see - you have a prefix line to concatenate with next one or more lines,
where each of them - except the prefix line - creates an output observation.
Then your code is perfct. I wouldn't change it.
No it goes like this:
Raw data:
A123456789,A,3
1,Y,15Feb1960,M,M,3,FT,55000
2,N,3Jun1965,F,M,3,UE,0
Desired output:
A123456789 A 3 1 Y 15Feb1960 M M 3 FT 55000
A123456789 A 3 2 N 3Jun1965 F M 3 UE 0
The first three columns are "header" columns and they should retain to the next line for some of the rows.
That's correct 🙂
I don't like the fact that some columns are read in using formatted input when the data are separated by a comma:
Input @1 var1 $10.
@12 var2 $
var3 $
var4
The VAR1 column had to be re-read" in since I had the TEMP column there. Re-reading data cannot be done using list input and the only solution I can think of is to use the formatted input.
Just curious if there is a better way around...
Here is one possible alternative:
Data Yourdata; infile datalines dsd delimiter=','; informat var1 $10.; informat var2 var3 $1.; informat var4 8.; informat var5 $1.; informat var6 date9.; informat var7 var8 $1.; informat var9 8.; informat var10 $2.; informat var11 8.; retain var1-var3; input @; if length(scan(_infile_,1,',')) ^= 1 then do; Input var1 var2 var3; end; else do; input var4 var5 var6 var7 var8 var9 var10 var11 ; output; end; format var6 date9.; datalines; A123456789,A,3 1,Y,15Feb1960,M,M,3,FT,55000 2,N,3Jun1965,F,M,3,UE,0 A135790234,B,1 1,Y,19Oct1944,F,D,0,NA,3000 B234523456,A,2 1,Y,30Jun1978,F,M,1,PT,4000 2,N,21May1975,M,M,2,FT,30000 ; run;
Art, CEO, AnalystFinder.com
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.