Hello All,
I have a raw file with data like this:
H 1095 NJ 06DEC84
C 01DEC11 $45.0
C 01AUG11 $37.5
H 1096 CA 01SEP83
C 01JUL11 $156.7
H 1097 VG 07JUL74
C 01FEB11 $109.5
H 1099 OT 13FEB79
C 01Mar11 $45.0
C 01May11 $45.0
H 1202 CA 01JAN81
C 01May11 $37.0
C 01Oct11 $45.0
C 01Feb11 $109.5
H 1090 PT 01DEC11
C 01MAR11 $45.0
C 01MAY11 $45.0
representing client ID, State, DOB and transaction details.
I need the output to be like this:
ClientID State DateofTransaction Amount
1095 NJ 01DEC11 $45.0
1095 NJ 01AUG11 $37.5
1096 CA 01JUL11 $156.7
1097 VG 01FEB11 $109.5
1099 OT 01Mar11 $45.0
1099 OT 01May11 $45.0
1202 CA 01May11 $37.0
1202 CA 01Oct11 $45.0
1202 CA 01Feb11 $109.5
1090 PT 01MAR11 $45.0
1090 PT 01MAY11 $45.0
I have used the following:
Data Transact;
infile hca dlm = ' ' missover lrecl=100;
input @3 ClientID$ 4. @8 State$ 2. #2 @3 DOT Date7. @11 Amount$;
format DOT Date7.;
run;
And the result I'm getting is attached herewith.
Now I know I need to do something with the repeated transactions as in more than one transaction for each client.
What should I do?
Please see:
The datalines is to read inline data. Your code wouldn't need that but would want the INFILE statement.
RETAIN is the instruction to keep variables across records. The explicit output statement when the line starts with something other than 'H' means that is the only time the records are written to the output set and won't get the line with the H code.
data work.want; input code $ @; informat code $1. clientid $4. state $2. dateoftransaction date7. amount comma9.2; retain clientid state; if code = 'H' then do; input ClientId State; end; else do; input dateoftransaction amount; output; end; drop code; format dateoftransaction date7. amount dollar9.2; datalines; H 1095 NJ 06DEC84 C 01DEC11 $45.0 C 01AUG11 $37.5 H 1096 CA 01SEP83 C 01JUL11 $156.7 H 1097 VG 07JUL74 C 01FEB11 $109.5 H 1099 OT 13FEB79 C 01Mar11 $45.0 C 01May11 $45.0 H 1202 CA 01JAN81 ;
Please see:
The datalines is to read inline data. Your code wouldn't need that but would want the INFILE statement.
RETAIN is the instruction to keep variables across records. The explicit output statement when the line starts with something other than 'H' means that is the only time the records are written to the output set and won't get the line with the H code.
data work.want; input code $ @; informat code $1. clientid $4. state $2. dateoftransaction date7. amount comma9.2; retain clientid state; if code = 'H' then do; input ClientId State; end; else do; input dateoftransaction amount; output; end; drop code; format dateoftransaction date7. amount dollar9.2; datalines; H 1095 NJ 06DEC84 C 01DEC11 $45.0 C 01AUG11 $37.5 H 1096 CA 01SEP83 C 01JUL11 $156.7 H 1097 VG 07JUL74 C 01FEB11 $109.5 H 1099 OT 13FEB79 C 01Mar11 $45.0 C 01May11 $45.0 H 1202 CA 01JAN81 ;
Data Transact;
infile cards dlm = ' ' missover ;
retain ClientID State;
input temp $ @;
if temp="H" then input @3 ClientID $4. @8 State $2. ;
else input @3 DOT Date7. @11 Amount$;
if temp ne 'H';
drop temp;
format DOT Date7.;
cards;
H 1095 NJ 06DEC84
C 01DEC11 $45.0
C 01AUG11 $37.5
H 1096 CA 01SEP83
C 01JUL11 $156.7
H 1097 VG 07JUL74
C 01FEB11 $109.5
H 1099 OT 13FEB79
C 01Mar11 $45.0
C 01May11 $45.0
H 1202 CA 01JAN81
C 01May11 $37.0
C 01Oct11 $45.0
C 01Feb11 $109.5
H 1090 PT 01DEC11
C 01MAR11 $45.0
C 01MAY11 $45.0
;
You might want to examine what the output for your code looks like when the clientid is not exactly 4 characters such as:
cards; H 10 NJ 06DEC84
May not be an issue here but is a point to consider when you determine what and why it happens.
@ballardw Fully agree sir. However, all i did was a copy paste and make some adjustment to OP's code to make it work. Albeit, I didn't see your code posted before I posted mine. Had I took notice of yours, I wouldn't have posted mine.
@novinosrin wrote:
@ballardw Fully agree sir. However, all i did was a copy paste and make some adjustment to OP's code to make it work. Albeit, I didn't see your code posted before I posted mine. Had I took notice of yours, I wouldn't have posted mine.
No reason to assume that I am right.
There are many subtle errors that crop up with reading delimited data using a fixed width input statements that sometimes aren't caught with small example data sets.
Since I learned SAS with mostly fixed column input and reading files with implied decimals and such I may be hypersensitive to use of input statements with formats that could read what appears to be delimited data as fixed column.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.