Want to create a SAS table of the form:
date | customer_id | segment_generation_1 | segment_generation_2 |
2023-01-15 | 111 | REPLACED BY GENERATION 2 | B |
2023-01-15 | 222 | REPLACED BY GENERATION 2 | B |
2023-01-15 | 333 | REPLACED BY GENERATION 2 | B |
SAS code I tried (that does not work):
data test;
input date :yymmdd10. customer_id segment_generation_1 $char100. segment_generation_2 $ ;
format date yymmdd10.;
datalines;
2023-01-15 111 REPLACED_BY_GENERATION_2 B
2023-01-15 222 REPLACED_BY_GENERATION_2 B
2023-01-15 333 REPLACED_BY_GENERATION_2 B
;
run;
Any advice?
data test;
LENGTH date 8 customer_id 8 segment_generation_1 $ 100 segment_generation_2 $ 1 ;
input date :yymmdd10. customer_id segment_generation_1 $ segment_generation_2 $ ;
format date yymmdd10.;
datalines;
2023-01-15 111 REPLACED_BY_GENERATION_2 B
2023-01-15 222 REPLACED_BY_GENERATION_2 B
2023-01-15 333 REPLACED_BY_GENERATION_2 B
;
run;
proc print; run;
/* end of program */
data test;
LENGTH date 8 customer_id 8 segment_generation_1 $ 100 segment_generation_2 $ 1 ;
input date :yymmdd10. customer_id segment_generation_1 $ segment_generation_2 $ ;
format date yymmdd10.;
datalines;
2023-01-15 111 REPLACED_BY_GENERATION_2 B
2023-01-15 222 REPLACED_BY_GENERATION_2 B
2023-01-15 333 REPLACED_BY_GENERATION_2 B
;
run;
proc print; run;
/* end of program */
You need to use LIST MODE input instead of FORMATTED MODE input to read variable length strings from text lines.
You are already doing that for the DATE variable by using the colon modifier before the informat sepecified in the INPUT statement. So you could try doing the same thing for your other variable. You also probably want to use the normal $ informat and not the specialized $CHAR informat. I doubt that you want to preserve leading spaces in the values, or treat override the default behaviour of treating as single period as indicating a missing value.
input date :yymmdd10. customer_id segment_generation_1 :$100. segment_generation_2 $ ;
For your example data lines you could also use FORMATTED MODE or COLUMN MODE since the fields appear to be aligned nicely into fixed positions on the line. In that case you need to actually DEFINE the variable before the INPUT statement, instead of forcing SAS to GUESS how you wanted it to define the variable.
NOTE: You also should NEVER indent the lines of data!! To remind yourself do not indent the DATALINES (aka CARDS) statement either.
data test;
length date 8 customer_id 8 segment_generation_1 segment_generation_2 $100;
input date yymmdd10. customer_id 5. segment_generation_1 $25.. segment_generation_2 $25.;
* input @1 date yymmdd10. customer_id 12-15 segment_generation_1 $ 16-40 segment_generation_2 $ 41-55 ;
format date yymmdd10.;
datalines;
2023-01-15 111 REPLACED_BY_GENERATION_2 B
2023-01-15 222 REPLACED_BY_GENERATION_2 B
2023-01-15 333 REPLACED_BY_GENERATION_2 B
;
If you have data lines that do not align to fixed column you will probably have trouble also with values like your example listing that have embedded spaces in the value. You probably will want to use a different delimiter than a space so you can use DSD option. That also will support adding quotes in the text around values that happen to include the delimiter.
Notice also that if you have defined all of the variables (and in the order they appear in the data lines) the INPUT statement can be much simpler.
data test;
infile datalines dsd truncover;
length date 8 customer_id 8 segment_generation_1 segment_generation_2 $100;
informat date yymmdd10.;
format date yymmdd10.;
input date -- segment_generation_2;
datalines;
2023-01-15,111,REPLACED_BY_GENERATION_2,B
2023-01-15,222,"This value was replaced, by generation2",B
2023-01-15,333,REPLACED_BY_GENERATION_2,B
;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.