Please learn how to use the Insert Code and Insert SAS Code pop-up windows.
Is your data all on one line?
AD:1/1234,20 AT:2/23456,35 AU:3/34567,59 CA:4/56789,05 CH:5/54321,55; DE:3/34567,59 ES:12/2341,25;
Or on separate lines?
AD:1/1234,20 AT:2/23456,35 AU:3/34567,59 CA:4/56789,05 CH:5/54321,55;
DE:3/34567,59 ES:12/2341,25;
Could one "group" span multiple lines?
AD:1/1234,20 AT:2/23456,35 AU:3/34567,59
CA:4/56789,05 CH:5/54321,55;
DE:3/34567,59 ES:12/2341,25;
How long is the longest line? If you just read the whole file SAS will tell you. Here is a simple example:
1    options parmcards=csv;
2    filename csv temp;
3    parmcards4;
7    ;;;;
8
9    data _null_;
10     infile csv ;
11     input;
12   run;

NOTE: The infile CSV is:
      Filename=C:\Users\ABERNA~1\AppData\Local\Temp\1\SAS Temporary Files\_TD23280_AMRAPY3WVP0VKU0_\#LN00009,
      RECFM=V,LRECL=32767,File Size (bytes)=102,
      Last Modified=12Feb2024:10:39:49,
      Create Time=12Feb2024:10:39:49

NOTE: 3 records were read from the infile CSV.
      The minimum record length was 28.
      The maximum record length was 40.
NOTE: DATA statement used (Total process time):
      real time           0.03 seconds
      cpu time            0.00 seconds
As you can see, there were 3 lines of data; the longest line was 40 bytes and the shortest was 28 bytes.
If the lines are short (fewer than 32K bytes per line), then you can check for the semicolon.
Your file appears to consist of space-delimited values that each consist of three internal values. Let's call each of these a TRIPLET. So here is a program that reads the data and adds a GROUP variable that counts how many semicolon group separators have been seen.
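data test;
  infile csv dlm=' ';
  retain group 1;                        /* group counter starts at 1 */
  length triplet $40 area $2 number value 8;
  input triplet @@;                      /* read one space-delimited triplet and hold the line */
  area   = scan(triplet,1,':');
  number = input(scan(triplet,2,':/'),commax32.);
  value  = input(scan(triplet,3,':/;'),commax32.);   /* COMMAX treats the comma as the decimal point */
  output;
  if indexc(triplet,';') then group+1;   /* a semicolon closes the current group */
  drop triplet;
run;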
Result
Obs    group    area    number       value
 1       1       AD        1       1234.20
 2       1       AT        2      23456.35
 3       1       AU        3      34567.59
 4       1       CA        4      56789.05
 5       1       CH        5      54321.55
 6       2       DE        3      34567.59
 7       2       ES       12       2341.25
Yes, my data is all on one line.
My data line is 32767 bytes long (linesize=32767). Your code skips a few semicolons.
AD:1/1234,20 AT:2/23456,35 AU:3/34567,59 CA:4/56789,05 CH:5/54321,55;DE:3/34567,59 ES:12/2341,25;AD:1/205,92;DE:25/2039,05;
32,767 is the default LRECL. So your actual file might have lines longer than that. You can use LRECL= values up to 10 million or so depending on your machine.
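If the log reports a maximum record length of exactly 32,767, the line may have been cut off at the default limit. As a minimal sketch, assuming the same CSV fileref used earlier in this thread and an LRECL value of 1,000,000 picked purely for illustration, you could re-read the file with a larger limit and let the log report the real lengths:

```sas
/* Sketch only: re-read the file with a larger record limit.        */
/* LRECL=1000000 is an illustrative value, not a recommendation.    */
data _null_;
  infile csv lrecl=1000000;
  input;   /* read each record without creating any variables */
run;
```

The NOTE lines in the log should then show the minimum and maximum record lengths that were actually found.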
But if the file is not intended to have lines then just use RECFM=N option on the INFILE statement and SAS will not try to look for lines. Instead it will treat the file as one long stream.
So these lines are different from your previous examples. The space between triplets is sometimes missing. If the issue is that you have either a space or a semicolon between the triplets, but not both, then something like this should work.
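data test;
  infile csv dlm=':/ ;' recfm=n;   /* RECFM=N: read the file as one stream, ignoring line breaks */
  retain group 1;
  /* +(-1) backs up one byte so SEP captures the delimiter that ended VALUE */
  input area :$2. number value :commax. +(-1) sep $1. ;
  output;
  if sep=';' then group+1;         /* a semicolon closes the current group */
run;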
For your new example file
options parmcards=csv;
filename csv temp;
parmcards4;
AD:1/1234,20 AT:2/23456,35 AU:3/34567,59 CA:4/56789,05 CH:5/54321,55;DE:3/34567,59 ES:12/2341,25;AD:1/205,92;DE:25/2039,05;
;;;;
These are the results:
Obs    group    area    number       value    sep
 1       1       AD        1       1234.20
 2       1       AT        2      23456.35
 3       1       AU        3      34567.59
 4       1       CA        4      56789.05
 5       1       CH        5      54321.55     ;
 6       2       DE        3      34567.59
 7       2       ES       12       2341.25     ;
 8       3       AD        1        205.92     ;
 9       4       DE       25       2039.05     ;
thank you. It does what I want 🙂
sorry my mistake. It worked :-). Thank you very much.