Hello All,
I have an unfriendly data format (the file has same info in a row and has empty lines). I tried to convert it into a usable data. Please read and advise how I could make it work. At this time, I couldn't import the data in and not sure how to skip the blank line to continue reading next line. Thanks for your time and Happy Friday to All.
My data (test.csv) looks like:
Company_NameA,Company_NameB,Company_NameC
Attn_NameA,Attn_NameB,Attn_NameC
AddressA,AddressB,AddressC
Company_NameD,Company_NameE,Company_NameF
Attn_NameD,Attn_NameE,Attn_NameF
AddressD,AddressE,AddressF
Company_NameG,Company_NameH,Company_NameK
Attn_NameG,Attn_NameH,Attn_NameK
AddressG,AddressH,AddressK
I'd like to convert it into a data set:
Company_NameA Attn_NameA AddressA
Company_NameB Attn_NameB AddressB
Company_NameC Attn_NameC AddressC
Company_NameD Attn_NameD AddressD
Company_NameE Attn_NameE AddressE
Company_NameF Attn_NameF AddressF
Company_NameG Attn_NameG AddressG
Company_NameH Attn_NameH AddressH
Company_NameK Attn_NameK AddressK
My intention is to create 3 tables for company, attn_name, and address.
Also, I use a keyid for merging them later.
data company (keep=keyid comp_name)
attn (keep=keyid attn_name)
address (keep=keyid address);
attrib keyid length=$5.;
array comp_name[3] $50 comp_name1-comp_name3;
array attn_name[3] $50 attn1-attn3;
array address[3] $50 address1-address3;
infile "C:\data\test.csv" dsd dlm=",";
input @1 inrec $char256;
if length(inrec)>0 then do; *** if no empty row;
rec+1;
keyid=put(rec,z5.);
do i=1 to 3;
comp_name = scan(inrec,i,',');
output company;
attn_name = scan(inrec,i,',');
output attn;
address = scan(inrec,i,',');
output address;
end;
else
end;
run;
***** log file *****
NOTE: Invalid data for char256 in line 10 12-21.
10 Attn_NameG,Attn_NameH,Attn_NameK 32
keyid=00008 comp_name1=Attn_Nam comp_name2= comp_name3= attn1=Attn_Nam attn2= attn3=
address1=Attn_Nam address2= address3= inrec=Attn_Nam char256=. rec=8 i=4 _ERROR_=1 _N_=8
NOTE: Invalid data for char256 in line 11 10-17.
11 AddressG,AddressH,AddressK 26
keyid=00009 comp_name1=AddressG comp_name2= comp_name3= attn1=AddressG attn2= attn3=
address1=AddressG address2= address3= inrec=AddressG char256=. rec=9 i=4 _ERROR_=1 _N_=9
NOTE: 11 records were read from the infile "C:\data\test.csv".
The minimum record length was 0.
The maximum record length was 41.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.COMPANY has 27 observations and 1 variables.
NOTE: The data set WORK.ATTN has 27 observations and 1 variables.
NOTE: The data set WORK.ADDRESS has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds