Help using Base SAS procedures

convert data from improperly data format

Reply
Contributor
Posts: 22

convert data from improperly data format

Hello All,

I have an unfriendly data format (the file has same info in a row and has empty lines). I tried to convert it into a usable data. Please read and advise how I could make it work. At this time, I couldn't import the data in and not sure how to skip the blank line to continue reading next line. Thanks for your time and Happy Friday to All.

My data (test.csv) looks like:

Company_NameA,Company_NameB,Company_NameC
Attn_NameA,Attn_NameB,Attn_NameC
AddressA,AddressB,AddressC

Company_NameD,Company_NameE,Company_NameF
Attn_NameD,Attn_NameE,Attn_NameF
AddressD,AddressE,AddressF

Company_NameG,Company_NameH,Company_NameK
Attn_NameG,Attn_NameH,Attn_NameK
AddressG,AddressH,AddressK


I'd like to convert it into a data set:

Company_NameA Attn_NameA AddressA
Company_NameB Attn_NameB AddressB
Company_NameC Attn_NameC AddressC
Company_NameD Attn_NameD AddressD
Company_NameE Attn_NameE AddressE
Company_NameF Attn_NameF AddressF
Company_NameG Attn_NameG AddressG
Company_NameH Attn_NameH AddressH
Company_NameK Attn_NameK AddressK

My intention is to create 3 tables for company, attn_name, and address.
Also, I use a keyid for merging them later.

data company (keep=keyid comp_name)
attn (keep=keyid attn_name)
address (keep=keyid address);

attrib keyid length=$5.;
array comp_name[3] $50 comp_name1-comp_name3;
array attn_name[3] $50 attn1-attn3;
array address[3] $50 address1-address3;

infile "C:\data\test.csv" dsd dlm=",";
input @1 inrec $char256;
if length(inrec)>0 then do; *** if no empty row;
rec+1;
keyid=put(rec,z5.);
do i=1 to 3;
comp_name = scan(inrec,i,',');
output company;
attn_name = scan(inrec,i,',');
output attn;
address = scan(inrec,i,',');
output address;
end;
else

end;
run;


***** log file *****
NOTE: Invalid data for char256 in line 10 12-21.
10 Attn_NameG,Attn_NameH,Attn_NameK 32
keyid=00008 comp_name1=Attn_Nam comp_name2= comp_name3= attn1=Attn_Nam attn2= attn3=
address1=Attn_Nam address2= address3= inrec=Attn_Nam char256=. rec=8 i=4 _ERROR_=1 _N_=8
NOTE: Invalid data for char256 in line 11 10-17.
11 AddressG,AddressH,AddressK 26
keyid=00009 comp_name1=AddressG comp_name2= comp_name3= attn1=AddressG attn2= attn3=
address1=AddressG address2= address3= inrec=AddressG char256=. rec=9 i=4 _ERROR_=1 _N_=9
NOTE: 11 records were read from the infile "C:\data\test.csv".
The minimum record length was 0.
The maximum record length was 41.
NOTE: SAS went to a new line when INPUT statement reached past the end of a line.
NOTE: The data set WORK.COMPANY has 27 observations and 1 variables.
NOTE: The data set WORK.ATTN has 27 observations and 1 variables.
NOTE: The data set WORK.ADDRESS has 0 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
Respected Advisor
Posts: 3,799

Re: convert data from improperly data format

See what you think about this....

[pre]
filename FT15F001 temp;
data comp(keep=Company attn_name address);
infile FT15F001 dsd eof=eof;
array comp[3] $50;
array attn[3] $50;
array addr[3] $50;
input #1 comp
  • #2 attn
  • #3 addr
  • ;
    do i = 1 to dim(comp);
    company = comp[ i ];
    attn_name = attn[ i ];
    address = addr[ i ];
    output;
    end;
    input;
    return;
    eof:
    stop;
    parmcards;
    Company_NameA,Company_NameB,Company_NameC
    Attn_NameA,Attn_NameB,Attn_NameC
    AddressA,AddressB,AddressC

    Company_NameD,Company_NameE,Company_NameF
    Attn_NameD,Attn_NameE,Attn_NameF
    AddressD,AddressE,AddressF

    Company_NameG,Company_NameH,Company_NameK
    Attn_NameG,Attn_NameH,Attn_NameK
    AddressG,AddressH,AddressK
    ;;;;
    run;
    proc print;
    run;
    [/pre] Message was edited by: data _null_;
  • Contributor
    Posts: 22

    Re: convert data from improperly data format

    Posted in reply to data_null__
    Hi data _null_,

    Thanks for your kind response. I put your codes in to test and I got the followings:

    1274 filename FT15F001 temp;
    1275 data comp(keep=company attn_name address);
    1276 infile FT15F001 dsd eof=eof;
    1277 array comp[3] $50;
    1278 array attn[3] $50;
    1279 array addr[3] $50;
    1280 input #1 comp
  • #2 attn
  • #3 addr
  • ;
    1281 do i=1 to dim(comp);
    1282 company = comp;
    ERROR: Illegal reference to the array comp.
    1283 attn_name = attn;
    ERROR: Illegal reference to the array attn.
    1284 address = addr;
    ERROR: Illegal reference to the array addr.
    1285 output;
    1286 end;
    1287 input;
    1288 return;
    1289 eof:
    1290 stop;
    1291 parmcards;
    1301 ;;;;

    NOTE: The SAS System stopped processing this step because of errors.
    WARNING: The data set WORK.COMP may be incomplete. When this step was stopped there were 0
    observations and 3 variables.
    WARNING: Data set WORK.COMP was not replaced because this step was stopped.
    NOTE: DATA statement used (Total process time):
    real time 0.01 seconds
    cpu time 0.00 seconds


    1302 run;
    1303 proc print;
    1304 run;

    NOTE: No observations in data set WORK.COMP.
    NOTE: PROCEDURE PRINT used (Total process time):
    real time 0.00 seconds
    cpu time 0.00 seconds
  • Respected Advisor
    Posts: 3,799

    Re: convert data from improperly data format

    I wondered why some of the post got italized. The square bracked I. I looked but did not see the problem. I change the program to use _N_ for array index.

    [pre]
    filename FT15F001 temp;
    data comp(keep=Company attn_name address);
    infile FT15F001 dsd eof=eof;
    array comp[3] $50;
    array attn[3] $50;
    array addr[3] $50;
    input #1 comp
  • #2 attn
  • #3 addr
  • ;
    do _n_ = 1 to dim(comp);
    company = comp[_n_];
    attn_name = attn[_n_];
    address = addr[_n_];
    output;
    end;
    input;
    return;
    eof:
    stop;
    parmcards;
    Company_NameA,Company_NameB,Company_NameC
    Attn_NameA,Attn_NameB,Attn_NameC
    AddressA,AddressB,AddressC

    Company_NameD,Company_NameE,Company_NameF
    Attn_NameD,Attn_NameE,Attn_NameF
    AddressD,AddressE,AddressF

    Company_NameG,Company_NameH,Company_NameK
    Attn_NameG,Attn_NameH,Attn_NameK
    AddressG,AddressH,AddressK
    ;;;;
    run;
    proc print;
    run;
    [/pre] Added space to blank lines in data to get them to display in the post.


    Message was edited by: data _null_;
  • Contributor
    Posts: 22

    Re: convert data from improperly data format

    Posted in reply to data_null__
    data _null_,

    Thank you very much!
    Ask a Question
    Discussion stats
    • 4 replies
    • 153 views
    • 0 likes
    • 2 in conversation