DATA Step, Macro, Functions and more

Changing column name when you only know the order

Reply
Contributor
Posts: 44

Changing column name when you only know the order

Hi,

 

I'm importing a .xlsx file into SAS Studio using PROC Import. Unfortunately, the names in Excel are strange, and I'm unable to reference them in SAS.

 

I've run the following code:

proc sql; 
   describe table lender_history_2018_02_28;
  quit;

while yields the following results:

create table LENDER_HISTORY_2018_02_28( bufsize=65536 )
(
'CheckDate'n num format=MMDDYY10. label='CheckDate',
'CheckNumber'n char(5) format=$5. informat=$5. label='CheckNumber',
C num format=BEST. label='C',
D num format=BEST. label='D',
'Check Amount'n num format=NLMNY15.2 label='Check Amount',
'ServiceFees'n num format=NLMNY15.2 label='ServiceFees',
'Applied ToInterest'n num format=NLMNY15.2 label='Applied ToInterest',
'Applied ToPrincipal'n num format=NLMNY15.2 label='Applied ToPrincipal',
'Applied ToLate Charges'n num format=NLMNY15.2 label='Applied ToLate Charges',
'ChargesPrincipal'n num format=NLMNY15.2 label='ChargesPrincipal',
'ChargesInterest'n num format=NLMNY15.2 label='ChargesInterest',
'Applied ToPrepay Fee'n num format=NLMNY15.2 label='Applied ToPrepay Fee',
'OtherTaxable'n num format=NLMNY15.2 label='OtherTaxable',
'Applied ToNon-Taxable'n num format=NLMNY15.2 label='Applied ToNon-Taxable',
'OtherPayments'n num format=NLMNY15.2 label='OtherPayments',
'Applied ToTrust'n num format=NLMNY15.2 label='Applied ToTrust',
'LoanAccount'n char(10) format=$10. informat=$10. label='LoanAccount',
'Borrower Name'n char(45) format=$45. informat=$45. label='Borrower Name',
Notes char(78) format=$78. informat=$78. label='Notes',
'Check Memo'n char(20) format=$20. informat=$20. label='Check Memo',
'Created By'n char(9) format=$9. informat=$9. label='Created By',
'Date Created'n num format=DATETIME16. label='Date Created',
'Date Last Updated'n num format=DATETIME16. label='Date Last Updated',
'PaymentCode'n char(6) format=$6. informat=$6. label='PaymentCode',
'PaymentDate Due'n num format=MMDDYY10. label='PaymentDate Due',
'PaymentRcvd Date'n num format=MMDDYY10. label='PaymentRcvd Date'
);
 
75 quit;

As you can see, the variable names (e.g. the  first one "'CheckDate'n") don't make sense, and I'm unable to reference them in later code. Unfortunately we receive these files from a vendor, so I can't change them. Any ideas on how I could fix this? I think I could do one of two things:

1) Rename the variables based on their column number (the ordering is always the same)

2) Rename the variables to their label, as these seem reasonable

 

Any ideas how I could do the above?

 

Many thanks,

 

Mike

Super Contributor
Posts: 453

Re: Changing column name when you only know the order

Posted in reply to MikeFranz

Hi @MikeFranz,

 

If your table has variables created with spaces then you can rename them using the dictionary.columns 

 

PROC SQL;
select nliteral(name)||"="||COMPRESS(name) INTO: Rename separated by " "
from dictionary.columns
where libname="WORK" and memname="TEST";
run;
DATA WANT;
SET TEST(Rename=(&rename.));
run;

If your column name has other special characters like *&%$# then modify the COMPRESS function as needed. Also you can replace the variables names to Label names if the label names are valid, but we never know so try to convert it with variable names.

nliteral(name)||"="||COMPRESS(label) 

 

Thanks,
Suryakiran
Contributor
Posts: 44

Re: Changing column name when you only know the order

Posted in reply to SuryaKiran

@SuryaKiran Thanks for the response.

 

I've tried the following code:

PROC SQL;
	select nliteral(name)||"="||COMPRESS(name) INTO: Rename separated by " "
	from dictionary.columns
	where libname="WORK" and memname="lender_history_2018_01_31";
run;

with the following result:

 NOTE: No rows were selected.
 77         run;
 NOTE: PROC SQL statements are executed immediately; The RUN statement has no effect.

Any idea of whats happening?

Super Contributor
Posts: 453

Re: Changing column name when you only know the order

Posted in reply to MikeFranz

In Dictionary.columns the LIBNAME and MEMNAME values are in UPCASE so change it to upcase.

 

PROC SQL;
	select nliteral(name)||"="||COMPRESS(name) INTO: Rename separated by " "
	from dictionary.columns
	where libname="WORK" and memname=UPCASE("lender_history_2018_01_31");
run;
Thanks,
Suryakiran
Contributor
Posts: 44

Re: Changing column name when you only know the order

Posted in reply to SuryaKiran

@SuryaKiran Ah thank you. Ok, it now runs, with the following output:

"Check Date"N=Check Date
"Check Number"N=Check Number
C =C
D =D
"Check Amount"N=CheckAmount
"Service Fees"N=Service Fees
"Applied To Interest"N=AppliedTo Interest
"Applied To Principal"N=AppliedTo Principal
"Applied To Late Charges"N=AppliedTo LateCharges
"Charges Principal"N=Charges Principal
"Charges Interest"N=Charges Interest
"Applied To Prepay Fee"N=AppliedTo PrepayFee
"Other Taxable"N=Other Taxable
"Applied To Non-Taxable"N=AppliedTo Non-Taxable
"Other Payments"N=Other Payments
"Applied To Trust"N=AppliedTo Trust
"Loan Account"N=Loan Account
"Borrower Name"N=BorrowerName
Notes =Notes
"Check Memo"N=CheckMemo
"Created By"N=CreatedBy
"Date Created"N=DateCreated
"Date Last Updated"N=DateLastUpdated
"Payment Code"N=Payment Code
"Payment Date Due"N=Payment DateDue
"Payment Rcvd Date"N=Payment RcvdDate

Which I think is what we are looking to see.

 

However, when I run the final datastep:

DATA WANT;
	SET work.lender_history_2018_01_31(Rename=(&rename.));
run;

I get the following error:

 1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         DATA WANT;
 74         SET aaix.lender_history_2018_01_31(Rename=(&rename.));
 NOTE: Line generated by the macro variable "RENAME".
 74         "CheckDate"N=CheckDate "CheckNumber"N=CheckNumber C
                                       ________________
                                       79
 ERROR 79-322: Expecting a =.
 
 75         run;

I think it's the "'NAME'n" causing issues?

 

Thanks again,

 

Mike

 

 

 

Super Contributor
Posts: 453

Re: Changing column name when you only know the order

Posted in reply to MikeFranz

Did you use the COMPRESS function. 

"Check Date"N=Check Date
"Check Number"N=Check Number /* There should not be space between Check and Number */
C =C
D =D
Thanks,
Suryakiran
Super Contributor
Posts: 453

Re: Changing column name when you only know the order

Posted in reply to MikeFranz

In the variable names you have "-" so add that to COMPRESS function to remove. If you have any others characters then add them to list. Also add the third argument "s" to COMPRESS for removing space characters (blank, horizontal tab, vertical tab, carriage return, line feed, and form feed)

proc sql;
create table LENDER_HISTORY_2018_02_28( bufsize=65536 )
(
'CheckDate'n num format=MMDDYY10. label='CheckDate',
'CheckNumber'n char(5) format=$5. informat=$5. label='CheckNumber',
C num format=BEST. label='C',
D num format=BEST. label='D',
'Check Amount'n num format=NLMNY15.2 label='Check Amount',
'ServiceFees'n num format=NLMNY15.2 label='ServiceFees',
'Applied ToInterest'n num format=NLMNY15.2 label='Applied ToInterest',
'Applied ToPrincipal'n num format=NLMNY15.2 label='Applied ToPrincipal',
'Applied ToLate Charges'n num format=NLMNY15.2 label='Applied ToLate Charges',
'ChargesPrincipal'n num format=NLMNY15.2 label='ChargesPrincipal',
'ChargesInterest'n num format=NLMNY15.2 label='ChargesInterest',
'Applied ToPrepay Fee'n num format=NLMNY15.2 label='Applied ToPrepay Fee',
'OtherTaxable'n num format=NLMNY15.2 label='OtherTaxable',
'Applied ToNon-Taxable'n num format=NLMNY15.2 label='Applied ToNon-Taxable',
'OtherPayments'n num format=NLMNY15.2 label='OtherPayments',
'Applied ToTrust'n num format=NLMNY15.2 label='Applied ToTrust',
'LoanAccount'n char(10) format=$10. informat=$10. label='LoanAccount',
'Borrower Name'n char(45) format=$45. informat=$45. label='Borrower Name',
Notes char(78) format=$78. informat=$78. label='Notes',
'Check Memo'n char(20) format=$20. informat=$20. label='Check Memo',
'Created By'n char(9) format=$9. informat=$9. label='Created By',
'Date Created'n num format=DATETIME16. label='Date Created',
'Date Last Updated'n num format=DATETIME16. label='Date Last Updated',
'PaymentCode'n char(6) format=$6. informat=$6. label='PaymentCode',
'PaymentDate Due'n num format=MMDDYY10. label='PaymentDate Due',
'PaymentRcvd Date'n num format=MMDDYY10. label='PaymentRcvd Date'
);
quit;

PROC SQL;
select nliteral(name)||"="||COMPRESS(name,'-','s') INTO: Rename separated by " "
from dictionary.columns
where libname="WORK" and MEMNAME="LENDER_HISTORY_2018_02_28";
quit;
 
DATA WANT;
SET LENDER_HISTORY_2018_02_28(Rename=(&rename));
Run;

It works fine for me, let me know if you have any issues 

Thanks,
Suryakiran
Regular Contributor
Posts: 195

Re: Changing column name when you only know the order

Posted in reply to MikeFranz
Try using
options validvarname=v7;
before proc import.
Ask a Question
Discussion stats
  • 7 replies
  • 91 views
  • 2 likes
  • 3 in conversation