I imported data from a CSV file to practice PROC FORMAT on it. The data is from my college and lists the subjects we chose. We had entered abbreviations instead of the full names, and since I was learning about PROC FORMAT, I thought "why not?". Here is the code for what I did, along with the data copied from the CSV:
sys_no sub1 sub2 sub3 sub4
1, DL, SAS, Big Data analytics, Stock Market Operations
3, DL, SAS, Big Data analytics, Stock Market Operations
4, DL, SAS, Big Data Analytics, Stock Market Operations
27, DL, SAS, Big data analytics, Stock Market Operations
12, DL, SAS, Big Data Analytics, Stock Market Operations
24, Dl,SAS, BIG DATA ANALYTICS, STOCK MARKET
23 ,dL, sas, BIG DATA ANALYTICS, STOCK MARKET
Yes, it is comma-delimited, and I used this code to import it:
PROC IMPORT OUT=Student
DATAFILE="/home/Student.csv"
DBMS=CSV
REPLACE;
GETNAMES=YES;
RUN;
data B;
set work.student;
format sub1 $upcase.;
rename sub1=subA sub2=subB sub3=subC sub4=subD;
run;
I noticed that the upcase doesn't work if I put the format statement below the rename (of course I changed the name when using format), which I resolved, but I don't understand why this doesn't work:
proc format;
value $subA
'DL'='Deep Learning';
run;
Here, 'dL' and 'Dl' are not treated as uppercase, even though they appear uppercased when I check the data before the proc format. I found out that I must do this instead of using the $upcase. format:
data C;
set B;
subA = upcase(subA);
run;
I am new to SAS and would like to know what error I made.
I noticed that the upcase doesn't work if I put the format statement below the rename (of course I changed the name when using format), which I resolved, but I don't understand why this doesn't work:
proc format;
value $subA
'DL'='Deep Learning';
run;
Your code never uses format $subA anywhere.
Here, the 'dL' and 'Dl' are not uppercased even though when I check it, they're upper cased before the proc format.
All formats, including $upcase, change only the appearance of the values. They do not change the actual values. SAS always works on the actual values and not the formatted values, so your format $subA applies only to DL; it does not apply to Dl or dL, which are the actual values in your data. In other words, you have not told SAS how to format Dl or dL. When you want to change the actual values, you need the upcase() function and not the $upcase format.
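The difference between a displayed value and a stored value can be checked with a small sketch. It assumes the $subA format from the question; the variable names raw, shown, direct, and fixed are made up for illustration.

```sas
/* Hypothetical demo: same definition as the $subA format in the question */
proc format;
  value $subA 'DL' = 'Deep Learning';
run;

data _null_;
  length raw $2;
  do raw = 'DL', 'Dl', 'dL';
    shown  = put(raw, $upcase2.);       /* display text only; raw itself is unchanged */
    direct = put(raw, $subA.);          /* lookup is done on the stored value */
    fixed  = put(upcase(raw), $subA.);  /* lookup is done on the uppercased value */
    put raw= shown= direct= fixed=;     /* write all four to the log */
  end;
run;
```

In the log, direct should come out as Deep Learning only for the stored value DL, while fixed should come out as Deep Learning for all three spellings.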
You can solve several issues by use of a data step to read the file.
When you use a data step you can specify custom INFORMATS to read the data, and the INVALUE statement supports the UPCASE option, which converts the text that was read to uppercase before it is compared with the listed values. You can also set the names of the variables yourself.
Example:
proc format library=work;
  invalue $subA (upcase)
    'DL' = 'Deep Learning'
  ;
  invalue $subB (upcase)
    'SAS' = 'SAS'
  ;
  invalue $subc (upcase)
    'BIG DATA ANALYTICS' = 'BIG DATA ANALYTICS'
  ;
run;

data work.student;
  /* the infile would point to your file */
  infile datalines dlm=',' firstobs=2;
  informat sys_no 8. suba $suba. subb $subb. subc $subc.;
  input sys_no suba subb subc;
datalines;
sys_no,sub1,sub2,sub3,sub4
1,DL,SAS,Big Data analytics,Stock Market Operations
3,DL,SAS,Big Data analytics,Stock Market Operations
4,DL,SAS,Big Data Analytics,Stock Market Operations
27,DL,SAS,Big data analytics,Stock Market Operations
12,DL,SAS,Big Data Analytics,Stock Market Operations
24,Dl,SAS,BIG DATA ANALYTICS,STOCK MARKET
23,dL,sas,BIG DATA ANALYTICS,STOCK MARKET
;
I did not include SubD as I was not sure whether both "Stock Market Operations" and "STOCK MARKET" should be treated the same or differently. Note that the values before the = sign in the Invalue statement are the "in the file" values but in upper case as that result will be used after the UPCASE option is applied to assign values.
This works best if you know all the categories that can occur in a variable. In that case, adding "other = _error_" at the end of the Invalue statement means you get an invalid data note in the log for any unexpected value, just like trying to read "69NOV1985" with a date9 informat.
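A minimal sketch of the "other = _error_" idea is shown here. The informat name $subAstrict, the dataset work.check, and the row containing XX are made up for illustration; they are not part of the original data.

```sas
proc format library=work;
  invalue $subAstrict (upcase)          /* hypothetical name for this sketch */
    'DL'  = 'Deep Learning'
    other = _error_                     /* any other value is treated as invalid data */
  ;
run;

data work.check;
  infile datalines dlm=',';
  informat sys_no 8. suba $subAstrict.;
  input sys_no suba;
datalines;
1,DL
2,dL
3,XX
;
```

The first two rows should be read as Deep Learning, and the third row should trigger an invalid data note in the log and leave suba blank.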
PLEASE post any example text data into a text box opened on the forum with the </> icon above the message window. The forum reformats text pasted into the message window and your "csv" example has some other characters besides a space following the commas (which is poor practice in general with CSV files)
Hello, I tried your code and it works! But I'm unable to get it to work with the data I imported from the csv (it is not being up-cased). Could you give me some pointers, as I do not know why it doesn't work? The only difference I can find, after a lot of trial and error, is that in your code the informat is applied while the data is being read from cards, whereas I am applying it to data I have already imported from the csv. Here is the code:
proc format library=work;
invalue $subA (upcase )
'DL' ='Deep Learning'
;
run;
data work.C;
set B;
informat sys_no 8. subA $subA.;
run;
Attaching an INFORMAT to a variable after the value is already in the variable does nothing (other than provide some documentation for future users of the dataset about how you intended to create the values in the dataset).
Remember that informats convert text to values and formats convert values to text. So unless you are doing some conversion (such as when formats are applied to the data values as they are printed), they have no effect.
To use the $UPCASE informat on existing data you would need to pass the data thru the INPUT() function.
data work.C;
set B;
subA = input(suba,$subA.);
run;
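One thing to watch with this step: a variable that comes in through SET keeps the length it already has, so if PROC IMPORT created subA only a few characters wide, 'Deep Learning' may be truncated when it is assigned back to subA. A possible sketch that writes the result to a new, longer variable instead is shown here. It assumes the $subA informat defined above, and the name subA_full is made up for illustration.

```sas
data work.C;
  set B;
  length subA_full $13;                 /* hypothetical variable; 13 = length of 'Deep Learning' */
  subA_full = input(subA, $subA.);      /* UPCASE option makes dL and Dl match 'DL' */
run;
```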
If you want the format $SUBA to display DL,dl,Dl, and dL as Deep Learning then say so when you define the format.
proc format;
value $subA
'DL','dl','dL','Dl'='Deep Learning';
run;
Otherwise you will need to first convert the values to uppercase (not just PRINT them using uppercase) so that the actual value is DL.
data B;
set work.student;
sub1=upcase(sub1);
format sub1 $suba. ;
run;