nubhi
Fluorite | Level 6

I imported data from a CSV file so that I could use PROC FORMAT on it. The data is from my college and lists the subjects we chose. We had entered abbreviations instead of the full names, and since I was learning about PROC FORMAT, I thought, "why not?" Here is the code I used, along with the data copied from the CSV:

sys_no sub1 sub2 sub3 sub4
1, DL, SAS, Big Data analytics, Stock Market Operations
3, DL, SAS, Big Data analytics, Stock Market Operations
4, DL, SAS, Big Data Analytics, Stock Market Operations
27, DL, SAS, Big data analytics, Stock Market Operations
12, DL, SAS, Big Data Analytics, Stock Market Operations
24, Dl,SAS, BIG DATA ANALYTICS, STOCK MARKET
23 ,dL, sas, BIG DATA ANALYTICS, STOCK MARKET

 

Yes, it is comma-delimited, and I used this code to import it:

PROC IMPORT OUT=Student
DATAFILE="/home/Student.csv"
DBMS=CSV
REPLACE;
GETNAMES=YES;
RUN;

 

data B;
set work.student;
format sub1 $upcase.;
rename sub1=subA sub2=subB sub3=subC sub4=subD;
run;

 

I noticed that the upcase doesn't work if I put the FORMAT statement below the RENAME statement (of course, I changed the name when using FORMAT), which I resolved. But I don't understand why this doesn't work:

 

proc format;
value $subA
'DL'='Deep Learning';
run;

 

Here, 'dL' and 'Dl' are not treated as uppercase, even though they appear uppercased when I check the data before the PROC FORMAT step. I found out that I must do this instead of using the $upcase. format:

 

data C;
set B;
subA = upcase(subA);
run;

 

I am new to SAS and would like to know what mistake I made.

PaigeMiller
Diamond | Level 26

I noticed that the upcase doesn't work if I put the FORMAT statement below the RENAME statement (of course, I changed the name when using FORMAT), which I resolved. But I don't understand why this doesn't work:

 

proc format;
value $subA
'DL'='Deep Learning';
run;

 

Your code never uses format $subA anywhere.

 

Here, the 'dL' and 'Dl' are not uppercased even though when I check it, they're upper cased before the proc format.

All formats, like $upcase, change the appearance of the values. They do not change the actual values. SAS always works on the actual values and not the formatted values, and so your format $subA will only work on DL, it will not work on Dl or dL, these are the actual values. So you have not told SAS to assign a format to Dl or dL. When you want to change the actual values, you need the upcase() function and not the $upcase format.
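
To make the difference between displayed and stored values concrete, here is a minimal sketch. The data set name demo and the variables raw, shown, stored_is_dl and fixed are made up for illustration; the three values are taken from the question.

```sas
/* Sketch: a format changes what is displayed, not what is stored */
data demo;
  length raw $2;
  input raw $;
  format raw $upcase.;            /* affects display only */
  shown        = put(raw, $upcase.); /* formatted text captured in a new variable */
  stored_is_dl = (raw = 'DL');    /* comparison uses the stored value, so it is case-sensitive */
  fixed        = upcase(raw);     /* a new value that really is uppercase */
  datalines;
DL
Dl
dL
;
run;

proc print data=demo;
run;
```

In the printed output raw should look uppercase in every row because of the attached format, but stored_is_dl should be 1 only for the row whose stored value is exactly DL.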

 

--
Paige Miller
nubhi
Fluorite | Level 6
Hello. Thank you for your response. By the time I posted the question, I had found out that I had to use the upcase() function after copying the data into a new data set, so I might have forgotten to include the $subA format.
ballardw
Super User

You can solve several issues by use of a data step to read the file.

When you use a data step you can specify custom INFORMATS to read data, and the INVALUE statement supports the UPCASE option to convert the read text to uppercase before applying any result. Also you can set the names of the variables.

Example:

proc format library=work;
invalue $subA (upcase )
'DL' ='Deep Learning'
;
invalue $subB (upcase )
'SAS' ='SAS'
;
invalue $subc (upcase )
'BIG DATA ANALYTICS' ='BIG DATA ANALYTICS'
;
run;

data work.student;
   /* the infile would point to your file*/
   infile datalines dlm=',' firstobs=2;
   informat sys_no 8. suba $suba. subb $subb. subc $subc.;
   input sys_no suba subb subc;
datalines;
sys_no,sub1,sub2,sub3,sub4
1,DL,SAS,Big Data analytics,Stock Market Operations
3,DL,SAS,Big Data analytics,Stock Market Operations
4,DL,SAS,Big Data Analytics,Stock Market Operations
27,DL,SAS,Big data analytics,Stock Market Operations
12,DL,SAS,Big Data Analytics,Stock Market Operations
24,Dl,SAS,BIG DATA ANALYTICS,STOCK MARKET
23,dL,sas,BIG DATA ANALYTICS,STOCK MARKET
;

I did not include SubD because I was not sure whether "Stock Market Operations" and "STOCK MARKET" should be treated as the same value or as different ones. Note that the values before the = sign in the INVALUE statement are the values as they appear in the file, but written in uppercase, because the comparison is made after the UPCASE option has converted the text that was read.

 

This works best if you know all the categories that a variable can contain. In that case, adding "other = _error_" at the end of the INVALUE statement means you get an invalid data note in the log, just like trying to read "69NOV1985" with a date9 informat.
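
A minimal sketch of the "other = _error_" idea follows. The informat name $subchk, the data set check and the variable code are hypothetical, and XX is an invented value used only to trigger the invalid data note.

```sas
/* Sketch: flag any value that is not in the list of known categories */
proc format library=work;
  invalue $subchk (upcase)
    'DL'  = 'Deep Learning'
    other = _error_;            /* anything else is treated as invalid data */
run;

data check;
  length code $13;              /* wide enough for 'Deep Learning' */
  informat code $subchk.;
  input code;
  datalines;
DL
dl
XX
;
run;
```

With this definition the first two lines should be read as Deep Learning, while the XX line should produce an invalid data note in the log and leave code blank.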

 

PLEASE post any example text data into a text box opened on the forum with the </> icon above the message window. The forum reformats text pasted into the message window, and your "csv" example has some characters other than a space following the commas (which is poor practice in general with CSV files).

nubhi
Fluorite | Level 6
Hello. Thank you for your response, I shall try out that code when I am able to access a computer. I did find someone having the same issue, and used the informat method to upcase but did not put it in () parenthesis. I thought it was upcase () and not (upcase). Anyway, thank you.
nubhi
Fluorite | Level 6

Hello, I tried your code and it works! But I'm unable to get it to work with the data I imported from the CSV (it is not being upcased). Could you give me some pointers, as I do not know why it doesn't work? The only difference I can find after a lot of trial and error is that your example applies the informat while reading the data given through cards, whereas I am applying it to data already imported from the CSV. Here is the code:
proc format library=work;
invalue $subA (upcase )
'DL' ='Deep Learning'
;
run;

data work.C;
set B;
informat sys_no 8. subA $subA.;
run;

Tom
Super User

Attaching an INFORMAT to a variable after the value is already in the variable does nothing (other than provide some documentation for future users of the dataset about how you intended to create the values in the dataset).

Remember that informats convert text to values and formats convert values to text. So unless some conversion is taking place (such as when formats are applied to the data values as they are printed), they have no effect.

To use the $UPCASE informat, or a custom informat such as $SUBA, on existing data you would need to pass the data through the INPUT() function.

data work.C;
  set B;
  subA = input(suba,$subA.);
run;
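
One thing to watch with the step above: if subA came in with a short length (two characters would be enough for DL), assigning the longer text back to the same variable may truncate it. A hedged variant that writes to a new, wider variable could look like this. The name subA_full is made up, and the small data set B here is only a stand-in built from values in the question.

```sas
/* Character informat: uppercases the input, then maps DL to the full name */
proc format;
  invalue $subA (upcase)
    'DL' = 'Deep Learning';
run;

/* Small stand-in for data set B (values copied from the question) */
data B;
  length subA $2;
  input subA $;
  datalines;
DL
Dl
dL
;
run;

data work.C;
  set B;
  length subA_full $13;              /* hypothetical name; 13 = length of 'Deep Learning' */
  subA_full = input(subA, $subA.);   /* INPUT() applies the informat to the stored text */
run;
```

Keeping the original subA alongside the new variable also makes it easy to check which raw codes were mapped.
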
Tom
Super User

If you want the format $SUBA to display DL, dl, Dl, and dL as Deep Learning, then say so when you define the format.

proc format;
value $subA
'DL','dl','dL','Dl'='Deep Learning';
run;

Otherwise you will need to first convert the values to uppercase (not just PRINT them using uppercase) so that the actual value is DL.

data B;
  set work.student;
  sub1=upcase(sub1);
  format sub1 $suba. ;
run;
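
A third possibility, offered only as a sketch, is to leave the stored value alone and uppercase a copy at lookup time with PUT(). The names student_demo, B2 and sub1_label are hypothetical. STRIP() is included on the assumption that leading blanks might have survived the import, since a leading blank would stop the value from matching 'DL'.

```sas
proc format;
  value $subA 'DL' = 'Deep Learning';
run;

/* Small stand-in for the imported data (values copied from the question) */
data student_demo;
  length sub1 $2;
  input sub1 $;
  datalines;
DL
Dl
dL
;
run;

data B2;
  set student_demo;
  length sub1_label $13;   /* wide enough for 'Deep Learning' */
  /* Uppercase a copy of the value for the lookup; sub1 itself is left unchanged */
  sub1_label = put(upcase(strip(sub1)), $subA.);
run;
```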

 

nubhi
Fluorite | Level 6
Yes, my original intention was to make it uppercase before using proc format value. Thank you for the response.
