fjc52
Calcite | Level 5

This is my first post, so please bear with me. I am not sure if I defined the subject correctly, but I will do my best to explain my issue here. I have a dataset of Emergency Call Center data from the pandemic that I imported from Excel. I am creating new datasets from it (multiple, so that I have one to go back to in case I mess one up) and adding new variables that I assign values to for each call. My goal is to create new categories that identify the purpose of each call. Some calls have only 1 call purpose and some have as many as 3. These new categories are stored in the variables "Purpcall_newcat1", "Purpcall_newcat2", and "Purpcall_newcat3". The original category is stored in the variable "Purpcall_categ".

Here is a sample data-step for what I am doing:

data work.mlv4;
 set work.mlv3;
 if (Purpcall_categ = "exposure guidance clarification"
 or Purpcall_categ = "Acute care and patient transfer guidance"
 or Purpcall_categ = "Asking if she can stay open"
 or Purpcall_categ = "Assistance finding DOH document regarding guidance for health care workers and first responders"
 or Purpcall_categ = "Blood and organ donor guidance"
 or Purpcall_categ = "Blood donation guidance"
 or Purpcall_categ = "COVID-19 exposure reporting guidance"
 or Purpcall_categ = "COVID-19 regulation guidance"
 or Purpcall_categ = "COVID-19 safety protocol guidance"
 or Purpcall_categ = "Can business stay open"
 or Purpcall_categ = "Childcare guidance"
 or Purpcall_categ = "Cleaning Guidance"
 or Purpcall_categ = "Closing requirements")
then Purpcall_newcat1 = "Covid Related Guidance/Inquiries";
run;

The actual data-step is much longer because there are many "Purpcall_categ" values that I assign the "Covid Related Guidance/Inquiries" value to in the variable "Purpcall_newcat1". For clarity I made it much shorter to read. When I run this code I get no errors whatsoever, and SAS creates my new dataset without any issues from a syntax standpoint.

Here is a log for example of that sample code:

 

1227  data work.mlv4;
1228   set work.mlv3;
1229   if (Purpcall_categ = "exposure guidance clarification"
1230   or Purpcall_categ = "Acute care and patient transfer guidance"
1231   or Purpcall_categ = "Asking if she can stay open"
1232   or Purpcall_categ = "Assistance finding DOH document regarding guidance for health care
1232! workers and first responders"
1233   or Purpcall_categ = "Blood and organ donor guidance"
1234   or Purpcall_categ = "Blood donation guidance"
1235   or Purpcall_categ = "COVID-19 exposure reporting guidance"
1236   or Purpcall_categ = "COVID-19 regulation guidance"
1237   or Purpcall_categ = "COVID-19 safety protocol guidance"
1238   or Purpcall_categ = "Can business stay open"
1239   or Purpcall_categ = "Childcare guidance"
1240   or Purpcall_categ = "Cleaning Guidance"
1241   or Purpcall_categ = "Closing requirements")
1548  then Purpcall_newcat1 = "Covid Related Guidance/Inquiries";
1549  run;

NOTE: There were 2545 observations read from the data set WORK.MLV3.
NOTE: The data set WORK.MLV4 has 2545 observations and 17 variables.
NOTE: DATA statement used (Total process time):
      real time           0.16 seconds
      cpu time            0.14 seconds

The log jumps from line 1241 to line 1548 because I removed the rest of the code that I commented out to remove clutter and to make the sample code shorter.

This is where my issues begin. It's a little weird and hard to explain, so bear with me again. The first time I ran this code, SAS didn't assign the value that I had specified, "Covid Related Guidance/Inquiries", as you can see written in the code. Instead SAS cut it off, and my table had the value "Covid Related Guidance/Inq". It cut off the rest of the word "Inquiries". I found this strange, did a Google search, and saw that SAS had character limits for "User-Supplied SAS Names", which was the exact wording. I didn't think much of it, since "Inq" could be read as inquiries and could be mentioned in a disclaimer in the literature. I eventually got to the second categories for the calls that had one, wrote the same type of code, and the log again gave no errors. To save space I won't post this code, as it is the same as the last except that at the end I assign a value to "Purpcall_newcat2". This time, when assigning "Covid Related Guidance/Inquiries" as the category 2 value, SAS cut the value off after the "Co" in "Covid", so the value in my table was just the letters "Co".

I thought that this must be a bug of some sort, so I saved, closed SAS, and re-opened it to run the code again to see if that fixed the issue. This time, when running the first sample code, it instead cut "Covid Related Guidance/Inquiries" to "Covid Related", so the table now shows a completely different value for the same code that I ran last time. I have no idea why this is the case. Below is a screenshot of my table with the wrongly assigned value I just described (my value being cut to "Covid Related").

Screenshot:

fjc52_0-1628754164762.png

I know it's small, but I really didn't want to take up any more space and make this even longer than it is. You can also see that another call has the value "Case/Contact", where I had assigned "Case/Contact Inquiries"; this is also cut off.

 

I know this is a VERY long post and I apologize for that, but I had no idea how else to word this. If anyone has any questions for clarity or anything else please let me know!

 

 

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hello @fjc52 and welcome to the SAS Support Communities!

 

The most common reason why values in a character variable are truncated is an insufficient length of the variable.

 

Assuming that variable Purpcall_newcat1 in your first DATA step is not contained in dataset work.mlv3, the value "Covid Related Guidance/Inquiries" in the (first) assignment statement will determine the length of Purpcall_newcat1 (here: 32). Subsequent assignment statements with longer values do not increase that length.

 

Example:

data test;
if 0 then a='No';
a='Yes'; /* This is going to be truncated to 'Ye'! */
run;

The compiler sees a='No' as the first assignment for variable a and therefore defines it with length 2, even though only the second assignment statement is executed (because the IF condition is not met; 0 is FALSE) and it would require a length >=3 to avoid truncation.
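
One way to confirm this, as a small sketch: the VLENGTH function returns the length that was fixed for a variable at compile time, so writing it to the log next to the stored value makes the truncation visible. The variable name len is only for illustration.

```sas
data test;
  if 0 then a = 'No';   /* never executed, but the compiler sees it first: length 2 */
  a = 'Yes';            /* stored as 'Ye' because a has length 2 */
  len = vlength(a);     /* length of a as defined at compile time */
  put a= len=;          /* writes the stored value and its defined length to the log */
run;
```

PROC CONTENTS on the resulting dataset reports the same defined length in its variable list.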

 

Use a LENGTH statement before all assignment statements to define a length for a newly created character variable that is sufficient to accommodate the longest value you want to assign.

 

Example:

data work.mlv4;
set work.mlv3;
length Purpcall_newcat1 $60
       Purpcall_newcat2 
       Purpcall_newcat3 $100 
       Purpcall_newcat4 $80;
if ...
...
run;

This defines length 60 for Purpcall_newcat1, length 100 for both Purpcall_newcat2 and Purpcall_newcat3 and length 80 for Purpcall_newcat4.
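
A related sketch, for the case where one of these variables already exists in the input dataset (for example because it came in with an imported file): its length is then taken from the SET statement, and a LENGTH statement placed after SET no longer takes effect. Placing LENGTH before SET lets the new definition win. The dataset names work.demo_in and work.demo_out and the length 5 are hypothetical, chosen only to make the example self-contained.

```sas
/* Hypothetical input in which Purpcall_newcat1 already exists with length 5 */
data work.demo_in;
  length Purpcall_categ $40 Purpcall_newcat1 $5;
  Purpcall_categ   = "Childcare guidance";
  Purpcall_newcat1 = " ";
run;

data work.demo_out;
  length Purpcall_newcat1 $60;   /* before SET, so this length is the one used */
  set work.demo_in;
  if Purpcall_categ = "Childcare guidance"
    then Purpcall_newcat1 = "Covid Related Guidance/Inquiries";
run;
```

Note that a variable declared before SET also moves to the front of the variable order in the output dataset.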

 

To simplify an IF condition of the form var=value1 or var=value2 or ... you can use the IN operator:

if Purpcall_categ in ("exposure guidance clarification"
                      "Acute care and patient transfer guidance"
                      ...)
then ...;
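
As a minimal runnable sketch of the IN operator combined with a LENGTH statement: the dataset names work.calls and work.calls2 and the row "Some other purpose" are made up for illustration, while the other category texts are taken from the question.

```sas
/* Hypothetical sample data */
data work.calls;
  infile datalines truncover;
  input Purpcall_categ $60.;
  datalines;
Childcare guidance
Cleaning Guidance
Some other purpose
;

data work.calls2;
  set work.calls;
  length Purpcall_newcat1 $60;   /* long enough for the longest category label */
  if Purpcall_categ in ("Childcare guidance"
                        "Cleaning Guidance"
                        "Closing requirements")
    then Purpcall_newcat1 = "Covid Related Guidance/Inquiries";
run;
```

The comparison is case-sensitive, so "Cleaning Guidance" and "cleaning guidance" are treated as different values; rows that match none of the listed values keep a blank Purpcall_newcat1.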

 


@fjc52 wrote:

I found this strange and did a google search and saw that SAS had character limits for "User-Supplied SAS Names"


The limits for names are different from the limits for values. The maximum length of a character variable is 32767 bytes, whereas the variable name is limited to 32 bytes.

 

Please check if suitable LENGTH statements resolve your truncation issues.

 

Edit: Instead of IF-THEN/ELSE statements you can also use a user-defined format to assign values to categories. Then you may not even need new variables like Purpcall_newcat1, etc.
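
A rough sketch of that format-based approach, assuming the dataset work.mlv3 from the question: the format name $purpgrp and the label "Uncategorized" are hypothetical, and only three of the original category values are listed.

```sas
proc format;
  value $purpgrp
    "exposure guidance clarification",
    "Childcare guidance",
    "Cleaning Guidance"   = "Covid Related Guidance/Inquiries"
    other                 = "Uncategorized";   /* hypothetical catch-all label */
run;

/* Option 1: group by the formatted values without creating a new variable */
proc freq data=work.mlv3;
  tables Purpcall_categ;
  format Purpcall_categ $purpgrp.;
run;

/* Option 2: store the category in a new variable */
data work.mlv4;
  set work.mlv3;
  length Purpcall_newcat1 $60;
  Purpcall_newcat1 = put(Purpcall_categ, $purpgrp.);
run;
```

With this design the mapping from original values to categories lives in one place (the PROC FORMAT step) and can be reused across procedures.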

fjc52
Calcite | Level 5

Thank you for the warm welcome! I tried this code:

data work.mlv4;
set work.mlv3;
length Purpcall_newcat1 $60
       Purpcall_newcat2 
       Purpcall_newcat3 $100 
       Purpcall_newcat4 $80;
if ...
...
run;

However, the SAS log says:

12   data work.mlv2;
13    set work.ml;
14    length Purpcall_newcat1 $60
15          Purpcall_newcat2
16          Purpcall_newcat3 $100
WARNING: Length of character variable Purpcall_newcat1 has already been set.
         Use the LENGTH statement as the very first statement in the DATA STEP to declare the
         length of a character variable.
17          Purpcall_newcat4 $80;
WARNING: Length of character variable Purpcall_newcat2 has already been set.
         Use the LENGTH statement as the very first statement in the DATA STEP to declare the
         length of a character variable.
WARNING: Length of character variable Purpcall_newcat3 has already been set.
         Use the LENGTH statement as the very first statement in the DATA STEP to declare the
         length of a character variable.
18    if (Purpcall_categ = "CDRSS inquiries"
19    or Purpcall_categ = "CDRSS access and contact tracing guidance"
20    or Purpcall_categ = "CDRSS access issues; case reporting"
21    or Purpcall_categ = "CDRSS data disprepancy issues; quarantine guidance"
22    or Purpcall_categ = "LTCF asking to reduce submissions of line lists and testing"
23    or Purpcall_categ = "Reporting death, CDRSS data entry issues"
24    or Purpcall_categ = "asking to speak to LTC staff regarding line list")
25
26   then Purpcall_newcat1 = "CDRSS Inquiry";
27   run;

I made sure to do it in my very first data step since in the first one, my added variables (Purpcall_newcat1,2, & 3) were not present in work.ml. So I am unsure why it is saying the length of character variable has been set already. Could it be something I can put in my proc import statement? I looked it up and saw that putting "guessingrows=x" can be a fix to avoid truncation but everywhere I put it I get an error.

Here is my import statement without the guessingrows statement:

PROC IMPORT OUT= WORK.ml 
            DATAFILE= "C:\Users\Frank\Desktop\NJDOH Projects\ECC Data\ECC_July8.xls" 
            DBMS=EXCEL REPLACE;
     RANGE="ECC_July8$"; 
     GETNAMES=YES;
     MIXED=NO;
     SCANTEXT=YES;
     USEDATE=YES;
     SCANTIME=YES;
RUN;

Do you know where I can put it in here to not give me an error if this could fix it?

 

Edit: So I have sort of fixed the issue a little. It turns out that for some reason I had the variables "Purpcall_newcat1", 2, & 3 in my original Excel spreadsheet. I must have added them and forgotten to take them out. After taking them out, the code that you gave me worked and gave me no errors. It also says that the lengths are all defined as 600 (I ended up making the number 600 because of the issue I am having now). The new issue is that in Purpcall_newcat1, "Covid Related Guidance/Inquiries" is now getting cut off after the "q" in "Inquiries" (which was the first issue I described in my post). However, in Purpcall_newcat2, "Covid Related Guidance/Inquiries" is not being cut off at all, despite having the same length as Purpcall_newcat1. Any idea why that is?

 

Edit 2: I am so sorry, your solution did work. I completely forgot again that I had purposefully assigned the value "Covid Related Guidance/Inq" so if someone went to look at the code, they wouldn't be confused with the value being cut off. Thank you so much for your help! I would have never thought of doing your solution!
