BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
claremc
Obsidian | Level 7

Hello, 

I am trying to cut down the length of my variables to save disc space and the usual length statement I use does not seem to be working. 

 

Here is my code: 

 

 

data new_union;
length ss_other_specify $255. Manufacturer1 $50. Manufacturer2 $50. Vaccination_Code_Id1 $50. Vaccination_Code_Id2 $50.;
set union_combine;

/*Vax code format*/
if Vaccination_Code_Id1 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
then Vaccination_Code_Id1 = ""; 
if Vaccination_Code_Id2 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
then Vaccination_Code_Id2 = ""; 

/*Vax date format*/
Vaccination_Date1_d= mdy(substr(Vaccination_Date1,6,2), substr(Vaccination_Date1,9,2), substr(Vaccination_Date1,1,4)); 
Vaccination_Date2_d= mdy(substr(Vaccination_Date2,6,2), substr(Vaccination_Date2,9,2), substr(Vaccination_Date2,1,4)); 
drop Vaccination_Date1; 
drop Vaccination_Date2; 

format Vaccination_Date1_d mmddyy10. Vaccination_Date2_d mmddyy10.; 
run; 

data new_union; 
set new_union; 

rename Vaccination_Date1_d = Vaccination_Date1; 
rename Vaccination_Date2_d = Vaccination_Date2; 
format Vaccination_Date1 mmddyy10. Vaccination_Date2 mmddyy10.; 

/*Manufacturer reformat*/
Manufacturer1 = upcase(Manufacturer1); 
Manufacturer2 = upcase(Manufacturer2); 
if Manufacturer1 = "JOHNSON AND JOHNSON" then Manufacturer1 = "JANSSEN (JOHNSON AND JOHNSON)";
if Manufacturer1 = "PFIZER" then Manufacturer1 = "PFIZER/BIONTECH";
if Manufacturer2 = "JOHNSON AND JOHNSON" then Manufacturer2 = "JANSSEN (JOHNSON AND JOHNSON)";
if Manufacturer2 = "PFIZER" then Manufacturer2 = "PFIZER/BIONTECH";

/* if case_id in ("100734763","100748803") then Manufacturer2 = "" and Vaccination_Date2 = . and Vaccination_Code_Id2 = "";  */
/* if Manufacturer2 = "0" then Manufacturer2 = "";  */
run; 





 

After I run proc contents on the "new_union" dataset, the variable length is still $350, not $50 like I would like. I am running 2 separate data steps because I've had trouble with keeping them all in one when I use drop and rename. 

 

Thanks for any help! 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rydhm
Obsidian | Level 7

When you create a new dataset from existing dataset, then you should create new variable names. Length doesn't change on existing variables. 

 

In your case, Vaccination_Code_Id1 or 2 both are in union_combine dataset. So try creating somewhat similar but different names in length statement as well as using them in code.

View solution in original post

6 REPLIES 6
Rydhm
Obsidian | Level 7

When you create a new dataset from existing dataset, then you should create new variable names. Length doesn't change on existing variables. 

 

In your case, Vaccination_Code_Id1 or 2 both are in union_combine dataset. So try creating somewhat similar but different names in length statement as well as using them in code.

claremc
Obsidian | Level 7
Thank you! That worked
Astounding
PROC Star

Your program should have been effective at cutting down the length.  Are you sure you ran the PROC CONTENTS on the new data set, not the old?

SASJedi
SAS Super FREQ

I made some fake data like this:

 

data union_combine;
   length id 8 ss_other_specify $255. Vaccination_Date1 $10
          Manufacturer1 $350. Manufacturer2 $350. Vaccination_Code_Id1 $350. Vaccination_Code_Id2 $350.;
   retain Manufacturer1 "Big Pharma 1" Manufacturer2 "Big Pharma 2";
   do id=1 to 4;
      Vaccination_Code_Id1 =choosec(ID,"No vaccine administered", "Other", "Unknown vaccine or immune globulin"," "); 
      Vaccination_Code_Id2 =choosec(abs(5-ID),"No vaccine administered", "Other", "Unknown vaccine or immune globulin"," "); 
      Vaccination_Date1='----------';
      Vaccination_Date2='----------';
      substr(Vaccination_Date1,6,2)='07';
      substr(Vaccination_Date1,9,2)=put(ID,z2.);
      substr(Vaccination_Date1,1,4)='2021'; 
      substr(Vaccination_Date2,6,2)='07';
      substr(Vaccination_Date2,9,2)=put(ID+14,z2.);
      substr(Vaccination_Date2,1,4)='2021'; 
      Manufacturer1 = choosec(ID,"JOHNSON AND JOHNSON","PFIZER","JOHNSON AND JOHNSON","PFIZER");
      Manufacturer2 = choosec(ID,"PFIZER","JOHNSON AND JOHNSON","PFIZER","JOHNSON AND JOHNSON");
      output;
   end;
run;

 

Then I ran this single data step that did all the conversions you were doing:

 

data new_union;
   length ss_other_specify $255. Manufacturer1 $50. Manufacturer2 $50. Vaccination_Code_Id1 $50. Vaccination_Code_Id2 $50.;
   set union_combine (rename=(Vaccination_Date1 = _dt1 Vaccination_Date2 = _dt2)); 
   /*Vax code format*/
   if Vaccination_Code_Id1 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
   then Vaccination_Code_Id1 = ""; 
   if Vaccination_Code_Id2 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
   then Vaccination_Code_Id2 = ""; 

   /*Vax date format*/
   Vaccination_Date1= mdy(substr(_dt1,6,2), substr(_dt1,9,2), substr(_dt1,1,4)); 
   Vaccination_Date2= mdy(substr(_dt2,6,2), substr(_dt2,9,2), substr(_dt2,1,4)); 
   Manufacturer1 = upcase(Manufacturer1); 
   Manufacturer2 = upcase(Manufacturer2); 
   if Manufacturer1 = "JOHNSON AND JOHNSON" then Manufacturer1 = "JANSSEN (JOHNSON AND JOHNSON)";
   if Manufacturer1 = "PFIZER" then Manufacturer1 = "PFIZER/BIONTECH";
   if Manufacturer2 = "JOHNSON AND JOHNSON" then Manufacturer2 = "JANSSEN (JOHNSON AND JOHNSON)";
   if Manufacturer2 = "PFIZER" then Manufacturer2 = "PFIZER/BIONTECH";
   drop _:; 
   format Vaccination_Date1 mmddyy10. Vaccination_Date2 mmddyy10.; 
run; 

 

 

All the variable sizes I changed to $50 worked just fine:

proc contents data=new_union; 
run;
Alphabetic List of Variables and Attributes
# Variable Type Len Format
2 Manufacturer1 Char 50  
3 Manufacturer2 Char 50  
4 Vaccination_Code_Id1 Char 50  
5 Vaccination_Code_Id2 Char 50  
7 Vaccination_Date1 Num 8 MMDDYY10.
8 Vaccination_Date2 Num 8 MMDDYY10.
6 id Num 8  
1 ss_other_specify Char 255  

 

Can you share your results?

 

Check out my Jedi SAS Tricks for SAS Users
ballardw
Super User

For which specific variable? You have 4 variables where you might be setting the length to 50.

 

 

You can avoid all of the character to numeric conversion messages you get from using this code:

/*Vax date format*/
Vaccination_Date1_d= mdy(substr(Vaccination_Date1,6,2), substr(Vaccination_Date1,9,2), substr(Vaccination_Date1,1,4)); 
Vaccination_Date2_d= mdy(substr(Vaccination_Date2,6,2), substr(Vaccination_Date2,9,2), substr(Vaccination_Date2,1,4)); 

by using the proper INFORMAT on the entire string.

Data example;
   x='21020715';
   y=input(x,yymmdd8.);
   format y mmddyy10.;
run;

Or if your Vaccination_date variables are simple numbers convert to string and use informat

Data example2;
   x=21020715;
   y=input(put(x,f8.),yymmdd8.);
   format y mmddyy10.;
run;

 

Ksharp
Super User
data have;
set sashelp.class;
run;

proc sql;
alter table have
modify sex char(20),name char(10);
quit;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2912 views
  • 3 likes
  • 6 in conversation