SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
claremc
Obsidian | Level 7

Hello, 

I am trying to cut down the length of my variables to save disc space and the usual length statement I use does not seem to be working. 

 

Here is my code: 

 

 

data new_union;
length ss_other_specify $255. Manufacturer1 $50. Manufacturer2 $50. Vaccination_Code_Id1 $50. Vaccination_Code_Id2 $50.;
set union_combine;

/*Vax code format*/
if Vaccination_Code_Id1 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
then Vaccination_Code_Id1 = ""; 
if Vaccination_Code_Id2 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
then Vaccination_Code_Id2 = ""; 

/*Vax date format*/
Vaccination_Date1_d= mdy(substr(Vaccination_Date1,6,2), substr(Vaccination_Date1,9,2), substr(Vaccination_Date1,1,4)); 
Vaccination_Date2_d= mdy(substr(Vaccination_Date2,6,2), substr(Vaccination_Date2,9,2), substr(Vaccination_Date2,1,4)); 
drop Vaccination_Date1; 
drop Vaccination_Date2; 

format Vaccination_Date1_d mmddyy10. Vaccination_Date2_d mmddyy10.; 
run; 

data new_union; 
set new_union; 

rename Vaccination_Date1_d = Vaccination_Date1; 
rename Vaccination_Date2_d = Vaccination_Date2; 
format Vaccination_Date1 mmddyy10. Vaccination_Date2 mmddyy10.; 

/*Manufacturer reformat*/
Manufacturer1 = upcase(Manufacturer1); 
Manufacturer2 = upcase(Manufacturer2); 
if Manufacturer1 = "JOHNSON AND JOHNSON" then Manufacturer1 = "JANSSEN (JOHNSON AND JOHNSON)";
if Manufacturer1 = "PFIZER" then Manufacturer1 = "PFIZER/BIONTECH";
if Manufacturer2 = "JOHNSON AND JOHNSON" then Manufacturer2 = "JANSSEN (JOHNSON AND JOHNSON)";
if Manufacturer2 = "PFIZER" then Manufacturer2 = "PFIZER/BIONTECH";

/* if case_id in ("100734763","100748803") then Manufacturer2 = "" and Vaccination_Date2 = . and Vaccination_Code_Id2 = "";  */
/* if Manufacturer2 = "0" then Manufacturer2 = "";  */
run; 





 

After I run proc contents on the "new_union" dataset, the variable length is still $350, not $50 like I would like. I am running 2 separate data steps because I've had trouble with keeping them all in one when I use drop and rename. 

 

Thanks for any help! 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Rydhm
Obsidian | Level 7

When you create a new dataset from existing dataset, then you should create new variable names. Length doesn't change on existing variables. 

 

In your case, Vaccination_Code_Id1 or 2 both are in union_combine dataset. So try creating somewhat similar but different names in length statement as well as using them in code.

View solution in original post

6 REPLIES 6
Rydhm
Obsidian | Level 7

When you create a new dataset from existing dataset, then you should create new variable names. Length doesn't change on existing variables. 

 

In your case, Vaccination_Code_Id1 or 2 both are in union_combine dataset. So try creating somewhat similar but different names in length statement as well as using them in code.

claremc
Obsidian | Level 7
Thank you! That worked
Astounding
PROC Star

Your program should have been effective at cutting down the length.  Are you sure you ran the PROC CONTENTS on the new data set, not the old?

SASJedi
Ammonite | Level 13

I made some fake data like this:

 

data union_combine;
   length id 8 ss_other_specify $255. Vaccination_Date1 $10
          Manufacturer1 $350. Manufacturer2 $350. Vaccination_Code_Id1 $350. Vaccination_Code_Id2 $350.;
   retain Manufacturer1 "Big Pharma 1" Manufacturer2 "Big Pharma 2";
   do id=1 to 4;
      Vaccination_Code_Id1 =choosec(ID,"No vaccine administered", "Other", "Unknown vaccine or immune globulin"," "); 
      Vaccination_Code_Id2 =choosec(abs(5-ID),"No vaccine administered", "Other", "Unknown vaccine or immune globulin"," "); 
      Vaccination_Date1='----------';
      Vaccination_Date2='----------';
      substr(Vaccination_Date1,6,2)='07';
      substr(Vaccination_Date1,9,2)=put(ID,z2.);
      substr(Vaccination_Date1,1,4)='2021'; 
      substr(Vaccination_Date2,6,2)='07';
      substr(Vaccination_Date2,9,2)=put(ID+14,z2.);
      substr(Vaccination_Date2,1,4)='2021'; 
      Manufacturer1 = choosec(ID,"JOHNSON AND JOHNSON","PFIZER","JOHNSON AND JOHNSON","PFIZER");
      Manufacturer2 = choosec(ID,"PFIZER","JOHNSON AND JOHNSON","PFIZER","JOHNSON AND JOHNSON");
      output;
   end;
run;

 

Then I ran this single data step that did all the conversions you were doing:

 

data new_union;
   length ss_other_specify $255. Manufacturer1 $50. Manufacturer2 $50. Vaccination_Code_Id1 $50. Vaccination_Code_Id2 $50.;
   set union_combine (rename=(Vaccination_Date1 = _dt1 Vaccination_Date2 = _dt2)); 
   /*Vax code format*/
   if Vaccination_Code_Id1 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
   then Vaccination_Code_Id1 = ""; 
   if Vaccination_Code_Id2 in ("No vaccine administered", "Other", "Unknown vaccine or immune globulin") 
   then Vaccination_Code_Id2 = ""; 

   /*Vax date format*/
   Vaccination_Date1= mdy(substr(_dt1,6,2), substr(_dt1,9,2), substr(_dt1,1,4)); 
   Vaccination_Date2= mdy(substr(_dt2,6,2), substr(_dt2,9,2), substr(_dt2,1,4)); 
   Manufacturer1 = upcase(Manufacturer1); 
   Manufacturer2 = upcase(Manufacturer2); 
   if Manufacturer1 = "JOHNSON AND JOHNSON" then Manufacturer1 = "JANSSEN (JOHNSON AND JOHNSON)";
   if Manufacturer1 = "PFIZER" then Manufacturer1 = "PFIZER/BIONTECH";
   if Manufacturer2 = "JOHNSON AND JOHNSON" then Manufacturer2 = "JANSSEN (JOHNSON AND JOHNSON)";
   if Manufacturer2 = "PFIZER" then Manufacturer2 = "PFIZER/BIONTECH";
   drop _:; 
   format Vaccination_Date1 mmddyy10. Vaccination_Date2 mmddyy10.; 
run; 

 

 

All the variable sizes I changed to $50 worked just fine:

proc contents data=new_union; 
run;
Alphabetic List of Variables and Attributes
# Variable Type Len Format
2 Manufacturer1 Char 50  
3 Manufacturer2 Char 50  
4 Vaccination_Code_Id1 Char 50  
5 Vaccination_Code_Id2 Char 50  
7 Vaccination_Date1 Num 8 MMDDYY10.
8 Vaccination_Date2 Num 8 MMDDYY10.
6 id Num 8  
1 ss_other_specify Char 255  

 

Can you share your results?

 

Check out my Jedi SAS Tricks for SAS Users
ballardw
Super User

For which specific variable? You have 4 variables where you might be setting the length to 50.

 

 

You can avoid all of the character to numeric conversion messages you get from using this code:

/*Vax date format*/
Vaccination_Date1_d= mdy(substr(Vaccination_Date1,6,2), substr(Vaccination_Date1,9,2), substr(Vaccination_Date1,1,4)); 
Vaccination_Date2_d= mdy(substr(Vaccination_Date2,6,2), substr(Vaccination_Date2,9,2), substr(Vaccination_Date2,1,4)); 

by using the proper INFORMAT on the entire string.

Data example;
   x='21020715';
   y=input(x,yymmdd8.);
   format y mmddyy10.;
run;

Or if your Vaccination_date variables are simple numbers convert to string and use informat

Data example2;
   x=21020715;
   y=input(put(x,f8.),yymmdd8.);
   format y mmddyy10.;
run;

 

Ksharp
Super User
data have;
set sashelp.class;
run;

proc sql;
alter table have
modify sex char(20),name char(10);
quit;

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4182 views
  • 3 likes
  • 6 in conversation