agbpilot
Obsidian | Level 7

Hi, I'm working on a sample data set using property information.  Some of the property records have missing fields like 'prpty_city' and 'prpty_zip'.  I'm attempting to associate similar property records with those that are missing these two fields by creating a field named 'prpty_street_bracket'.  I'm also trying to use retain to bring down the 'prpty_city' and 'prpty_zip' across observations that share the same 'prpty_street_bracket'.  However, I don't know exactly how retain works with character variables.  I've successfully used retain with numeric fields but not so much with character fields.  Here's my sample code.  Any ideas/suggestions would be greatly appreciated.

 

Andy B.

 

PROC SORT DATA=COLLIN;
   BY prpty_street_bracket DESCENDING prpty_city DESCENDING prpty_zip;
RUN;

DATA COLLIN;
   SET COLLIN;
   BY prpty_street_bracket DESCENDING prpty_city DESCENDING prpty_zip;
   FORMAT PCITY $50. PZIP $5.;

   RETAIN PCITY PZIP;

   /* Blank values sort last under DESCENDING, so a nonmissing city (if any) comes first in each bracket */
   IF FIRST.prpty_street_bracket THEN DO;
      PCITY = prpty_city;
      PZIP = prpty_zip;
   END;
RUN;

 

Accepted Solution
ballardw
Super User

@agbpilot wrote:

Hi, thanks for your reply and suggestion.  In my process since I am re-using the same data set name "COLLIN" across multiple steps, I ended up just having to run the program in its entirety and the output is now working as desired.

 

One question about retain.  For numeric values, I typically initialize the value to zero.  Here in this example, I'm not explicitly initializing the character value.  However, the code is producing the desired outcome, so I've left it alone.  But if I were to explicitly initialize these character variables, I guess I would want to initialize them to missing or ' '.  Is there a best practice for initializing character variables using retain?  I.e., would a character variable with a max length of $5. have to be initialized in the retain statement as '     ' ?  Or could I just define the length of this variable being retained in the format statement as $5.?

 

Andy 


Default behavior for Retain and character values is to initialize to missing.

If I have a specific value that I need in my data I initialize it in the Retain statement.

I tend to define the length explicitly before any Retain statement. The default format for any character variable defined in a Length statement would be $nn., where nn is the length, so the format is generally not needed. Order of statements matters: if you retain and initialize a character variable without a prior Length statement, the variable gets the length of the initial value.  Since counting multiple spaces to initialize blanks is awkward at best, I leave that to the Length statement. Plus it is easier to see from the code what the intended length was, because counting spaces ....
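
To illustrate the point about statement order, here is a minimal sketch. The variable FLAG and its initial value 'NA' are made up for the example; PZIP follows the naming in the question. VLENGTH reports the length each variable was given when the step was compiled.

```sas
data _null_;
   /* LENGTH first: PZIP is 5 characters wide and starts out missing (blank) */
   length pzip $5;
   retain pzip;

   /* No LENGTH beforehand: the initial value 'NA' fixes the length at 2 */
   retain flag 'NA';

   len_pzip = vlength(pzip);
   len_flag = vlength(flag);
   put len_pzip= len_flag=;
run;
```

If FLAG were later assigned a longer value such as 'UNKNOWN', it would be truncated to two characters, which is why declaring the length up front is the safer habit.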


Replies
ballardw
Super User

Treat the value exactly the same as you would a numeric variable. Assign a value when you need it, set it missing when you don't need it.

 

Is the code you show not working somehow? If so describe what is not working, best to provide data example and possibly the log.
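
As a rough sketch of "assign a value when you need it, set it missing when you don't", the step below resets the retained values at the start of each bracket and then fills them from the first nonmissing value it meets. It assumes COLLIN is already sorted by prpty_street_bracket and does not already contain PCITY or PZIP (values read by SET would overwrite the retained ones on every iteration). The output name collin_filled is made up for the example.

```sas
data collin_filled;
   set collin;
   by prpty_street_bracket;

   /* Declare lengths before RETAIN so the widths are explicit */
   length pcity $50 pzip $5;
   retain pcity pzip;

   /* New bracket: clear whatever was carried over from the previous one */
   if first.prpty_street_bracket then call missing(pcity, pzip);

   /* Pick up a value only when the current record actually has one */
   if not missing(prpty_city) then pcity = prpty_city;
   if not missing(prpty_zip)  then pzip  = prpty_zip;
run;
```

Writing to a new data set rather than overwriting COLLIN also means the step can be rerun on its own without first rebuilding the input.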

agbpilot
Obsidian | Level 7

Hi, thanks for your reply and suggestion.  In my process since I am re-using the same data set name "COLLIN" across multiple steps, I ended up just having to run the program in its entirety and the output is now working as desired.

 

One question about retain.  For numeric values, I typically initialize the value to zero.  Here in this example, I'm not explicitly initializing the character value.  However, the code is producing the desired outcome, so I've left it alone.  But if I were to explicitly initialize these character variables, I guess I would want to initialize them to missing or ' '.  Is there a best practice for initializing character variables using retain?  I.e., would a character variable with a max length of $5. have to be initialized in the retain statement as '     ' ?  Or could I just define the length of this variable being retained in the format statement as $5.?

 

Andy 

Astounding
PROC Star

Here is a program that will collapse all data for a prpty_street_bracket into a single observation.  This may or may not be an acceptable solution, so try it and see if you like it.  For ALL variables in the data set, it locates the last nonmissing value.

 

proc sort data=collin;
   by prpty_street_bracket descending prpty_city descending prpty_zip;
run;

data maybe_i_will_like_this;
   update collin (obs=0) collin;
   by prpty_street_bracket;
run;
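
To see what the UPDATE approach does, here is a small self-contained sketch. The data set names demo and demo_collapsed and all of the values in it are made up for illustration; only the variable names come from the thread.

```sas
/* Made-up sample: each bracket has city and zip spread over two records */
data demo;
   infile datalines dsd truncover;
   input prpty_street_bracket :$20. prpty_city :$50. prpty_zip :$5.;
   datalines;
100-199 MAIN ST,PLANO,75074
100-199 MAIN ST,,
200-299 OAK DR,,75002
200-299 OAK DR,ALLEN,
;

proc sort data=demo;
   by prpty_street_bracket;
run;

/* Empty master (obs=0) plus the full data as transactions:           */
/* missing transaction values do not overwrite what is already there */
data demo_collapsed;
   update demo (obs=0) demo;
   by prpty_street_bracket;
run;

proc print data=demo_collapsed;
run;
```

The result should have one observation per bracket, each carrying the nonmissing city and zip found within that bracket. If every original record needs to be kept, the collapsed values would have to be merged back onto the detail data by prpty_street_bracket.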

 

 

 
