Solved: Re: SAS Retain

Brijesh4sas · Posted 07-17-2021 11:29 PM

Input ID Zip;
101 400201

101 400333

102 650067

103 560103

103 560067

;

How to get output.

101 400201,400333

102 650067

103 560103,560067

PaigeMiller · Posted 07-18-2021 06:26 AM

I feel compelled to repeat advice already given.

Don't re-arrange your data like this. Your programming would be much easier if you did not re-arrange data like this. To quote @ballardw :

And just what do you expect to do with that resulting data set that can't be done with the one you have now? Anything that uses that codelist variable is going to be hard to actually work with for counting, modeling, just about any analysis or graphing?

--
Paige Miller

View solution in original post

ballardw · Posted 07-18-2021 12:44 AM

How many codes do you expect to place in a single variable? You should specify a length for the variable long enough to hold that many values plus the comma separator.

Are there any other variables to be involved in this? If so, what happens with them?

This question gets asked in one form another at least once per week.

If the data is sorted by the ID variable then something like:

data want; 
   set have;
   by id;
   length codelist $ 200.;
   retain codelist;
   if first.id then call missing (codelist);
   codelist = catx(',',codelist,zip);
  i last.id then output;
  keep id codelist;
run;

If the data is not sorted by ID then do so.

This sets the length of the combined variable to 200 characters which will hold about 28 codes. Adjust the length on the length statement to match actual need.

When you use BY group processing SAS creates automatic variables that indicate whether a particular observation is the first or last of a particular by variable grouping and is a 1/0 (true/false) value accessed by using first.variablename and last.variable name.

The code above clears the existing codelist variable when the first observation of the Id is encountered, then adds the zip to the list.

The values are written out to the data set only when the last observation for an id value is encountered.

And just what do you expect to do with that resulting data set that can't be done with the one you have now? Anything that uses that codelist variable is going to be hard to actually work with for counting, modeling, just about any analysis or graphing?

urban58 · Posted 07-18-2021 10:35 AM

So, what would be a better way to structure data like this for for counting, modeling, just about any analysis or graphing?

PaigeMiller · Posted 07-18-2021 11:17 AM

The better way is to keep the data set as it was originally presented. In other words, one observation for each value of ID and ZIP. Not two (or more) zips on one line.

NOTE: All of this depends on what the next step is, what analysis or report or chart is going to be created. Certainly, there could be very rare and unusual cases where you want two (or more) zips on one line, but even then I would be skeptical that this is really needed (as it said, it would be very rare)

--
Paige Miller

PaigeMiller · Posted 07-18-2021 06:26 AM

I feel compelled to repeat advice already given.

Don't re-arrange your data like this. Your programming would be much easier if you did not re-arrange data like this. To quote @ballardw :

And just what do you expect to do with that resulting data set that can't be done with the one you have now? Anything that uses that codelist variable is going to be hard to actually work with for counting, modeling, just about any analysis or graphing?

--
Paige Miller

SAS Retain

Re: SAS Retain

Re: SAS Retain

Re: SAS Retain

Re: SAS Retain

Re: SAS Retain

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away