Hi all,
I have the below dataset and trying to create a new variable (Location2).
looking at every row for each person,
if there is only one row of data per person, then location2 = location
if there are multiple rows of data per person and location is consistent for each person, then location2 = location
if there are multiple rows of data per person but location is not consistent, then location2 = 'Both'
I've tried using a do-loop but it isn't quite working as I want.
Person | date | Location | Location2 |
1001 | 20190629 | DC | DC |
1007 | 20190124 | Maryland | Both |
1007 | 20191105 | DC | Both |
2364 | 20191203 | Maryland | MD |
2364 | 20190405 | Maryland | MD |
3331 | 20190101 | Maryland | MD |
6579 | 20190325 | DC | DC |
6579 | 20190404 | DC | DC |
6579 | 20190416 | DC | DC |
6579 | 20190504 | DC | DC |
6579 | 20190717 | DC | DC |
6579 | 20190807 | DC | DC |
6579 | 20191101 | DC | DC |
1034 | 20190219 | Maryland | Both |
1034 | 20191019 | DC | Both |
1035 | 20190301 | Maryland | Both |
1035 | 20191201 | DC | Both |
You did not provide any rule for converting Maryland into MD.
proc sql;
create table want as
select *
, case when (count(distinct location)<2) then location else 'Both' end as location2
from have
group by person
;
quit;
Results:
Obs Person date Location expected location2 1 1001 2019-06-29 DC DC DC 2 1007 2019-11-05 DC Both Both 3 1007 2019-01-24 Maryland Both Both 4 1034 2019-02-19 Maryland Both Both 5 1034 2019-10-19 DC Both Both 6 1035 2019-12-01 DC Both Both 7 1035 2019-03-01 Maryland Both Both 8 2364 2019-04-05 Maryland MD Maryland 9 2364 2019-12-03 Maryland MD Maryland 10 3331 2019-01-01 Maryland MD Maryland 11 6579 2019-03-25 DC DC DC 12 6579 2019-11-01 DC DC DC 13 6579 2019-08-07 DC DC DC 14 6579 2019-07-17 DC DC DC 15 6579 2019-05-04 DC DC DC 16 6579 2019-04-16 DC DC DC 17 6579 2019-04-04 DC DC DC
Can you please post your data in a data step?
You did not provide any rule for converting Maryland into MD.
proc sql;
create table want as
select *
, case when (count(distinct location)<2) then location else 'Both' end as location2
from have
group by person
;
quit;
Results:
Obs Person date Location expected location2 1 1001 2019-06-29 DC DC DC 2 1007 2019-11-05 DC Both Both 3 1007 2019-01-24 Maryland Both Both 4 1034 2019-02-19 Maryland Both Both 5 1034 2019-10-19 DC Both Both 6 1035 2019-12-01 DC Both Both 7 1035 2019-03-01 Maryland Both Both 8 2364 2019-04-05 Maryland MD Maryland 9 2364 2019-12-03 Maryland MD Maryland 10 3331 2019-01-01 Maryland MD Maryland 11 6579 2019-03-25 DC DC DC 12 6579 2019-11-01 DC DC DC 13 6579 2019-08-07 DC DC DC 14 6579 2019-07-17 DC DC DC 15 6579 2019-05-04 DC DC DC 16 6579 2019-04-16 DC DC DC 17 6579 2019-04-04 DC DC DC
https://communities.sas.com/t5/SAS-Programming/Assign-value-based-on-status/m-p/765914 similar question has been asked.
Here's the more convoluted way by storing min and max variables of each by group in new columns called min_loc and max_loc. When min_loc=max_loc, then you can set location2=location. When they're not equal, you can set it to "Both".
data have;
input person $4. location $20.;
datalines;
1001 DC
1007 Maryland
1007 DC
2364 Maryland
2364 Maryland
3331 Maryland
6579 DC
6579 DC
6579 DC
6579 DC
6579 DC
6579 DC
6579 DC
1034 Maryland
1034 DC
1035 Maryland
1035 DC
;
run;
data min;
set have;
by person;
length min_loc $20;
retain min_loc ;
if first.person then do;
min_loc='';
end;
if first.person then min_loc=location;
run;
proc sort data=min;
by person descending location;
run;
data max;
set min;
by person;
length max_loc location2 $20;
retain max_loc;
if first.person then do;
max_loc='';
end;
if first.person then max_loc=location;
if min_loc ^= max_loc then location2="Both";
if min_loc=max_loc then location2=location;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.