BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tmdgus
Fluorite | Level 6

Hello,

I have dataset example illustrated below:

row#YEARSEXCITYSTATISTICSVALUE
12021..1500

2

2021

1110
32021211499
42021.12750
520211120
62021212750
72021.21300
82021121150
92021221150
102021.22550
112021122549
1220212220

 

'.' refers to the total

I would like to replace the value from row 3 to match with the row 1 value (500) in order to keep consistency on the total. Also, I would like to replace the value row 11 to match it with the row 10 value (550). Is there any code rather than inputting manually to replace values in this situation?

 

Thank you, 

1 ACCEPTED SOLUTION

Accepted Solutions
sbxkoenk
SAS Super FREQ

Hello @tmdgus ,

Since you have told me (in a private mail) that one of the 2 sexes always has a value of 0 if the sexes don't add up to the total, it's even easier.

Here's the code:

(the code can be more compact and elegant than below but I didn't want to make it too complicated)

PROC DATASETS library=WORK NoList;
 delete have: want: / memtype=DATA; run;
QUIT;

data have(drop=CITY STATISTICS);
LENGTH rownr YEAR 8
       SEX CITY STATISTICS $ 1 citystat $ 3
       VALUE 8;
input rownr YEAR SEX $ CITY $ STATISTICS $ VALUE;
citystat=CITY !! '|' !! STATISTICS;
if missing(sex) then sex='0';
cards;
1	2021	.	1	1	500
2   2021	1	1	1	0
3	2021	2	1	1	499
4	2021	.	1	2	750
5	2021	1	1	2	0
6	2021	2	1	2	750
7	2021	.	2	1	300
8	2021	1	2	1	150
9	2021	2	2	1	150
10	2021	.	2	2	550
11	2021	1	2	2	549
12	2021	2	2	2	0
;
run;

PROC TRANSPOSE data=have 
               out=have_trp1(drop=_NAME_)
               prefix=sex_;
 by YEAR citystat;
 id sex;
 var value;
run; 

data have_trp2(drop=deviance);
 set have_trp1;
 deviance=(sex_0 - SUM(sex_1,sex_2));
 if deviance ^= 0 then do;
  if      sex_1=0 then sex_2=(sex_2 + deviance);
  else if sex_2=0 then sex_1=(sex_1 + deviance);
  else;
                       end;
run;

data want(drop=i sex_0 sex_1 sex_2);
LENGTH YEAR 8
       SEX $ 1 citystat $ 3
       VALUE 8;
 set have_trp2;
 array sexarray{3} sex_0 sex_1 sex_2;
 do i=1 to dim(sexarray);
  sex=compress(vname(sexarray(i)),'sex_');
  if sex='0' then sex='.';
  value=sexarray(i);
  output;
 end;
run;
 /* end of program */

Koen

View solution in original post

7 REPLIES 7
sbxkoenk
SAS Super FREQ

Hello,

See below for two possible solutions.

These are, of course, 2 hard-coded solutions!!

Ideally, you would programmatically look at what the discrepancy is with the total and then proportionally increase (or decrease) the women and men until the total for both is reached. ****That is also easy to do.****

 

data have;
input rownr YEAR SEX CITY STATISTICS VALUE;
cards;
1	2021	.	.	1	500
2   2021	1	1	1	0
3	2021	2	1	1	499
4	2021	.	1	2	750
5	2021	1	1	2	0
6	2021	2	1	2	750
7	2021	.	2	1	300
8	2021	1	2	1	150
9	2021	2	2	1	150
10	2021	.	2	2	550
11	2021	1	2	2	549
12	2021	2	2	2	0
;
run;

data want1;
 set have;
 by rownr year sex;
 if      _N_=3  then value=500;
 else if _N_=11 then value=550;
 else;
run;

data want2(drop=valueretain);
 set have;
 by rownr year sex;
 retain valueretain .;
 if      _N_=1  then valueretain=value;
 else if _N_=3  then do; value=valueretain; valueretain=.; end;
 else if _N_=10 then valueretain=value;
 else if _N_=11 then do; value=valueretain; valueretain=.; end;
 else;
run;

proc compare data=want1 compare=want2;
run;
/* end of program */

 

Cheers,

Koen

tmdgus
Fluorite | Level 6

Thanks for the code. Would you share the sample of code for the last sentence you mentioned?

(Ideally, you would programmatically look at what the discrepancy is with the total and then proportionally increase (or decrease) the women and men until the total for both is reached. ****That is also easy to do.****)

 

Thank you,

sbxkoenk
SAS Super FREQ

Hello @tmdgus ,

Since you have told me (in a private mail) that one of the 2 sexes always has a value of 0 if the sexes don't add up to the total, it's even easier.

Here's the code:

(the code can be more compact and elegant than below but I didn't want to make it too complicated)

PROC DATASETS library=WORK NoList;
 delete have: want: / memtype=DATA; run;
QUIT;

data have(drop=CITY STATISTICS);
LENGTH rownr YEAR 8
       SEX CITY STATISTICS $ 1 citystat $ 3
       VALUE 8;
input rownr YEAR SEX $ CITY $ STATISTICS $ VALUE;
citystat=CITY !! '|' !! STATISTICS;
if missing(sex) then sex='0';
cards;
1	2021	.	1	1	500
2   2021	1	1	1	0
3	2021	2	1	1	499
4	2021	.	1	2	750
5	2021	1	1	2	0
6	2021	2	1	2	750
7	2021	.	2	1	300
8	2021	1	2	1	150
9	2021	2	2	1	150
10	2021	.	2	2	550
11	2021	1	2	2	549
12	2021	2	2	2	0
;
run;

PROC TRANSPOSE data=have 
               out=have_trp1(drop=_NAME_)
               prefix=sex_;
 by YEAR citystat;
 id sex;
 var value;
run; 

data have_trp2(drop=deviance);
 set have_trp1;
 deviance=(sex_0 - SUM(sex_1,sex_2));
 if deviance ^= 0 then do;
  if      sex_1=0 then sex_2=(sex_2 + deviance);
  else if sex_2=0 then sex_1=(sex_1 + deviance);
  else;
                       end;
run;

data want(drop=i sex_0 sex_1 sex_2);
LENGTH YEAR 8
       SEX $ 1 citystat $ 3
       VALUE 8;
 set have_trp2;
 array sexarray{3} sex_0 sex_1 sex_2;
 do i=1 to dim(sexarray);
  sex=compress(vname(sexarray(i)),'sex_');
  if sex='0' then sex='.';
  value=sexarray(i);
  output;
 end;
run;
 /* end of program */

Koen

tmdgus
Fluorite | Level 6
Thank you very much!
PeterClemmensen
Tourmaline | Level 20

I don't fully understand this. Why those exact observations?

sbxkoenk
SAS Super FREQ

Hello @PeterClemmensen ,

I see it like this.

For every CITY STATISTICS combination, there's a total (first row with sex=.) and a sex=1 value and a sex=2 value. The sex=1 value + sex=2 value should equal the total.

I think the CITY value for row 1 should be 1 instead of . (missing).

The original questioner can contradict me of course. Do not take my word for it in this matter. 😏😏

Cheers,

Koen

tmdgus
Fluorite | Level 6

@sbxkoenk  Yes, you are correct. The city for the first row should be 1.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 1070 views
  • 3 likes
  • 3 in conversation