BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
renjithr
Quartz | Level 8

Hi,

 

Could you please let me know if there is any way to conditionally check the difference between rows?

 

For eg:

I have the below dataset(Have), where I need to calculate the shot differences between WHITE and BLACK(race) in corresponding states. i.e calculate the difference between WHITE and BLACK in California then Arizona..etc(dataset NEED below):

 

Dataset -Have

 

   Race  Shot_rcvd  Area

WHITE 10             CALIFORNIA

WHITE 20             ARIZONA

BLACK 21            CALIFORNIA

BALCK 5              ARIZONA

 

Dataset Need

 

 Race  Shot_rcvd  Area                     Diff_in_shots

WHITE 10             CALIFORNIA         -11

WHITE 20             ARIZONA               15

BLACK 21            CALIFORNIA           11

BALCK 5              ARIZONA                 -15

 

Please let me know your suggestions.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21
data have;
input RACE :$12. STATE :$12. vaccinated_cnt DATE :anydtdte.;
format date yymm7.;
datalines;
WHITE	CALIFORNIA	2	Sep2016
WHITE	ARIZONA	1	Sep2016
WHITE	TEXAS	50	Oct2016
BLACK	CALIFORNIA	1	Sep2016
BLACK	ARIZONA	5	Sep2016
BLACK	TEXAS	10	Oct2016
;

proc sql;
create table cnt_diff as
select
    a.state, 
    a.date, 
    a.race,
    b.race as other_race,
    a.vaccinated_cnt - b.vaccinated_cnt as vaccinated_diff
from 
    have as a inner join
    have as b on a.state=b.state and a.date=b.date and a.race ne b.race
order by state, date, race, other_race;
select * from cnt_diff;
quit;
PG

View solution in original post

9 REPLIES 9
ballardw
Super User
Are there always two white followed by two black with the same two states repeating the same order? Or do you have some of these where there are 3 or more states involved? Any with only one state?
renjithr
Quartz | Level 8

Hi,

 

No, it is not going to be the same..there are more states and the numbers of white and black are different too.For some states there will be two black and one white or one black and 4 white or 2 black and 1 white..etc.

ballardw
Super User

Your example data should include at least one case of each of the types/ combinations of things going on.

If there are dates of events to consider that may be a good idea to include.

 

If this data is the result of summarizing some data it may be better to post some examples from that data as sometimes raw data makes more sense than summarized for certain forms of manipulation.

renjithr
Quartz | Level 8

Sure, here is a snapshot of the real data:

 

RACE STATE vaccinated_cnt DATE
WHITE CALIFORNIA 2 Sep2016
WHITE ARIZONA 1 Sep2016
WHITE TEXAS 50 Oct2016
BLACK CALIFORNIA 1 Sep2016
BLACK ARIZONA 5 Sep2016
BLACK TEXAS 10 Oct2016
Reeza
Super User

Maybe I'm misunderstanding but a direct comparison doesn't make sense. Wouldn't you need rates to compare? Raw numbers don't provide the correct context. 

renjithr
Quartz | Level 8

Trying to figure out the difference in the number of shots white and blacks received in each state.

PGStats
Opal | Level 21
data have;
input RACE :$12. STATE :$12. vaccinated_cnt DATE :anydtdte.;
format date yymm7.;
datalines;
WHITE	CALIFORNIA	2	Sep2016
WHITE	ARIZONA	1	Sep2016
WHITE	TEXAS	50	Oct2016
BLACK	CALIFORNIA	1	Sep2016
BLACK	ARIZONA	5	Sep2016
BLACK	TEXAS	10	Oct2016
;

proc sql;
create table cnt_diff as
select
    a.state, 
    a.date, 
    a.race,
    b.race as other_race,
    a.vaccinated_cnt - b.vaccinated_cnt as vaccinated_diff
from 
    have as a inner join
    have as b on a.state=b.state and a.date=b.date and a.race ne b.race
order by state, date, race, other_race;
select * from cnt_diff;
quit;
PG
Ksharp
Super User
data have;
input  Race $ Shot_rcvd  Area : $20.;
cards;
WHITE 10             CALIFORNIA
WHITE 20             ARIZONA
BLACK 21            CALIFORNIA
BLACK 5              ARIZONA
;
run;
data white(index=(area)) black(index=(area));
 set have;
 if race='WHITE' then output white;
  else if race='BLACK' then output black;
run;
data want;
 merge white black(keep=shot_rcvd area rename=(shot_rcvd=_shot));
 by area;
 diff=shot_rcvd-_shot;
 output;
 diff=-diff;
 race='BLACK';
 output;
 drop _shot;
run;
renjithr
Quartz | Level 8

Thank you all for your suggestions!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1141 views
  • 0 likes
  • 5 in conversation