BookmarkSubscribeRSS Feed
SP01
Obsidian | Level 7

Hello All,

I have a task to create a working population dataset. My working dataset A has row level information by age, race, sex, and county. I need to count records by these variables and then add required records from dataset B to have full population. Please offer any guidance or suggestion on how to do this. Thank you!

I am attaching the example in the spreadsheet here. 

SourcesTotal number Category
Census100Gender: Female; Age: 20; Race: W; County: Lancaster
Working dataset67Gender: Female; Age: 20; Race: W; County: Lancaster
Empty records need33Gender: Female; Age: 20; Race: W; County: Lancaster
Census95Gender: Male; Age: 57; Race: B; County: Adams
Working dataset50Gender: Male; Age: 57; Race: B; County: Adams
Empty records need45Gender: Male; Age: 57; Race: B; County: Adams
7 REPLIES 7
PaigeMiller
Diamond | Level 26

Most of us refuse to use data in Excel, because downloading Microsoft Office documents is a security risk.

 

Please provide a portion of your data as working SAS data step code (instructions), and not in other formats including Excel or screen captures.

--
Paige Miller
SP01
Obsidian | Level 7

Thank you for the response. Below is the structure I need to have from working

 

SourcesTotal number Category
Census100Gender: Female; Age: 20; Race: W; County: Lancaster
Working dataset67Gender: Female; Age: 20; Race: W; County: Lancaster
Empty records need33Gender: Female; Age: 20; Race: W; County: Lancaster
Census95Gender: Male; Age: 57; Race: B; County: Adams
Working dataset50Gender: Male; Age: 57; Race: B; County: Adams
Empty records need45Gender: Male; Age: 57; Race: B; County: Adams
PaigeMiller
Diamond | Level 26

I have a task to create a working population dataset. My working dataset A has row level information by age, race, sex, and county. I need to count records by these variables and then add required records from dataset B to have full population.

 

What do you mean by "add required records from data set B"? You don't present anything that looks like data set B. Add to what? Where do "empty records" enter into this, that isn't explained either.

 

Also, repeating my request with emphasis

Please provide a portion of your data as working SAS data step code (instructions), and not in other formats including Excel or screen captures.

 

--
Paige Miller
SP01
Obsidian | Level 7

My apologies for not being clear, Paige!

 

data HaveA
datalines ID age gender   race            ethnicity             county
                1  12    M    AmericanIndian Non-Hispanic   Lancaster
                2   26   M             UNK             Hispanic       Douglas
                3   34   M             UNK              UNK            Sarpy
                4   41    F             Asian        Non-Hispanic   Douglas

data Have B is census dataset and has complete age, race , se, county, ethnicity information.
So if the count of combination of Age = 12, Gender = M, Race = AmericanIndian, Ethnicity = Non-hispanic, county = Lancaster is 60 and for the same combination in census file (HaveB) we have 100, I need to add 40 records with this combination to balance.
Lets say I have 1.4 million of these records. I need to count these by each combination type and then fill or add empty records or records from census population with complete population information. So, I would be completing 1.4 million from HaveA to 1.9 million from HaveB by using counts from these combinations. I hope I make some sense here. Thank you again.

Sajid01
Meteorite | Level 14

Hello @SP01 
It appears you are trying to join two datasets using common variable.
Consider using data step merge using  common by variables.
Alternatively consider using Proc SQL join.

ballardw
Super User

What does data set B look like? You only provided half, it not less, of the problem description.

 

I actually don't understand what your final result is supposed to look like and/or any rules that would be used to make it.


Note: in the long run you will find any data that places more than one value into a variable is extremely awkward to work with. So those Category values should likely be split into 4 (or possibly more, hard to tell from partial examples) variable, one each for Gender, Age, Race and County.

Tom
Super User Tom
Super User

I don't understand what you mean.  Why would need to create observations?

If you have a count of 65 and another count of 100 to figure out that the difference is 35 you just need to subtract.

proc sql;
create table want as 
select category
        , a.count as CENSUS
        , b.count as PRESENT
        , a.count - b.count as NOT_PRESENT
from census a
inner join present b
on a.category = b.category
;
quit;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 7 replies
  • 922 views
  • 0 likes
  • 5 in conversation