Hi All,
I am in need of your assistance please. I have data like this:
Chararacteristic | |
White only: | |
Below poverty | |
At or above poverty | |
Black or African American only: | |
Below poverty | |
At or above poverty | |
Hispanic or Latino: | |
Below poverty | |
At or above poverty |
But I am trying to get my data like below. Any suggestions would be greatly appreciated:
White only: | Below Poverty |
White only: | At or above poverty |
Black or African American only: | Below Poverty |
Black or African American only: | At or above poverty |
Hispanic or Latino: | Below Poverty |
Hispanic or Latino: | At or above poverty |
Hi @luvscandy27
Stealing the part of the sample code from @A_Kh, here is one way how you can get what you looking for
/* Create sample data */
data have;
infile cards dlm=',' truncover;
length Chararacteristic $100;
input Chararacteristic;
cards;
White only:,
Below poverty,
At or above poverty,
Black or African American only:,
Below poverty,
At or above poverty,
Hispanic or Latino:,
Below poverty,
At or above poverty
;
run;
/* Create a Format that covers all sub-categories of interest regardless of their main characteristics */
proc formats lib=work;
value $subs
'Below poverty' = '*'
'At or above poverty' = '*'
;
run;
data want (DROP=Chararacteristic);
length main_char $40 sub_char $20;
retain main_char;
set have;
if (PUT(Chararacteristic,$subs.)='*') then sub_char = Chararacteristic;
else main_char = Chararacteristic;
if sub_char ne ''; /* Remove header row (Main_char only) */
run;
Hope this helps
One way to do so is creating two temporary variables and mapping values from the source variable there using RETAIN;
eg;
data have;
infile cards dlm=',' truncover;
length Chararacteristic $100;
input Chararacteristic;
cards;
White only:,
Below poverty,
At or above poverty,
Black or African American only:,
Below poverty,
At or above poverty,
Hispanic or Latino:,
Below poverty,
At or above poverty
;
proc print; run;
data want;
set have;
length temp1 temp2 $100;
retain temp1;
if find(Chararacteristic, ':') gt 0 then temp1= Chararacteristic;
else temp2=Chararacteristic;
if temp2 ne '';
drop Chararacteristic;
proc print;run;
I'm sure the solution from @A_Kh works, but question for @luvscandy27: are you 100% sure that there will always be a colon at the end of the text that you want to stay in column 1 and never a colon in the text you want to move to column 2?
There will not always be a colon at the end. This is just a portion of the dataset. The dataset has several situations
like the one I posted. I was hoping if i got a solution, I would be able to modify the code and apply the code to similar situations
in the dataset.
Please explain the structure of the data to us, so we can help provide a solution. What is the proper indication in the text that indicates text that is to stay in column 1, or that indicates text that should be moved to column 2? Please provide sample data that is more realistic to your actual problem.
Below please see the a sample of the table:
Source in PDF: https://www.cdc.gov/nchs/data/hus/2020-2021/VaxCh.pdf
Source in Excel: https://ftp.cdc.gov/pub/Health_Statistics/NCHS/Publications/Health_US/hus20-21tables/VaxCh.xlsx
There is not an identification variable which complicates thing even more.
The complete table is segmented by types of vaccinations:
Combined 7-vaccine series
DT\PD\TDTap (4 doses or more)
Hepatitis A (2 doses or more)
Hepatitis B (3 doses or more)
Hib (full series)
Measles, mumps, rubella (1 dose or more)
PCV (4 doses or more)
Polio (3 doses or more)
Rotavirus vaccine
Varicella (1 dose or more)
@luvscandy27 wrote:
The complete table is segmented by types of vaccinations:
Combined 7-vaccine series
DT\PD\TDTap (4 doses or more)
Hepatitis A (2 doses or more)
Hepatitis B (3 doses or more)
Hib (full series)
Measles, mumps, rubella (1 dose or more)
PCV (4 doses or more)
Polio (3 doses or more)
Rotavirus vaccine
Varicella (1 dose or more)
I don't see how this relates to the problem you asked about. Please explain this in more detail.
Perhaps you can come up with some rules, in words, that identify what text remains in column 1 and what text should move to column 2 (and is there a column 3 here?)
Also, I think we would need to see how this full table gets re-arranged, including the birth year columns.
The more you can tell us about what you want, the better chances we will have to create such code.
Thank you. Do you know of anyway I can remove?
I removed it for you - just take slightly different screenshots in the future. Honestly, for this, I would just clean up the file manually and the excel is messy so you'd probably need to reference the PDF which is formatted correctly anyways 😞
PS. please double check I have the links correctly and the right file attached. If not, you can click the side of the post and edit it as needed.
Thank you!
That looks like a report. Do you have it as a text file you are reading in?
Do you really have it as a SAS dataset?
Or do you have some other dataset that you used to produce that report? If so what does that dataset look like. What code did you use to produce that report?
Hi @luvscandy27
Stealing the part of the sample code from @A_Kh, here is one way how you can get what you looking for
/* Create sample data */
data have;
infile cards dlm=',' truncover;
length Chararacteristic $100;
input Chararacteristic;
cards;
White only:,
Below poverty,
At or above poverty,
Black or African American only:,
Below poverty,
At or above poverty,
Hispanic or Latino:,
Below poverty,
At or above poverty
;
run;
/* Create a Format that covers all sub-categories of interest regardless of their main characteristics */
proc formats lib=work;
value $subs
'Below poverty' = '*'
'At or above poverty' = '*'
;
run;
data want (DROP=Chararacteristic);
length main_char $40 sub_char $20;
retain main_char;
set have;
if (PUT(Chararacteristic,$subs.)='*') then sub_char = Chararacteristic;
else main_char = Chararacteristic;
if sub_char ne ''; /* Remove header row (Main_char only) */
run;
Hope this helps
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.