Then post your mock dataset in a data step for easy recreation.
The usual approach for this kind of task is to use retained variables that are set at first., but I wanted to reduce the amount of code.
Can we assume that every CASE_NO has exactly one observation with a blank TFC, and at least one observation with a nonblank TFC?
In that case, here is an approach:
data want;
set have (where=(TFC=' ')) ;
output;
do until (last.case_no) ;
set have (keep=case_no TFC DocType_A where=(TFC > ' ')) ;
output;
end;
run;
Hi Astounding,
Thanks for the reply and your suggestion. But i actually have in total of 52 variables for this table, and i only wish to copy the first 6 variable's data, so is there any other approach where i don't have to use the "keep" statement in line 5 to key in all the other 40+ variables that i want to keep?
@Ant_K wrote:
Hi Astounding,
Thanks for the reply and your suggestion. But i actually have in total of 52 variables for this table, and i only wish to copy the first 6 variable's data, so is there any other approach where i don't have to use the "keep" statement in line 5 to key in all the other 40+ variables that i want to keep?
Do it the other way around and use a drop dataset option.
Sure. Replace keep= with drop= and add the names of the 6 variables that you don't want to replace.
You can probably use the UPDATE statement to do what you want, but you will need to first blank out the values that you don't want anymore. Let's convert your second example into a dataset so we have something to program with.
data have ;
length No 8 Acc_No $8 Agent $20 Src $1
Branch_F $2 Exec $2 Trx $2 DocType_A 8 TFC $1
Case_No $8
;
infile cards dsd dlm='|' truncover ;
input no -- case_no ;
cards;
1|KU1234|Fairview Enterprise|A|KU|KU|KU|2| |ABC6345
2|KL4321|AMG Marine|X|KL|KL|KL|3|C|ABC6345
3|KL4321|AMG Marine|X|KL|KL|KL|4|C|ABC6345
4|KL4322|Macro Trust Fund |X|KL|KL|KL|5|C|ABC6345
5|SB9564|Essentials Dotcom|B|SB|SB|SB|1| |CBA2134
6|KL3214|Visual Optics|X|KL|KL|KL|7| |CBA2134
7|KL3214|Visual Optics|X|KL|KL|KL|8| |CBA2134
;
Now let's make a copy where some of the columns are set to missing when the value of TFC='C'. I will use CALL MISSING() to do this and use a position based variable list.
data for_update ;
set have ;
if tfc='C' then call missing(of Acc_no -- Trx);
run;
Now we can use this new dataset as the transaction dataset in an UPDATE statement. For the master dataset we will use an empty dataset, which we can get by using the dataset option OBS=0. Since you have multiple observations per CASE_NO we will need to add an OUTPUT statement to write an observation after each "transaction" is applied.
data want ;
update have(obs=0) for_update ;
by case_no;
output;
run;
Here are the results.
And here is variable summary from PROC COMPARE.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.