Hi guys,
I'm very new to SAS and trying to add an additional column based on a logic statement. I used this code:
data practice.new;
if ageyr <= 8 then young="Y";
else young="N";
run;
It ended up replacing my entire column with 2 columns: ageyr and another column called young. What exactly happened? How do I make sure that everything else is not changed?
Thanks!
What entire column are you referring to? 🙂 And can you provide some sample data?
ageyr is the column I'm referring to.
I want to create a new column called young that is just binary, Y or N.
Is this what you want?
data new;
set have;
if ageyr <= 8 then young="Y";
else young="N";
drop ageyr;
run;
@byeh2017 wrote:
data practice.new;
if ageyr <= 8 then young="Y";
else young="N";
run;
data practice.new; - this statement creates a data set named 'new' in the 'practice' library.
If practice.new is your source data, you need to name practice.new in the SET statement so that SAS reads data from it.
Let me repeat the most important comment you received so far.
No SET statement = no incoming data
So you created PRACTICE.NEW, but based upon no incoming data (based upon only your logic statements). If you intended to make changes to PRACTICE.NEW, it is gone. You would need to re-create it.
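To see why the result had only ageyr and young, here is a minimal sketch of the same step written to a scratch data set. The name WORK.NO_INPUT is made up for illustration. With no SET statement the step runs once, ageyr is created as an uninitialized (missing) numeric variable, and because SAS treats a missing numeric value as smaller than any number, the condition ageyr <= 8 is true.

```sas
/* No SET statement, so nothing is read from any existing data set. */
/* WORK.NO_INPUT is a hypothetical name used only for this sketch.  */
data work.no_input;
  /* ageyr has never been assigned, so it is missing (.).           */
  /* Missing compares as less than any number, so this is true.     */
  if ageyr <= 8 then young = "Y";
  else young = "N";
run;

/* Expect one observation with two variables: ageyr = . and young = Y */
proc print data=work.no_input;
run;
```

The log should also carry a note that the variable ageyr is uninitialized, which is a useful hint that the step had no incoming data.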
Thanks.
If I wanted to only modify the dataset I'm referring to, do I just do this?
data practice.new1;
set practice.new1;
if ageyr <= 8 then young="Y";
else young="N";
run;
Yes, this code will add new variable YOUNG to the dataset, with value based on AGEYR.
That would work. However, note that by running the program without the SET statement, you have already wiped out your original data set. You still need to re-create it first.
Yes, this should work, but until you become much more comfortable with SAS I would recommend creating new data sets, because you do not actually "add" the variable; you recreate the entire data set with a new variable added. The difference may look subtle, but you have likely already destroyed an existing data set once, and you may do so again if you use the same data set name on the SET and DATA statements.
I have seen code where people were recoding values with something like:
if x<3 then x=x-1;
using the same input/output. They added a new variable and ran the code again. Added another code change. And then tried to figure out why they had negative values and zeroes they didn't expect.
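A small sketch of that pitfall, using a made-up data set WORK.DEMO with three values of x. Because the DATA and SET statements name the same data set, each rerun applies the recode to values that were already recoded.

```sas
/* WORK.DEMO and its values are hypothetical, for illustration only. */
data work.demo;
  input x;
  datalines;
1
2
5
;
run;

/* First run: input and output are the same data set. */
data work.demo;
  set work.demo;
  if x < 3 then x = x - 1;
run;
/* x is now 0, 1, 5 */

/* Second run of the identical step, e.g. after adding another variable. */
data work.demo;
  set work.demo;
  if x < 3 then x = x - 1;
run;
/* x is now -1, 0, 5: the recode was applied twice */
```

Writing the result to a differently named data set, and always reading from the untouched original, keeps reruns from compounding.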
@byeh2017, pay attention to the meaning of each line in the following skeleton code:
DATA libref.dsname; <<< defines the output dataset: library given by libref, dataset name given by dsname.
SET libref.dsname; <<< defines the input dataset: library given by libref, dataset name given by dsname.
... any SAS code needed to produce output from input ...
RUN; <<< closes the data step, which is then compiled and executed.
DATA and SET may name different datasets, differing by libref, by dsname, or by both.
If they both name the same libref and dsname, SAS will create a new dataset and replace the original.
There are also other ways to define input besides the SET statement, such as INFILE to read
an external (non-SAS) file like xxx.txt or xxx.csv, or DATALINES for data typed into the program.
Anyhow, usually you need INPUT to create OUTPUT.
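As a rough illustration of the skeleton above, the sketch below reads a few values with DATALINES and then writes the flagged result to a different data set, so the input is left untouched. The names WORK.AGES and WORK.AGES_FLAGGED and the sample values are assumptions made for this example.

```sas
/* Input created with DATALINES; names and values are hypothetical. */
data work.ages;
  input ageyr;
  datalines;
5
8
12
;
run;

/* DATA and SET name different data sets, so WORK.AGES is not replaced. */
data work.ages_flagged;
  set work.ages;
  length young $1;                 /* one-character flag */
  /* Note: a missing ageyr would also satisfy ageyr <= 8 in SAS. */
  if ageyr <= 8 then young = "Y";
  else young = "N";
run;
/* young is Y, Y, N for ageyr 5, 8, 12 */
```

If missing ages should not be flagged as young, one option is to test for them first, for example with the MISSING function, before applying the comparison.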