marysmith
Calcite | Level 5

Dear community,

 

I ran two data steps, each creating a different new variable. Both steps read the same input dataset and write to the same output dataset. However, the output dataset contains either the first variable or the second one, never both, and I need both variables in one dataset.

This never happened to me before. What could be the problem? Does anybody know?

Thank you so much 🙂

data patients1; 
set datensatz_1416_Erstanzeige;
format patients $20.;
if numberofpatients = . then patients="0";
if 1<= numberofpatients <= 100 then patients="1-100";
if 101<=numberofpatients<= 500 then patients="101-500";
if 501<=numberofpatients<=1000 then patients="501-1000";
if 1001<=numberofpatients<=10000 then patients="1001-10000";
if numberofpatients >10000 then patients= ">10000";
run;

data patients1; 
set datensatz_1416_Erstanzeige;
format authorisation $8.;
if NIS_Nummer in (2628,3078,6791) then authorisation="11Jan2008";
if NIS_Nummer in (2408) then authorisation="28Aug2007";
if NIS_Nummer in (6759) then authorisation="18Sep2014";
if NIS_Nummer in (2311,5361,6887) then authorisation="23Apr2007";
if NIS_Nummer in (6687) then authorisation="03Aug2009";
if NIS_Nummer in (6667) then authorisation="27May2015";
if NIS_Nummer in (6657) then authorisation="08May2014";
if NIS_Nummer in (2075, 6756) then authorisation="26Aug2013";
run;
Accepted Solution

Tom
Super User

Seems pretty obvious if you look at it. You are creating dataset B from dataset A. You then overwrite it with a new version of B created from the original dataset A. If you do want to do it in two data steps, have the second step read from the output of the first one instead of going back to the original data.
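The two-step variant described above could look roughly like the sketch below. It is only an illustration: it reuses the dataset and variable names from the question, shows just two rules per step, and uses LENGTH statements in place of the original FORMAT statements (an assumption, made so the character values are not cut off).

```sas
/* Step 1: read the original data and add PATIENTS */
data patients1;
   set datensatz_1416_Erstanzeige;
   length patients $20;
   if numberofpatients = . then patients = "0";
   else if 1 <= numberofpatients <= 100 then patients = "1-100";
   /* remaining PATIENTS rules from the question go here */
run;

/* Step 2: read the OUTPUT of step 1, so PATIENTS is carried along */
data patients1;
   set patients1;
   length authorisation $9;
   if NIS_Nummer in (2408) then authorisation = "28Aug2007";
   else if NIS_Nummer in (6759) then authorisation = "18Sep2014";
   /* remaining AUTHORISATION rules from the question go here */
run;
```

The only change that matters for the original problem is the SET statement in step 2: it names patients1 rather than the original input dataset.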

 

Your Subject line is backwards. You cannot have a data step "in" a dataset.  A dataset is the input to, and the output of, a data step.

 

Why are you attaching formats to your character variables?  SAS already knows how to print character variables and does not need to have special formatting instructions attached to them.  Perhaps you meant to use a LENGTH statement to set the variable's length before using it later in the data step?
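A minimal sketch of the LENGTH approach follows. The variable name is taken from the question; stored_length is a hypothetical helper variable used only to show the result. In the question, FORMAT authorisation $8. appears before the first assignment, so the variable most likely ends up only 8 characters long, and a 9-character value such as 11Jan2008 would be cut off.

```sas
data _null_;
   length authorisation $9;                    /* room for values such as 11Jan2008 */
   authorisation = "11Jan2008";
   stored_length = vlength(authorisation);     /* declared length of the variable */
   put authorisation= stored_length=;          /* log: authorisation=11Jan2008 stored_length=9 */
run;
```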

 

There is no reason to use two data steps.  You can calculate both new variables in the same data step.

PaigeMiller
Diamond | Level 26
data patients1; 
set datensatz_1416_Erstanzeige;
format patients $20. authorisation $9.; /* $9. so that 9-character values such as 11Jan2008 are not truncated */
if numberofpatients = . then patients="0";
if 1<= numberofpatients <= 100 then patients="1-100";
if 101<=numberofpatients<= 500 then patients="101-500";
if 501<=numberofpatients<=1000 then patients="501-1000";
if 1001<=numberofpatients<=10000 then patients="1001-10000";
if numberofpatients >10000 then patients= ">10000";

if NIS_Nummer in (2628,3078,6791) then authorisation="11Jan2008";
if NIS_Nummer in (2408) then authorisation="28Aug2007";
if NIS_Nummer in (6759) then authorisation="18Sep2014";
if NIS_Nummer in (2311,5361,6887) then authorisation="23Apr2007";
if NIS_Nummer in (6687) then authorisation="03Aug2009";
if NIS_Nummer in (6667) then authorisation="27May2015";
if NIS_Nummer in (6657) then authorisation="08May2014";
if NIS_Nummer in (2075, 6756) then authorisation="26Aug2013";
run;

Some advice ... in general this is not a good way to get groups like "1-100" for patients. Everything would work better if you used formats to group the number of patients instead of doing it as you have. In addition, if you ARE going to do it as above, use IF-THEN-ELSE instead of repeated IF-THEN.
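As a rough sketch of the IF-THEN-ELSE suggestion, the patients rules from the question could be chained as below. The conditions are unchanged; only ELSE has been added, and a LENGTH statement replaces the FORMAT statement (an assumption, not part of the original code).

```sas
data patients1;
   set datensatz_1416_Erstanzeige;
   length patients $20;
   if numberofpatients = . then patients = "0";
   else if 1    <= numberofpatients <= 100   then patients = "1-100";
   else if 101  <= numberofpatients <= 500   then patients = "101-500";
   else if 501  <= numberofpatients <= 1000  then patients = "501-1000";
   else if 1001 <= numberofpatients <= 10000 then patients = "1001-10000";
   else if numberofpatients > 10000          then patients = ">10000";
run;
```

With ELSE, SAS stops checking as soon as one condition is true instead of evaluating every IF for every row. As in the original, a value that matches none of the ranges (0, for example) leaves patients blank.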

 

Some advice part 2 ... there is usually no reason and no benefit to storing dates as character strings as you are doing, such as "26Aug2013". A character string like this cannot be compared chronologically or used in date arithmetic. Instead, you want to use SAS date values, such as

 

if NIS_Nummer in (2075, 6756) then authorisation='26Aug2013'D;

The D at the end makes this a SAS date value, and then you can compare it to other dates and do math on it. You might also want to assign a date format to authorisation.

--
Paige Miller
marysmith
Calcite | Level 5
Thanks for your quick response. As I am teaching myself SAS, I am always happy to get advice.
1. How can I use formats to group the number of patients? This was the only way I knew how to do it.
2. I formatted the date as you told me, but now when I open the table, the column with the authorisation date has random numbers in it 😞
Instead of 28Aug2007 it is now 17406. What does that mean?

Thank you 🙂

PeterClemmensen
Tourmaline | Level 20

First of all kudos for teaching yourself SAS! 🙂

 

1. Using formats is a very efficient way to group data. I would start by reading the User-Defined Format Basics section of the SAS documentation on formats, then work my way from there.
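For orientation, a user-defined format for the patient groups might look something like the sketch below. The format name patgrp is hypothetical, and the ranges simply mirror the IF conditions from the question; treat it as a starting point, not a definitive solution.

```sas
proc format;
   value patgrp
      .              = '0'
      1 - 100        = '1-100'
      101 - 500      = '101-500'
      501 - 1000     = '501-1000'
      1001 - 10000   = '1001-10000'
      10000 <- high  = '>10000';     /* above 10000, excluding 10000 itself */
run;

/* Option 1: group on the fly, without creating a new variable */
proc freq data=datensatz_1416_Erstanzeige;
   tables numberofpatients;
   format numberofpatients patgrp.;
run;

/* Option 2: store the group label in a new character variable */
data patients1;
   set datensatz_1416_Erstanzeige;
   length patients $20;
   patients = put(numberofpatients, patgrp.);
run;
```

The advantage of option 1 is that the numeric variable stays numeric, so the grouping can be changed later by editing the format alone.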

 

2. The numbers are not random, though they may seem so. 17406 is the number of days since 1 January 1960. That is how SAS dates are defined.

 

In the following code, I define two variables with the exact same value of the date constant 28aug2008. I only format one of them. They appear different in the data set, though their value is exactly the same.

 

data test;
   DateNotFormatted="28aug2008"d;
   DateFormatted="28aug2008"d;
   format DateFormatted date9.;
run;
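To check the arithmetic for the value reported in the question: 17406 is the day count for 28 August 2007, which is the date assigned for NIS_Nummer 2408 in the original code. The sketch below subtracts the two date constants and writes the result to the log; days is a hypothetical variable name.

```sas
data _null_;
   days = '28Aug2007'd - '01Jan1960'd;   /* date constants are plain day counts */
   put days=;                            /* log: days=17406 */
   put days= date9.;                     /* same number displayed as 28AUG2007 */
run;
```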

 

 

Tom
Super User

SAS stores dates as the number of days since 1/1/1960.  Date values do need special formatting instructions attached to them so that they display in a way that humans recognize.  You could use DATE9. or any of the many other formats that SAS has to display dates.  Make sure to define the variable as numeric instead of character.
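Applied to the authorisation variable from the question, that advice might look like the sketch below. Only two of the original rules are shown, and it assumes authorisation does not already exist as a character variable in the input dataset.

```sas
data patients1;
   set datensatz_1416_Erstanzeige;
   /* no $ format and no character LENGTH: the variable is numeric */
   if NIS_Nummer in (2408) then authorisation = '28Aug2007'd;
   else if NIS_Nummer in (6759) then authorisation = '18Sep2014'd;
   /* remaining rules from the question go here, each with a date constant */
   format authorisation date9.;   /* display as 28AUG2007 instead of 17406 */
run;
```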

 
