help needed in understanding a sas command , that looks simple

Occasional Contributor
Posts: 9



Can someone please help me fully understand the SAS code below?

 

data dups ;
set Work.pbs ;
by q1_code ;
if first.q1_code + last.q1_code < 2 then output ;
proc print data=dups ;
run ;

 

It looks like a new data set "dups" is being created from the "Work.pbs" data set.

But I do not understand at all what happens next (by q1_code ; if first.q1_code + last.q1_code < 2 then output ;). It also looks like there is a problem with the code: the new data set "dups" is created, but with 0 observations.

 

Thanks very much in advance for your time and help.

Respected Advisor
Posts: 4,927

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

The only case where first.q1_code + last.q1_code = 2 is when a q1_code group contains a single record. So this data step will eliminate q1_code groups with a single observation and keep the others in the new dups dataset.
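
To see the effect on something small, here is a minimal sketch. The dataset toy, its values, and the variable visit are made up for illustration; only q1_code and the IF condition come from the code in the question. BY-group processing expects the input to be sorted by q1_code, hence the PROC SORT.

```sas
/* Hypothetical toy data: codes 101 and 103 repeat, 102 and 104 do not */
data toy;
  input q1_code $ visit;
  datalines;
101 1
101 2
102 1
103 1
103 2
103 3
104 1
;
run;

/* The BY statement needs the data sorted (or indexed) by q1_code */
proc sort data=toy;
  by q1_code;
run;

data dups;
  set toy;
  by q1_code;
  /* The sum is 2 only when a row is both first and last in its group */
  if first.q1_code + last.q1_code < 2 then output;
run;

proc print data=dups;
run;
```

With this toy input, dups should keep the five rows for codes 101 and 103 and drop the single rows for 102 and 104.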

PG
Occasional Contributor
Posts: 9

Re: help needed in understanding a sas command , that looks simple

Dear PG,

 

Thanks so much for your kind and speedy reply. It is really helpful; I am now able to understand what they were trying to do here.

 

I have one more concern regarding the same code. When I run it

(data dups ; set pbs ; by q1_code ; if first.q1_code + last.q1_code < 2 then output ; proc print data=dups ; run ;), the new data set (dups) comes out with 8 variables and zero observations, which should not happen.

Also, the code tests first.q1_code + last.q1_code < 2, not first.q1_code + last.q1_code = 2.

Could you kindly add any further thoughts?

 

Respected Advisor
Posts: 4,927

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

first.q1_code + last.q1_code = 0 when observation is not the first or the last

first.q1_code + last.q1_code = 1 when observation is the first but not the last OR is the last but not the first

first.q1_code + last.q1_code = 2 when observation is the first and the last

 

those are the only possible values for first.q1_code + last.q1_code.
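
One way to see which of these values actually occur in a dataset is to copy the sum into a regular variable and tabulate it. This is only a sketch: the dataset sample, its values, and the variable name flag_sum are assumptions for illustration. FIRST. and LAST. variables are temporary and are not written to the output dataset, which is why the sum is stored in flag_sum.

```sas
/* Hypothetical sample data, already in q1_code order */
data sample;
  input q1_code $;
  datalines;
a
a
a
b
c
c
;
run;

data flags;
  set sample;
  by q1_code;
  /* Keep the sum in a permanent variable so it survives the step */
  flag_sum = first.q1_code + last.q1_code;
run;

/* Tabulate how often each value (0, 1 or 2) occurs */
proc freq data=flags;
  tables flag_sum;
run;
```

If flag_sum turns out to be 2 on every row, every q1_code group holds a single record, and a condition of < 2 will output nothing.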

If your dataset work.pbs does contain q1_code groups with more than one observation and you still get an empty dataset, you should check the SAS log.

 

In fact, you should always check the SAS log :)

PG
Super User
Posts: 7,832

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

Supply example data that illustrates your problem. Do so in a data step.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Trusted Advisor
Posts: 1,579

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

Let us suppose your input contains these 3 lines:

   Q1_code

      a     - on this line: first.q1_code=1      last.q1_code=0

      a     - on this line: first.q1_code=0      last.q1_code=0

      a     - on this line: first.q1_code=0      last.q1_code=1

If a Q1_CODE value occurs on only one line, then:

     a

     b       - on this line: first.q1_code=1      last.q1_code=1

     

I hope this will help you understand the case.
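
If you would like to check such flags yourself, a small sketch along these lines writes them to the log for each row. The dataset name example and the variables is_first and is_last are made up for illustration; the data is typed in q1_code order, so no sort is needed here.

```sas
/* Hypothetical data: three rows for a, one row for b */
data example;
  input q1_code $;
  datalines;
a
a
a
b
;
run;

data _null_;
  set example;
  by q1_code;
  /* Copy the temporary flags into ordinary variables for printing */
  is_first = first.q1_code;
  is_last  = last.q1_code;
  /* Write one line per observation to the SAS log */
  put q1_code= is_first= is_last=;
run;
```

In the log, is_first and is_last should both be 1 only on the b line, because b is the only group with a single record.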

Occasional Contributor
Posts: 9

Re: help needed in understanding a sas command , that looks simple

Dear Advisors,

 

I have run the same SAS code several times with slight modifications to the IF condition, and below are the SAS log notes I get for each version. In three of the four runs I get 0 observations in the new data set "dups". The only exception is (if first.q1_code + last.q1_code = 2 then output ;), where the new data set (dups) contains exactly the same number of observations as "work.pbs". But the program that was handed over to me uses (if first.q1_code + last.q1_code < 2 then output ;), and the whole analysis done in the past by someone else is based on that. When I rerun the program file to replicate the results, dups comes up with 0 observations. Can someone please interpret this for me?

 

data dups ;
set work.pbs ;
by q1_code ;
if first.q1_code + last.q1_code < 2 then output ;

 

NOTE: There were 3476 observations read from the data set WORK.PBS.
NOTE: The data set WORK.DUPS has 0 observations and 8 variables.
NOTE: DATA statement used (Total process time):

 

data dups ;
set work.pbs ;
by q1_code ;
if first.q1_code + last.q1_code = 2 then output ;

NOTE: There were 3476 observations read from the data set WORK.PBS.
NOTE: The data set WORK.DUPS has 3476 observations and 8 variables.
NOTE: DATA statement used (Total process time):

 


data dups ;
set work.pbs ;
by q1_code ;
if first.q1_code + last.q1_code = 1 then output ;

NOTE: There were 3476 observations read from the data set WORK.PBS.
NOTE: The data set WORK.DUPS has 0 observations and 8 variables.
NOTE: DATA statement used (Total process time):

 

 data dups ;
set work.pbs ;
by q1_code ;
if first.q1_code + last.q1_code = 3 then output ;

 

NOTE: There were 3476 observations read from the data set WORK.PBS.
NOTE: The data set WORK.DUPS has 0 observations and 8 variables.
NOTE: DATA statement used (Total process time):

Super User
Posts: 7,977

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

A better way of writing it may be:

data dups ;
  set work.pbs;
  by q1_code;
  if first.q1_code and last.q1_code then delete;
run;

This removes records where the observation is both first and last, i.e. where the group has only one record.

Super User
Posts: 7,832

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

Your dataset contains only one observation per q1_code. That's it. Since first. and last. are always true (=1), the sum of both is always 2.

Respected Advisor
Posts: 4,927

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

Your second step shows that all q1_code groups contain single records. They were all first and last in their group.
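
If you want to confirm this without FIRST./LAST. logic, a count per q1_code works too. This is a sketch under assumptions: the dataset codes and its values are invented for illustration, and you would point the query at work.pbs to check the real data. PROC SQL does not need sorted input.

```sas
/* Hypothetical data: only b appears more than once */
data codes;
  input q1_code $;
  datalines;
a
b
b
c
;
run;

/* List every q1_code value that occurs on more than one record */
proc sql;
  select q1_code, count(*) as n_records
  from codes
  group by q1_code
  having count(*) > 1;
quit;
```

For this toy data the query lists only b, with 2 records. An empty result would mean that no q1_code value appears more than once.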

PG
Occasional Contributor
Posts: 9

Re: help needed in understanding a sas command , that looks simple

Thanks so much PGStats,

 

So that simply means there are no duplicate q1_code records?

Super User
Posts: 7,832

Re: help needed in understanding a sas command , that looks simple

Posted in reply to healtheconomist

healtheconomist wrote:

Thanks so much PGStats,

 

So that simply means there are no duplicate q1_code records?


Aah, yes?
