BookmarkSubscribeRSS Feed
David_Billa
Rhodochrosite | Level 12

Can someone of you help me to write the following First. program using proc sql? If it's not easy, I'd like to understand what we are doing in this program. First. with retain statement is bit confusing for me.

 

data FILLED_IN_CHANN; 
  set MISSING_CHANS;
  retain PT_HELD;
  by PLAN_TO descending CHANNEL;
  if missing(CHANNEL) then CHANNEL = '01';
  if first.PLAN_TO then PT_HELD = CHANNEL;
  else CHANNEL = PT_HELD;
run;
7 REPLIES 7
PaigeMiller
Diamond | Level 26

There is no equivalent of FIRST. in PROC SQL. You should do this in a DATA step.

 

RETAIN and FIRST. are separate functions, they need to be understood separately. FIRST. allows you to perform actions on the first record of a BY variable. RETAIN retains the values of variable PT_HELD and uses them in the next observation unless there is some other command that assigns a value to PT_HELD.

 

So there is only one line where a value is assigned to PT_HELD, that line is

 

  if first.PLAN_TO then PT_HELD = CHANNEL;

 

so when you are on the first record of PLAN_TO, then PT_HELD is assigned the value of CHANNEL. Otherwise, PT_HELD has the same value as on the previous record.

 

If you look at the resulting data set with your own eyes, you should see what is happening.

--
Paige Miller
ballardw
Super User

An SQL equivalent FOR THIS CASE ONLY would be something like

 

proc sql;
   create table FILLED_IN_CHANN as
   select coalescec(a.channel,b.channel,'01') as channel
         ,<here goes the SQL required verbosity of 
         listing every other variable from A that
         you want>
   from MISSING_CHANS as a
        left join
        (select distinct PLAN_TO,channel
            from MISSING_CHANS 
            where not missing(channel) ) as b
        on a.plan_to=b.plan_to
   ;
quit;

I put the CoalesceC bit first as that is the assignment of interest out of the data step. Write the select in the order you want the variables.

 

In general trying to translate code that is very order dependent, i.e. using First. and Last. and/or LAG, DIF, Retain functions does not have an easy analog in SQL because SQL seldom processes records in the order you expect and some of the functions like Lag, Dif and Retain just plain do not work in SQL.

Warning: The above code assumes that there is actually only one non-missing value for the Channel for each Plan_to. Without example data it is impossible to suggest a reliable process for the order of values of Channel as it is apparently character and the typical aggregate functions used to get the "largest" are not reliable for character values.

mkeintz
PROC Star

PROC SQL does not care about record order, but the DATA step does.   That is why there can be no equivalent to FIRST. in proc sql.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Kurt_Bremser
Super User

There's one simple answer: don't.

Sequence is the domain of the data step, while SQL does not guarantee any processing order at all. Therefore you need to create complicated SQL queries to force this, and will have, apart from ugly unmaintainable code, considerably worse performance than what you get from a data step.

Do not even think about it, especially not in the beginning of your SAS career.

Maxim 14 says to Use the Right Tool, and this is the data step here.

David_Billa
Rhodochrosite | Level 12
Got it. How the results will differ if I remove retain statement from the
code which I posted?
PaigeMiller
Diamond | Level 26

@David_Billa wrote:
Got it. How the results will differ if I remove retain statement from the
code which I posted?

You could go ahead and try it, and then you'd know the answer.

--
Paige Miller
Kurt_Bremser
Super User

See Maxim 4. Just try it.

 

Since your code is designed to carry over the greatest value to all other observations within a group, and set "01" if all values are missing, the RETAIN is essential. Without it, there won't be any carry-over.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 981 views
  • 2 likes
  • 5 in conversation