Solved: Re: Visual Analytics: dataset processing code

ender111 · Posted 12-15-2022 06:56 PM

I am trying to process a generic data set from SAS using a data plan in Visual Analytics to avoid excess data sets in our libraries. I know there is a data step code option in the data plan builder, and I can get a data step to work, but whenever I use a proc sql or proc sort statement it fails.

I can't imagine they would build the functionality of a data step into this interface without also allowing sorts and proc sql blocks... only using data steps will not get me far.

So am I incorrect? Is there a certain syntax for PROC SORT and PROC SQL whose trick I am just missing? I have tried brackets, no brackets, and using only the documented dataset aliases.

Stu_SAS · Posted 12-16-2022 10:20 AM

Hey @ender111! Can you post the error messages you are getting? If you are processing a dataset in a CASLIB and also writing the output to a CASLIB, PROC SORT will not work as expected (unless you are deduping). Unlike traditional SAS datasets, data in CAS does not need to be sorted before you work with it. The main reasons are (1) the architecture of CAS and (2) its parallel processing capabilities. CAS performs sorts in the background whenever you ask for BY-group processing. You can, however, store data in sorted order within CAS by partitioning it, which reduces the amount of background sorting and data transfer. This is not necessary and is usually done for performance reasons; it helps if you regularly access the data in a sorted order (e.g. transactional data by time).

To run SQL in CAS, you must use FEDSQL. PROC SQL does not run in CAS, but PROC FEDSQL does. FEDSQL is very similar to PROC SQL, but is based on the ANSI SQL:1999 standard.
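
As a rough illustration of the FEDSQL route, here is a minimal sketch. The session name mysess, the caslib mycaslib, the table sales, and the columns region and amount are all hypothetical placeholders, and the sketch assumes you have permission to start a CAS session. The first form uses PROC FEDSQL; the second submits the same query through the fedSql.execDirect action, which may be closer to what a CASL code step expects.

```sas
/* Hypothetical names throughout: mysess, mycaslib, sales, region, amount. */
cas mysess;                                /* start (or reuse) a CAS session */

/* PROC FEDSQL with SESSREF= sends the query to CAS instead of running it locally. */
proc fedsql sessref=mysess;
   create table mycaslib.sales_summary {options replace=true} as
   select region, sum(amount) as total_amount
   from mycaslib.sales
   group by region;
quit;

/* The same query submitted as a CAS action from CASL. */
proc cas;
   fedSql.execDirect /
      query="create table mycaslib.sales_summary {options replace=true} as
             select region, sum(amount) as total_amount
             from mycaslib.sales
             group by region";
quit;
```

The {options replace=true} table option overwrites the output table if it already exists; drop it if you would rather have the step fail than replace existing data.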

ender111 · Posted 12-22-2022 02:18 PM

Thanks Stu! I didn't realize FEDSQL was the only option in CASL, which is unfortunate. As for the sorting, I did the same thing in a data step and included the deduping in that step. So although the output is not sorted, the correct entries are removed, and I re-sort it on the visualization side.
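
For anyone landing here later, the data step dedupe described above might look roughly like the sketch below. This is not the exact code from my data plan: the libref mycas, the caslib mycaslib, the table transactions, and the key customer_id are hypothetical names, and an active CAS session is assumed.

```sas
/* Hypothetical names: mycas, mycaslib, transactions, customer_id. */
libname mycas cas caslib="mycaslib";       /* CAS engine libref; needs an active CAS session */

data mycas.transactions_dedup;
   set mycas.transactions;
   by customer_id;                         /* CAS groups the rows itself; no PROC SORT needed */
   if first.customer_id;                   /* keep one row per customer_id */
run;
```

With a single BY variable, which of the duplicate rows is kept depends on row order within the group, which CAS does not guarantee, so this form fits best when the duplicates are interchangeable. The output table is not stored in sorted order, which is why the sort is applied on the visualization side.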
