Automating Macro Documentation | Debugging | define.XML | Deleting Macro Vars | Design and Use of Metadata
Dictionary Tables | Efficient Programming | "End to End" with Metadata | Functions | Macro Best Practices
Macro Debugging | Macro Design | Measuring Efficiency | Metadata 101 | Process Flow Diagram
Professional Etiquette | Program Comprehension | Programming Style | SDTM Implementation | Simple Utilities
Simplicity via Obscurity | Status Reports | System Design | Utility Primer | Variable Cross-Ref
%whatChanged

The first question from the audience after I finished my first paper was “why bother?” I didn’t take this as a good sign, but persisted nonetheless in my writing. Nearly 30 years later, I’ve presented to groups around the world and, most important, no longer hear the “why bother” question.

The papers described below are some of my favorites and some of my more recent efforts. The narrative is more informal than an abstract, and often gives some of the background for why I felt the topic was worthwhile. Papers that also have a more formal abstract (text surrounded by a green box) are my "road show" presentations, those that I often present at local, regional, and other user group meetings.

This is a partial list of my papers. For a complete list, or to discuss presentation of one or more of these papers at your user group or company, contact me.

%whatChanged: A Tool for the Well-Behaved Macro    
View PDF (85k)    Presentation History    Return to Paper Index

A reliable macro must, among many other attributes, leave no unintended files, reset options, and the like behind once it terminates. %whatChanged describes the motivation, logic, coding, and use of a macro that reports datasets and global macro variables present before and after a macro executes. The report helps the macro developer identify any unexpected (and, therefore, potentially harmful and disruptive) "leavings". While the example program in the paper focuses on comparing pre/post datasets and macro variables, it could be readily extended to other items of interest such as titles and footnotes, catalog entries, and the like.
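
The before/after comparison at the heart of this idea is easy to picture outside SAS. What follows is a minimal, language-neutral sketch in Python, not the macro from the paper: the function name `what_changed` and the sample dataset names are hypothetical, and the snapshots are assumed to be plain collections of names captured by the caller.

```python
def what_changed(before, after):
    """Compare two snapshots (collections of names) taken before and after a run.

    Returns a dict listing the names that appeared and the names that vanished.
    """
    before, after = set(before), set(after)
    return {
        "added": sorted(after - before),
        "removed": sorted(before - after),
    }


# Hypothetical snapshots of dataset names taken around a macro call.
datasets_before = {"WORK.DEMOG", "WORK.VISITS"}
datasets_after = {"WORK.DEMOG", "WORK.VISITS", "WORK._TEMP1"}

report = what_changed(datasets_before, datasets_after)
print(report)
```

Anything listed under "added" that the macro was not supposed to create is a candidate "leaving". The same comparison could be applied to any other collection of names, such as macro variables.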

The Macro Debugging Primer    
View PDF (152k)    Presentation History    Return to Paper Index

The SAS online help file contains this wonderful sentence:

Because the macro facility is such a powerful tool, it is also complex, and debugging large macro applications can be extremely time-consuming and frustrating.

How could you not want a chance to use this somewhere, even if it means writing a long paper?! Here's the abstract, reproduced from the NESUG 2009 conference program:

You built the macro and now it's broken. What do you do? Using the "usual suspects" – SYMBOLGEN, MPRINT, and %PUT statements – often helps. More frequently, however, discovering the root cause of a macro's ills requires use of these features and an array of home-grown tools.

This paper provides an overview of tools and techniques for debugging macros. It first identifies options and statements that are part of Base SAS. It then illustrates a series of powerful and easily-implemented debugging methods applicable to macros of any size or complexity. Finally, the paper moves outside the macro itself and describes several useful debugging utilities.

While the paper is not intended to be a design tutorial, it does identify some coding techniques that will help preempt errors. The paper is also mindful that "we can't solve problems by using the same kind of thinking we used when we created them." That is, it touches on some of the differences that should exist between the mindsets of the initial and debugging programmers.

The paper is appropriate for anyone charged with developing, debugging, or enhancing macros. A basic knowledge of the macro language is assumed.

define.XML: Not Easy, and Not Just SAS    
View PDF (2,113k)    Presentation History    Return to Paper Index

Most pharma programmers are familiar with define.pdf, which describes raw and analysis data sets submitted to the FDA. Data submitted in the CDISC SDTM format are described by define.XML. This paper describes the XML "family" of technologies (XML, XSD, and XSL) and how they are used with SAS to create define.XML. As the title says, it's not easy, and it's not just SAS.

Metadata 101: A Beginner's Guide to Table-Driven Applications Programming    
View PDF (123k)    PPT slides (860k)    Presentation History    Return to Paper Index

This paper was designed for readers who have heard a lot of industry buzz about metadata but who are still not entirely sure exactly why they should be excited about it. The paper describes what metadata is, then presents several examples of typical pharmaceutical programming tasks, showing how efficiency and reliability are improved using metadata-driven strategies. This paper and the next two that are listed on this page are a good introduction to an exciting, ever-changing and ever-challenging topic. Another paper, to be presented at SAS Global Forum in 2008, will look at the software management challenges that emerge over time as metadata usage goes from "infancy" to "adolescence."

Abstract

Any programmer dealing with frequent changes to program specifications is someone who has to cope well with frustrating, time-consuming, and error-prone challenges. Changes to report headers and footers, dataset contents, and other aspects of client deliverables have to be communicated effectively and implemented correctly.

Metadata and metadata-driven utilities are effective tools to reduce or entirely eliminate many of the programming and management problems inherent in traditional project work flows. This paper discusses the nature of metadata and some of its design criteria, and notes the need for the all-important applications that make it usable throughout the project life cycle. It also presents a simple case history, showing traditional, "before" code and work flow, followed by revised, metadata-driven coding. The reader should come away with an appreciation of the power of metadata-driven applications and, hopefully, ideas of how these techniques can be implemented in his/her workplace.
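
To make "metadata-driven" concrete, here is a small sketch of the general idea in Python rather than SAS. It is not taken from the paper: the metadata layout (`name`, `label`, `order`) and the function names are assumptions chosen for illustration. The point is that a change to a report header becomes an edit to a table row, not an edit to program code.

```python
# Hypothetical metadata table: one row per report column.
COLUMN_METADATA = [
    {"name": "subjid", "label": "Subject ID", "order": 1},
    {"name": "visit", "label": "Visit", "order": 3},
    {"name": "trt", "label": "Treatment", "order": 2},
]


def build_header(metadata):
    """Return the column labels in display order, driven entirely by the metadata."""
    rows = sorted(metadata, key=lambda row: row["order"])
    return " | ".join(row["label"] for row in rows)


def select_columns(record, metadata):
    """Pull values out of one data record in the order the metadata dictates."""
    rows = sorted(metadata, key=lambda row: row["order"])
    return [record[row["name"]] for row in rows]


header = build_header(COLUMN_METADATA)
print(header)
print(select_columns({"subjid": "001", "visit": "Week 1", "trt": "A"}, COLUMN_METADATA))
```

Relabeling or reordering a column here means changing one metadata row; neither function needs to be touched.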



From CRF Data to Define.XML: Going "End to End" with Metadata    
   Co-authored with Jeff Abolafia
View PDF (1,079k)    Presentation History    Return to Paper Index

This is the follow-up to "The Design and Use of Metadata: Part Fine Art, Part Black Art" (described below). The earlier paper was more theoretical, presenting the motivation for and benefits of creating metadata-driven applications in fairly abstract terms. This paper uses the theory as a starting point and discusses the use of metadata and metadata access tools throughout the life cycle of CDISC SDTM domain datasets. It describes the metadata we use at Rho, Inc., and presents programs that use the metadata, along with Log and other excerpts. Bottom line: a year after the first paper, we believe even more strongly in the power and elegance of metadata-driven applications.

Abstract

Traditionally, specifications for datasets and data displays were stored in Word or other document formats. This was great for communicating instructions to programmers, but not so great for developing robust and dynamic applications. By moving these specifications from documents into programmatically-accessible metadata, applications can be made more efficient and dynamic. Further, the metadata can be utilized for many other purposes.

This paper discusses the rationale for moving specifications from documents to metadata. It outlines the structure and content of the metadata used for a study. The paper then presents a case study that demonstrates how metadata is used to manage a clinical trials project from end to end. It shows how metadata is utilized from the time of study setup all the way through producing submission databases and associated define files. Throughout the paper, we show how well-designed and easily accessible metadata improves work flow and project management throughout the life cycle of a project.



The Design and Use of Metadata: Part Fine Art, Part Black Art    
   Co-authored with Jeff Abolafia
View PDF (462k)    Presentation History    Return to Paper Index

This paper describes how one of my clients (Rho, Inc.) has successfully moved data and table specifications out of Word documents and into metadata tables. The tables, along with a suite of macros to access them, have provided significant improvements in time to delivery and quality of output. The paper focuses on concepts rather than code, and if nothing else is notable for the first occurrence of the word "SAS" being all the way down on page 5. The paper will be presented at SUGI 2006 and PharmaSUG 2006.

Abstract

The complexity of even small pharmaceutical projects can be daunting. Consider the deliverables: patient profiles, listings, domain and analysis data sets, Define files, tables, and figures. Even in a single study, these routinely total hundreds of files. For NDA submissions, these are but a single piece of a larger “puzzle.”

Consider as well the documentation and human resources pushing the study through its life cycle. Project managers need to monitor the completion status of the files. Statisticians and analysts have to identify data requirements and lay out “dummy” displays. Programmers have to write the programs to create the data and reports using specifications that are often, to be kind, “fluid.” Creation of high-quality output requires coordination of effort and clear and immediate communication of results. Rho has migrated much of the requisite project management and data and display specifications to carefully designed and utilized metadata. By moving items that describe data sets and displays from documents and low-level programs into data sets, we have realized significant gains in productivity and quality of output.

This paper describes the current use of metadata at Rho. It:
o Discusses the motivation for using metadata
o Describes the metadata architecture
o Identifies tools that access the tables
o Presents examples, comparing metadata and non metadata-driven programs

The paper is largely conceptual and nearly code-free. While we emphasize application development in the pharmaceutical industry, we feel the underlying concepts regarding metadata design and implementation are valid across industries.



Building the Better Macro: Best Practices for the Design of Reliable, Effective Tools    
View PDF (141k)    Presentation History    Return to Paper Index

If you've used the macro language for a while it's likely you've developed some opinions about how to and how not to write macros. This paper is one that I've wanted to write for a long time. It summarizes most of the design principles that I've used over the years, and presents examples of "good" and "bad" macro coding. Turns out it's an enjoyable paper to present, in part because you're either preaching to the choir or talking to people whose opinions are "not in agreement" with yours.

Abstract

The SAS® macro language has power and flexibility. When badly implemented, however, it demonstrates a chaos-inducing capacity unrivalled by other components of the SAS System. It can generate or supplement code for practically any type of SAS application, and is an essential part of the serious programmer's tool box. Collections of macro applications and utilities can prove invaluable to an organization wanting to routinize work flow and quickly react to new programming challenges. But the language's flexibility is also one of its implementation hazards. The syntax, while sometimes rather baroque, is reasonably straightforward and imposes relatively few spacing, documentation, and similar requirements on the programmer. In the absence of many rules imposed by the language, the result is often awkward and ineffective coding. Some amount of self-imposed structure must be used during the program design process, particularly when writing systems of interconnected applications. This paper presents a collection of macro design guidelines and coding best practices. It is written primarily for programmers who create systems of macro-based applications and utilities, but will also be useful to programmers just starting to become familiar with the language.



Controlling Macro Output or, "What Happens in the Macro, Stays in the Macro"    
View PDF (203k)    Presentation History    Return to Paper Index

When you build complex applications, you'll start using tools to perform routine tasks. Think along the lines of counting the number of observations in a data set, making a list of data sets in a library, etc. The benefits of using these utilities are undeniable, as are, sadly, the instances of "leakage" of unwanted items from the utilities into the calling program's environment. Think here along the lines of options being reset, temporary data sets not deleted by the macro, and so on. This paper discusses some simple techniques for ensuring that a utility macro creates only the items (data sets, macro variables, etc.) that it was supposed to.

SDTM Implementation: The CRO Perspective    
   Co-authored with Jeff Abolafia, Laura Brewington, and Brooke Millman
View PDF (595k)    Presentation History    Return to Paper Index

Love them, hate them, or simply tolerate them: it's a fact of pharma life that SDTM and other standards initiatives are going to become a larger and more significant part of NDA submissions. This presentation summarizes the CDISC SDTM pilot studies undertaken by one of my clients. We emphasize how the CRO's perspective on this is necessarily different from that of a pharmaceutical company. The focus is on approach and strategy, rather than actual code. Even a cursory glance at the paper reveals that it is of a piece with the metadata paper discussed above. Repeat after me: "it's all about metadata, it's all about metadata, ..."

Welcome to the Barnyard, or Thoughts on Etiquette in Professional Venues    
View PDF (151k)    Presentation History    Return to Paper Index

This is my coming out party as a curmudgeon (ok, a published curmudgeon). Over the last couple of years, and especially in 2005, I've noticed what could only kindly be referred to as "slippage" in the level of professional etiquette. I limited my comments to conferences and email, both private and on list servers. The overall tone is positive, which is rather remarkable when I reflect on how unbelievably torqued I was when I witnessed the hammerheaded behavior addressed by the paper!


Benefits of Standards Implementation, or Seeing Light at the End of the Documentation Tunnel
Download ZIP File (321k)    Presentation History    Return to Paper Index

We start with two premises. First, programmers don't like to write documentation. Second, programmers rely on documentation. This article says, in essence, "writing standardized macro headers can be tedious, but once it's done you can use the headers as input to a document generator." The article shows how %MACRODOC reads a macro library whose members use a standard program header, then reformats the text, creating HTML that presents all headers in a single, easily-navigated format.

The ZIP file contains the article, the macro source, and a demo library to use as %MACRODOC input. The macro requires SAS Version 9 and was written to run in Windows XP. Converting it for use in other OS's and SAS versions is, as they say, "the pursuit of the motivated reader."
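
The "standard header in, HTML out" idea can be sketched in a few lines. The Python below is an illustration only and is not %MACRODOC: the header convention it assumes (a leading `/* ... */` comment containing `Key: value` lines), the function names, and the demo library are all hypothetical.

```python
import html
import re


def parse_header(source):
    """Extract 'Key: value' pairs from the first /* ... */ comment block."""
    match = re.search(r"/\*(.*?)\*/", source, flags=re.DOTALL)
    if match is None:
        return {}
    fields = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields


def headers_to_html(library):
    """Render every member's header as an HTML section, sorted by member name."""
    parts = []
    for member in sorted(library):
        fields = parse_header(library[member])
        parts.append(f"<h2>{html.escape(member)}</h2>")
        parts.append("<dl>")
        for key, value in fields.items():
            parts.append(f"<dt>{html.escape(key)}</dt><dd>{html.escape(value)}</dd>")
        parts.append("</dl>")
    return "\n".join(parts)


# Hypothetical one-member macro library, keyed by member name.
demo_library = {
    "quotelist": "/*\nName: QuoteList\nPurpose: Quote tokens in a list\n*/\n%macro quotelist; %mend;",
}
page = headers_to_html(demo_library)
print(page)
```

The payoff of a strict header standard shows up in `parse_header`: because every member follows the same layout, one small parser serves the whole library.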


Meta Data + Files = Automated Status Reports
   Co-authored with George DeMuth and Michael DeSpirito
   
View PDF (331k)    Presentation History    Return to Paper Index

This paper is living proof of necessity being the mother of invention. The setting was typical of clinical trials reporting – many tables and listings, organized in similar, predictable directory structures. The client wanted to be informed on a nearly ongoing basis as to which tables were started, validated, and so on. Rather than continually update a manual report, we automated the task (and gained a lot of useful HTML insights in the process!)

Using metadata already in place for other reporting functions, we created a Web page that neatly summarized information about each table and gave the user at a glance an idea of the project’s status. An added benefit was creation of links to the various pieces – the user would not only see that Table “x” was created, but could also view the document via hyperlinks created by the HTML generating program.

The paper discusses the application’s setting, the structure of the metadata, and the generalized programming required, and identifies some of the non-obvious aspects of rendering HTML with SAS.
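
As a rough picture of how metadata plus a file inventory can yield a linked status page, consider the Python sketch below. It is not the paper's program: the metadata fields (`title`, `file`), the two status values, and the file names are simplifying assumptions, and the set of existing files is passed in by the caller instead of being read from a directory.

```python
import html


def status_table(metadata, existing_files):
    """Build an HTML table with one row per output listed in the metadata.

    Outputs whose file exists are hyperlinked; the rest are shown as not started.
    """
    lines = ["<table>", "<tr><th>Output</th><th>Status</th></tr>"]
    for row in metadata:
        title = html.escape(row["title"])
        if row["file"] in existing_files:
            href = html.escape(row["file"], quote=True)
            name_cell = f'<a href="{href}">{title}</a>'
            status = "created"
        else:
            name_cell = title
            status = "not started"
        lines.append(f"<tr><td>{name_cell}</td><td>{status}</td></tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical metadata and file inventory.
table_metadata = [
    {"title": "Table 1", "file": "t01.rtf"},
    {"title": "Table 2", "file": "t02.rtf"},
]
report = status_table(table_metadata, existing_files={"t01.rtf"})
print(report)
```

A fuller version would carry more states (started, validated, and so on), but the structure stays the same: the metadata says what should exist, and the file system says what does.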


The SAS Debugging Primer
   
View PDF (416k)    Presentation History    Return to Paper Index

This has for years been one of my favorite papers to present. Rather than take the traditional approach to debugging – “if you see ‘x’ happen, it could be caused by ‘a’, ‘b’, or ‘c’” – I took a somewhat different tack. The emphasis here is split between behavioral matters and those of syntax. The first section encourages the reader to understand his/her programming behavior, turning this insight into a debugging aid. Other analytical issues are also addressed – identifying the source of the bug, the sequence of fixes to apply, considering alternative coding methods, knowing when to walk away from the problem, and so on.

The second section of the paper enumerates some of the tools available both within and outside of SAS to assist the debugging process. System options, the SAS Log, macros, the DATA step, PROCs, and various “home grown” tools are identified. Emphasis is on describing the tools for debugging.

Abstract

Meet an accomplished SAS programmer and you meet someone who's probably learned by making (and fixing) lots of mistakes along the way. The breadth of the SAS System's target applications, the variety of its "dialects" (Base SAS, macro, SCL, IML, SQL), and the quirky procedural/non-procedural environmental mix conspire to make mastery of the SAS System a slippery slope to ascend. Debugging is the art of gracefully recovering and learning from falls during the ascent.

This paper discusses techniques for debugging SAS programs. Its purpose is two-fold. First, it provides behavioral and technical tips for fixing code (how to read error messages in the SAS Log, knowing when there is a problem with the program even if SAS says there isn't, using the DATA step debugger, identifying system options, using PROCs for data validation, using macro variables to control debugging output, etc.). The second focus of the paper is its presentation of design and coding methods that make the programming process more reliable, thus reducing the need for debugging in the first place.

The paper's target audience is relative newcomers to the SAS System. More seasoned users may find or rediscover some of the techniques and features being discussed. Emphasis is placed on Base SAS and the macro language, although the techniques themselves are applicable to SCL and other products.




Rules for Tools - The SAS Utility Primer
   
View PDF (610k)    Presentation History    Return to Paper Index

In recent years, I've discovered that my greatest strength in SAS programming lies in tool-building. It's fun to identify recurring needs (or even a one-time event that seems like it'll be recurring), tease out the commonality and patterns, and then design, build, and document a product that will save programming effort for a client.

The Utility Primer is a best practices paper. It is also one that you can only write after years of continually identifying "non-best" practices, knowing why they are not ideal, and gradually refining these thoughts into a strategy that results in solid, reliable tools. The paper is a condensed version of a section of a one-day course I offer in utility design.

Abstract

Let's start with the premise that good programmers are lazy by nature. They want to use tools such as formats and ODS for execution-time efficiency or to pretty-up their output, functions to perform calculations, and so on. Another hallmark of a good programmer is a keen eye for pattern recognition. Rather than rewrite basically the same program over and over, they identify similarities and parameterize the program, making it into a general-purpose program, a "utility."

This paper steps through the life cycle of a simple utility. It starts with "naïve" code that doesn't exploit program similarities, then illustrates how a general-purpose utility may be developed. It ends with the initial program becoming a call to a simple, powerful routine in a macro library. The transition from simple, brute-force programming into a compact, general-purpose utility isn't a random event. The last sections of the paper present a set of design principles for utilities.

Although we focus on Base SAS in Version 9.0, the principles and techniques are readily extended across SAS versions and products. The reader will come away from this paper with an appreciation of both the process and the tool set required to build generalized programs.




Efficient SAS Coding Techniques: Measuring the Effectiveness of Intuitive Approaches
   
View PDF (1,463k)    Presentation History    Return to Paper Index

I worked in New Zealand in the early 1980’s, building a financial reporting system for the Bank of New Zealand. We had millions of transactions to summarize, and had to do it on a mainframe that could best be described as “teeny tiny.” Things that we take for granted today – practically unlimited disk space and memory – were scarce back then. Every variable length had to be considered carefully. Every pass through the data had to be rationalized.

This paper takes an empirical approach to efficiency. It focuses on conservation of the machine resource (more on this later, in the next paper), and measures the impact of the technique compared to a program using standard, “naïve” solutions. To my knowledge, it was the first SAS paper that ever did this. It was presented at the SAS Users of New Zealand 1984 conference.


The How and When of Efficient Programming
   
View PDF (289k)    Presentation History    Return to Paper Index

This paper was written nearly 20 years after Efficient SAS Coding Techniques. I certainly didn’t give up on the idea of efficiency being important, but I did start to place a higher value on my own time. The result is a discussion of effective coding techniques, where I recognize the importance of conserving the programmer resource as well as the hardware usage. The idea here is for the reader to recognize that a hard-core, intricate programming solution to a problem may cost more in the scheme of things than a simpler, straightforward solution. This will happen if the cost of the programmer’s time to debug or modify the code becomes prohibitive.

It’s all relative, of course, and even without the effective-efficient ideology, which can become near-religious in some groups, the paper is a good treatment of how to reduce the executable load that a programmer places on SAS.

Abstract

It's relatively easy to write programs that optimize the use of CPU and other machine resources. There is a large and continually growing body of literature on the subject. What isn't as straightforward is knowing when to employ the techniques - blind implementation of tuning techniques is often not required by the task at hand and can sometimes even be counterproductive.

This paper addresses both the "how to" and "when to" aspects of writing efficient programs. It describes design and coding techniques that conserve hardware resource usage. It also identifies other, non-machine implications of their usage that could dissuade the programmer from their use. For example, using temporary array elements is more efficient than using named elements but has the documented-but-obscure behavior of retaining values across observations. Maintenance of such code by other than "seasoned" and up-to-date programmers can be unexpectedly problematic.

The concept of efficiency used in the paper includes all aspects of the program life cycle. We apply the "how and when" question to system design issues, system startup, DATA steps, procedures, and macros. Emphasis is on Base SAS software. The reader should finish the paper comfortable with the idea that the "best" program is not always the one that minimizes hardware resources.




The Design and Coding of Simple Utility Macros
   
View PDF (134k)    Presentation History    Return to Paper Index

A SAS programmer with even a modest amount of experience will eventually have one or both of the following internal dialogues. "Why didn't they provide a [bizarre and somewhat convoluted operation] function in Version 9? Maybe I should have sent it in as a SASware ballot item …" Here's another perennial favorite: "Why on earth does %put _all_ display variables in such a peculiar and unreadable format?" Both are good questions, and in both cases it's probably better to create your own solution than search for an answer.

This article originally appeared in the Spring 2004 SESUG Informant. It describes two utility macros that address some simple needs. The first, QuoteList, takes a macro variable with one or more unquoted tokens and returns a variable with the tokens quoted and optionally upper-cased. The second macro, AllMacVars, prints the beginning of each global macro variable, listing them in alphabetical order. The intent of the article was twofold. First, the code is intended to be useful in and of itself. Second, as the code is explained, we demonstrate some underlying good programming practices.
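
The behavior of the two utilities, as described above, can be mimicked in a few lines of Python. This is only a sketch of the described behavior, not the article's macro code: the function names, the parameters (`upper`, `delimiter`, `width`), and their defaults are assumptions.

```python
def quote_list(tokens, upper=False, delimiter=" "):
    """Return the whitespace-separated tokens, each wrapped in double quotes."""
    words = tokens.split()
    if upper:
        words = [word.upper() for word in words]
    return delimiter.join(f'"{word}"' for word in words)


def list_variables(variables, width=20):
    """Return 'name = value' lines in alphabetical order, truncating long values."""
    return [f"{name} = {variables[name][:width]}" for name in sorted(variables)]


quoted = quote_list("ae dm vs", upper=True)
print(quoted)

for line in list_variables({"study": "ABC-123", "path": "x" * 40}, width=10):
    print(line)
```

Both functions leave their inputs untouched and return new values, which fits the "well-behaved utility" theme that runs through several of the papers on this page.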


Understanding and Using Functions
   
View PDF (280k)    Presentation History    Return to Paper Index

This was presented as part of the Southern SAS Users 2001 conference, in the “Intro to SAS” section. Since it was one of nearly a dozen papers with a common theme, the paper is devoid of context. Still, it stands on its own pretty well, and can be used as a broad-ranging, basic introduction to using functions.


The Standalone Program Grows Up: Strategies for System Design
   
View PDF (145k)    Presentation History    Return to Paper Index

Most of us, if we're lucky, get our SAS "feet" wet by writing programs that are self-contained. That is, the program doesn't have to run before or after other programs, and it references the outside world only to access data and use autocall macros. Eventually, however, the demands of an application require making the transition to a system of programs. This is where things get interesting.

This paper, first presented in 1994, uses a small but realistic case study to illustrate the transition from a single program to a robust, and more complicated, system. It describes tools and coding conventions that make the transition painless and outlines ways to package the system so that users can't inadvertently alter the programs. Although the paper is a decade old, it holds up well - I drew from it and other papers mentioned on this page for my SAS Utilities course (to be offered at SESUG 2004).


Using the Process Flow Diagram Object to Communicate Information
   
View PDF (178k)    Presentation History    Return to Paper Index

This is one of my infrequent forays into the Applications sections of the conference circuit. The client needed a visual and interactive interface to performance and quality control data on machines in its manufacturing facility. The paper gives some background to the problem and discusses features that made it simple (the number of machines being monitored was constant) or difficult (the data had to be presented in varying time intervals, and color-coded).

The “non-obvious” solution, but one that was perfectly legitimate, was using the Process Flow Diagram in Version 6.10 Screen Control Language. What the program amounted to was, in effect, carefully defining a series of rectangles in the object. Each rectangle represented a time interval, and was hyperlinked to a display of data from several production tables.


Program Comprehension: A Strategy for the Bewildered
   
View PDF (250k)    Presentation History    Return to Paper Index

I’ll bet we’ve all been in this situation at one time or another in our professional lives: you start a new job or project and are dumped into the middle of a mass (swamp?) of data and programs, then told to create new report X “that looks almost like report Y.” There are no formal specifications, the flow of program execution for Report Y is not really obvious, and pretty soon you feel overwhelmed.

What to do? How do you begin to understand the program, the data, and the system in which they are embedded? This paper leads the reader down at least part of the Road to Comprehension. It identifies resources to look for, encourages examination of learning behavior, and lists aspects of the work and programming environments that affect the way information is transmitted to the programmer. The last section of the paper describes different types of programming activity – debugging, maintenance, and enhancements – and shows how what you’re doing will affect how you acquire the information you need to be effective.

Abstract

Here is a not uncommon scenario in many workplaces. A neophyte SAS programmer is assigned to maintain, debug, or enhance an application. The atmosphere is sink or swim, the system is complex, the code is sophisticated, the documentation is scant, and the programmer is bewildered. Questions slowly take shape. "What, exactly, am I supposed to do?" "What part(s) of the application need my attention?" "Will a change to program X affect program Y?" And, most critically, "where do I start?"

What the poor programmer needs is a strategy for comprehending the program, then finding the "sweet spots" in the code as efficiently as possible. This paper presents a generalized approach for programmers, particularly SAS "newbies", to develop an understanding of how applications work. It also shows how to translate this comprehension into effective coding. The paper identifies and discusses the rationale for questions the programmer should ask about: task definition, program-level code, supporting code, system design and specification documents, and required domain knowledge.

Beginning SAS programmers should come away with a better understanding of how to correctly frame the programming problem and effectively gather the resources needed to obtain a solution. They will also come to believe that the coding of, say, a DATA step is usually simple, but the real art of programming is learning what to code, and why.




Removing Macro Variables from the SAS Environment
   
View PDF (176k)    Presentation History    Return to Paper Index

“They” said it couldn’t be done and it seemed like “they” were right. If you didn’t really nose around “under the covers” of SAS’s catalog and file structures, you could not delete a macro variable in Version 7 or earlier of SAS software. You could set it to null, but you could not remove it from the macro variable table.

This paper describes a reasonably simple way to actually delete macro variables. It adjusts the memory allocated to macro variables, then operates on the alternate locations used by SAS when the enforced memory shortage is in effect.

The exercise is now purely an academic one, of course, since the %SYMDEL command in Version 8 and later will actually remove a variable. The paper remains interesting, though, because it shows how a little digging into SAS internals can produce positive results. It’s also one of those “impress your co-workers” kinds of things …


The Elements of SAS Programming Style
   
View PDF (212k)    Presentation History    Return to Paper Index

Many years ago (1988?), someone posted a seemingly innocent question to the SAS list server (SAS-L@uga.edu). It went something like “I am a 3GL programmer and new to SAS. Is there a SAS programming style reference?” The result, I believe, was one of the best-quality sustained exchanges ever seen on the list. There were differences of opinion, to be sure, but there was a general convergence of opinion about what constituted “good” programming style.

This paper is a synthesis of the original discussion, with my “humble but correct” opinions added. I first presented the paper in 1990, and have presented it nearly 30 times since then. It’s constantly changing, in part due to input from readers and in part due to my experience and occasional change of heart about a topic.

Aside from being a user group presentation cottage industry, the paper is also the basis for my next book, which will flesh out some of the examples that were necessarily glossed over in the paper. My intent is to present it as a “best practices” book for people new to SAS. The realistic completion date is sometime in 2004.

Abstract

The generalized nature of SAS software almost guarantees that "n" users will develop "n" unique solutions to even basic tasks. The gap between the task correctly performed by the programs and the disparate code is, for the most part, due to programming style. This presentation discusses a set of generalized programming style guidelines useful to both experienced and novice SAS programmers.

It first investigates general principles of program design, those aspects of the analysis and coding process common to all aspects of SAS programming. The next sections focus on coding guidelines for the DATA step and procedures. Finally, debugging techniques are addressed.

The presentation contends that "good" programming style usually results in programs that are more effective in terms of both human and machine resources. The intent is not to pronounce one style good and another lacking, but to simply outline an experienced user's guidelines and gently prod other users to examine their programming habits. These habits will become critical to the success of organizations as SAS software becomes embedded in more environments and organizations.




Dictionary Tables and Views: Essential Tools for Serious Applications
   Co-authored with Jeff Abolafia
   
View PDF (245k)    Presentation History    Return to Paper Index

It’s hard to think of a Base SAS feature that has a wider range of potential uses than dictionary tables and views. They are automatically created when a SAS session begins and are continually updated as options are set, macro variables are defined, datasets are created, and so on. This wealth of information is not that hard to access, but is also not very thoroughly documented in SAS-supplied documentation. This paper, a near-total rewrite of an earlier paper I wrote with Nancy Michal, discusses the tables, describes their structures, outlines some of the “gotchas” and subtleties of their usage, and presents numerous practical examples based on “real world” applications. We also emphasize the importance of understanding SQL, the most effective tool to handle the tables. In particular, we present many examples that employ SQL’s macro language interface.
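As a small illustration of the SQL and macro language interface mentioned above, the following sketch (not taken from the paper) reads the COLUMNS dictionary table and stores a data set's variable names in a macro variable. The macro variable name varList is hypothetical, and SASHELP.CLASS is used only because it ships with SAS.

```sas
proc sql noprint;
   /* LIBNAME and MEMNAME values are stored in upper case. */
   select name
      into :varList separated by ' '
      from dictionary.columns
      where libname = 'SASHELP'
        and memname = 'CLASS';
quit;

/* The list is now available to any later macro or step. */
%put Variables in SASHELP.CLASS: &varList;
```

The same pattern works with other dictionary tables, which is why a working knowledge of SELECT ... INTO is so useful here.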

The “Essential Tools” subtitle of the paper is not hyperbole. Some of the information in the tables is simply not accessible by any other means. Anyone who wants to write robust, serious utilities or generalized programs needs to have a firm grasp of the tables’ contents. The next paper fills a small, well-defined need using the tables.

Abstract

Dictionary tables were introduced to the SAS System during the mid-life of Version 6. Laden with information that is often difficult, and sometimes impossible, to get through other means, they still appear to be on the outside of many programmers' Bag of Tricks. This is both perplexing and unfortunate for, as we will see in this paper, once their content and organization are understood, they are readily adapted for a range of applications that "are only limited by your imagination." Indeed, it is difficult to think of a robust, generalized system utility that would not benefit from use of this metadata.

This paper describes dictionary tables and their associated SASHELP library views. It:

  • presents scenarios that show how they can be used
  • gives high-level descriptions of some of the more important (a relative term, to be sure) tables
  • identifies features of SQL and the macro language that are commonly used when writing programs that effectively use the tables
  • shows examples of the tables' use, emphasizing the use of SQL and the macro language interface

The reader should come away from the discussion with an understanding of the tables as well as with a checklist of SQL and macro skills that are required to use the tables most effectively.




Variable Cross-Referencing Macros – Tools for When Base SAS Isn’t Enough
   
View PDF (423k)    Presentation History    Return to Paper Index

This neat little utility makes extensive use of the dictionary tables discussed just above. The need arose during a project where data for multiple studies was coming from different sources, and thus had different names and/or attributes for similar variables (e.g., SEX versus GENDER, 1 / 2 coding versus ‘M’ / ‘F’). There are lots of tools in SAS to describe individual data sets (the CONTENTS procedure and the COLUMNS and TABLES dictionary tables, to name a few). Out-of-the-box solutions dwindle rapidly, however, when you want to easily compare study X’s PATIENT data set attributes with those of PATIENT in study Y. It becomes time to write your own utility.

The macro processes data from the COLUMNS dictionary table and produces a clean, readable display of the data set comparisons. The user can control the output, limiting it to only those variables with different attributes in every data set, only similar attributes, only those that are in each of the data sets specified, and so on. Annotated source code is provided.
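The core idea can be sketched with a single query. This is a rough illustration under stated assumptions, not the macro from the paper: the two WORK data sets, PATIENT_X and PATIENT_Y, are hypothetical stand-ins for the same data set in two studies.

```sas
/* Two hypothetical study data sets with a conflicting SEX variable. */
data work.patient_x;
   length sex $1 age 8;
   sex = 'M'; age = 40;
run;

data work.patient_y;
   length sex 8 age 8;
   sex = 1; age = 40;
run;

proc sql;
   /* Join COLUMNS to itself on variable name, keeping only the */
   /* variables whose type or length differs between the two.   */
   select x.name,
          x.type   as type_x,
          y.type   as type_y,
          x.length as length_x,
          y.length as length_y
      from dictionary.columns as x
           inner join
           dictionary.columns as y
           on upcase(x.name) = upcase(y.name)
      where x.libname = 'WORK' and x.memname = 'PATIENT_X'
        and y.libname = 'WORK' and y.memname = 'PATIENT_Y'
        and (x.type ne y.type or x.length ne y.length);
quit;
```

Here only SEX is reported, since AGE has the same type and length in both data sets. A full utility would also have to handle variables present in one data set but not the other, along with labels and formats.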


Simplicity Through Obscurity: Some Tips To Simplify Your Programming Life
   
View PDF (182k)    Presentation History    Return to Paper Index

I couldn't say this article was planned for years and crafted over time. During a Q & A session at the DC SAS Users Group meeting in September 2004 someone asked a question about being able to programmatically identify the name of the currently executing program. In a class I taught the previous week I madly scribbled a debugging technique on a white board and thought "not bad. I should write that down some day." The two events were loosely coupled, to say the least, but they had enough in common to commit to this article. The first paragraph follows:

We are always looking for ways to simplify programs. Macro utility libraries, formats, templates, and the like are all good ways to accomplish this. Another approach to code reduction is harvesting the randomly acquired and randomly filed syntax minutiae that most of us acquire over the years. With this in mind, this article addresses several typical and recurring needs: being able to identify the name of the currently executing program, and having a way to easily turn groups of statements on and off. The solutions utilize arcane items such as the EXTFILES dictionary table, the reserved FILEREF named #LN00006, and the RUN statement's CANCEL option. It also reminds us that the macro language can pass even the smallest piece of code to SAS for execution, and can do so within a program statement.
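One way to turn steps on and off with the RUN statement's CANCEL option is sketched below. It is an illustration of the general idea rather than the article's own code, and the macro variable name runFlag is hypothetical.

```sas
/* runFlag is a hypothetical name. Set it to CANCEL to skip   */
/* the steps below, or leave it blank to run them normally.   */
%let runFlag = cancel;

proc print data=sashelp.class;
run &runFlag;   /* resolves to RUN CANCEL; the step is not executed */

proc means data=sashelp.class;
   var height weight;
run &runFlag;
```

Because the macro variable resolves inside the RUN statement itself, changing one %LET switches the whole group of steps on or off without commenting anything out.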