Recent Projects and Activities

This page identifies some key and/or interesting activities and projects undertaken in the last four years. The descriptions are brief, and are sometimes deliberately vague so as not to violate nondisclosure agreements. In all the activities described below, the overriding goal was to produce reliable products characterized by clean design with attention to usability.

If your design or programming needs resemble anything you see here, or if you'd just like to talk about these projects, please contact CodeCrafters.

Pharmaceutical Applications

NDAs

File Export. Wrote a generalized utility to create data sets compliant with FDA requirements. This included checks for user-written formats, correct variable names and lengths, and the like. Output data sets were split when necessary to comply with FDA restrictions on individual file sizes. Because of the splitting, also wrote utilities to rebuild the data sets once they resided on the target system (i.e., stacking the pieces that were created when exporting).
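The split-and-restack idea can be sketched briefly. The example below is a Python illustration only, not the original SAS utility. The function names `split_rows` and `stack_pieces` are hypothetical, and a row-count limit stands in for the FDA file-size limit.

```python
from typing import Iterable, List, Sequence


def split_rows(rows: Sequence[dict], max_rows: int) -> List[List[dict]]:
    """Split rows into consecutive pieces of at most max_rows each.

    max_rows is an assumed stand-in for a file-size restriction.
    """
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    return [list(rows[i:i + max_rows]) for i in range(0, len(rows), max_rows)]


def stack_pieces(pieces: Iterable[Iterable[dict]]) -> List[dict]:
    """Rebuild the original data set by concatenating the pieces in order."""
    return [row for piece in pieces for row in piece]


if __name__ == "__main__":
    data = [{"USUBJID": f"S{i:03d}"} for i in range(7)]
    pieces = split_rows(data, max_rows=3)
    print([len(p) for p in pieces])      # [3, 3, 1]
    print(stack_pieces(pieces) == data)  # True
```

Keeping the pieces in their original order is what makes the rebuild on the target system a simple concatenation.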

Define File Generator.  Wrote macros to create documentation for exported files. These "define" files were compliant with FDA requirements, and were controlled by metadata (described below) and the contents of the exported transport files. Parameters to the macro control: page orientation; font; point size; file type (PDF or RTF); insertion of Draft/Final language; and insertion of hyperlinks to related documents.

NDA Analysis Data Set Builder. These programs combined study-level analysis data sets into an integrated data set suitable for ISS and ISE table processing. They added global variables not collected at the study level, and performed QC checks to ensure that all studies contributed the expected number of variables and observations.

Metadata Design.  Created metadata containing key items for the submission: file and variable names, characteristics, and usage. This data was the core of all processes related to TFL generation and the file export and Define file activities noted above.

Metadata, Other QC Checks. These programs reconciled expected data characteristics with those found in actual data, comparing within and across studies. Discrepancies were identified in an output PDF document whose table cells were highlighted and color-coded to quickly draw attention to problematic items. Wrote a program that read the PDF containing the annotated CRF and compared its destinations to those expected by the metadata. Developed an analysis definition parser that used a tree traversal algorithm to display variable derivations and identify circular and other inaccurate references.
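One way to detect circular references in variable derivations is a depth-first traversal that tracks the variables on the current path. The sketch below is a Python illustration under assumptions and is not the original parser. The function name `find_cycle` and the dictionary layout (derived variable mapped to its source variables) are hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(derivations: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one circular chain of variable names, or None if there is none.

    derivations maps each derived variable to the variables it is built from
    (an assumed layout for this example).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(var: str) -> Optional[List[str]]:
        state[var] = GRAY
        path.append(var)
        for src in derivations.get(var, []):
            src_state = state.get(src, WHITE)
            if src_state == GRAY:
                # src is already on the current path: a circular reference.
                return path[path.index(src):] + [src]
            if src_state == WHITE:
                found = visit(src)
                if found:
                    return found
        path.pop()
        state[var] = BLACK
        return None

    for var in derivations:
        if state.get(var, WHITE) == WHITE:
            found = visit(var)
            if found:
                return found
    return None


if __name__ == "__main__":
    print(find_cycle({"BMI": ["HEIGHT", "WEIGHT"]}))         # None
    print(find_cycle({"A": ["B"], "B": ["C"], "C": ["A"]}))  # ['A', 'B', 'C', 'A']
```

Returning the full chain, rather than a yes/no answer, makes it easier to see which definition needs to be corrected.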

CDISC

Implementation Strategy. Helped design the implementation strategy for creating SDTM domains. This involved consideration of how to meet SDTM needs while minimizing the impact on existing data management systems. Design specifications covered data requirements, creation of specialized metadata, and identification of tools for building and validating the domains.

Tool Building. Using metadata (see above), created tools to automate the generation of both common domains (AE, DM, etc.) and more specialized data (domain-level SuppQual). Tools also included automated validation of domain content, checking metadata for inconsistencies, and ensuring that raw data was transformed into domains correctly.

Define.XML. Wrote macros and identified metadata needed for creation of CDISC-compliant Define.XML.  Also enhanced the CDISC-supplied XSL file to render the XML more effectively.

CDISC Users Group. See "Other Projects and Activities," below.

Tables, Figures, Listings (TFLs)

Generation of TFLs. Created tables, figures, and listings in support of NDAs. Wrote macros as needed to expedite repeatable tasks.

Create Analysis Data Sets. Created data sets used for statistical analysis. As with TFLs, wrote macros as the need arose.

Metadata and Program "Shells". Designed and coded data-driven system to assist TFL creation. Tables ("metadata") containing display titles, patient populations, and other information were processed by a macro which wrote macro variables containing formatted titles, footnotes, and SAS statements used to select data from analysis data sets. Display production time was cut significantly, and consistency of output was also improved.
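The data-driven approach can be illustrated with a small sketch: one row of display metadata is turned into formatted titles, footnotes, and a selection clause. This is a Python illustration, not the original SAS macro. The function name `build_display_settings` and all metadata column names (`display_type`, `display_id`, `title`, `population_label`, `population_flag`, `footnotes`) are assumptions made for the example.

```python
from typing import Dict, List, Union


def build_display_settings(meta: Dict[str, str]) -> Dict[str, Union[str, List[str]]]:
    """Turn one metadata row into titles, footnotes, and a selection clause.

    All column names used here are hypothetical.
    """
    titles = [
        f"{meta['display_type']} {meta['display_id']}",
        meta["title"],
        f"({meta['population_label']})",
    ]
    # Footnotes are assumed to be stored in one field, separated by "|".
    footnotes = [f for f in meta.get("footnotes", "").split("|") if f]
    # The population flag selects the analysis population from the data set.
    where = f"{meta['population_flag']} = 'Y'"
    return {"titles": titles, "footnotes": footnotes, "where": where}


if __name__ == "__main__":
    row = {
        "display_type": "Table",
        "display_id": "14.1.1",
        "title": "Demographic Characteristics",
        "population_label": "Safety Population",
        "population_flag": "SAFFL",
        "footnotes": "Note A|Note B",
    }
    for key, value in build_display_settings(row).items():
        print(key, value)
```

Because every display reads its titles and population from the same table, a wording change is made once in the metadata rather than in each program.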

Shift Tables. Wrote a generalized shift tables macro, with user options to control the format and location of tables, handling of missing values, etc.
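A shift table cross-tabulates each subject's baseline category against a post-baseline category. The sketch below is a Python illustration under assumptions, not the original macro. The function name `shift_table` and its `include_missing` and `missing_label` options are hypothetical stand-ins for the kind of user options described above.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

Pair = Tuple[Optional[str], Optional[str]]  # (baseline, post-baseline)


def shift_table(pairs: Iterable[Pair], categories: List[str],
                include_missing: bool = True,
                missing_label: str = "Missing") -> Dict[str, Dict[str, int]]:
    """Count subjects by baseline category (rows) and post-baseline category (columns)."""
    levels = list(categories)
    if include_missing:
        levels.append(missing_label)
    counts: Counter = Counter()
    for base, post in pairs:
        base = missing_label if base is None else base
        post = missing_label if post is None else post
        counts[(base, post)] += 1
    # Only cells for the requested levels are reported; other values are ignored.
    return {b: {p: counts[(b, p)] for p in levels} for b in levels}


if __name__ == "__main__":
    data = [("Normal", "High"), ("Normal", "Normal"), ("Low", None), ("Normal", "High")]
    table = shift_table(data, ["Low", "Normal", "High"])
    print(table["Normal"]["High"])  # 2
    print(table["Low"]["Missing"])  # 1
```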

Patient Profiles. Wrote programs and macros to produce patient profiles in PDF format. Also wrote a utility to generate Web pages to assist viewing the profiles.

Web Pages to Display Metadata, TFLs.  This set of Web pages coordinated the display of TFL metadata by displaying key variables and inserting links, along with file date-times, to each TFL's source data, Log, RTF/PDF, and other related files.

Other Applications

"Corporate Memory" Viewer. Designed and wrote an application to capture memos, emails, programs, etc. of common interest to project team members. The application uses keywords and other classifiers to create an HTML frameset that guides the user through the materials.

File "Stacker". Wrote a Microsoft Word macro to combine an arbitrary number of plain text and/or RTF files into a single Word document with an optional, linkable Table of Contents.

RTF Parser. Wrote a macro to read RTF files, searching for header and Table of Contents fields. The program uses this information to build an HTML frameset that acts as a viewer for the RTFs, using the files' header and TOC settings as the link text.

Legal Document Automation. Designed and coded a Microsoft Word macro to facilitate completion and comparison of various standard legal forms. Part of this process required use of the I.R.I.S. handheld scanner's OCR capabilities.

Legacy System Conversion. Converted SAS-based MVS system to HP Unix. Project requirements included translation of JCL to Unix commands and creation of an HTML front end to dynamically produce reports that were previously static or not readily producible.

Utilities

Developed numerous utilities to support the activities described above. Among these:

Web Page Generators. Macros to create pieces of HTML code such as HREFs, list boxes, and groups of check boxes. Also created documentation to supplement SAS and other sources' explanation of the relationships between ODS templates, HTML elements, and tagsets.

PDF Generation. Generalized program to convert a plain text file into a PDF.

Directory and File Information. Creates data sets containing directory and file name, size, and refresh dates based on user-specified parameters.

Generalized Program Setup. This metadata-driven macro centralized all library and file allocations, option paths, and the like. It produced formatted text files to report actions taken by the macro.

Macro Variable Listing. Formatted names and values of global macro variables in a readable display written to the SAS Log.

Macro Cross Reference. The macro created HTML output with cross references of macro usage. It answered a vital question for developers: "given a directory containing SAS macros (by default, the SASAUTOS path), where in a second directory specification are those macros used?"
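The cross-reference question can be sketched as a scan of two directories. The example below is a Python illustration, not the original macro, and it returns a dictionary rather than HTML. It assumes the autocall convention that each macro lives in a `.sas` file named after the macro. The function name `macro_cross_reference` is hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List, Union


def macro_cross_reference(macro_dir: Union[str, Path],
                          program_dir: Union[str, Path]) -> Dict[str, List[str]]:
    """Map each macro name to the program files that invoke it.

    Assumes one macro per .sas file in macro_dir, named after the macro.
    """
    macro_names = sorted(p.stem.lower() for p in Path(macro_dir).glob("*.sas"))
    xref: Dict[str, List[str]] = {name: [] for name in macro_names}
    for program in sorted(Path(program_dir).glob("*.sas")):
        text = program.read_text(errors="replace").lower()
        for name in macro_names:
            # "%name" followed by a non-name character; macro names are case-insensitive.
            # Note: this simple search also matches calls inside comments.
            if re.search(r"%" + re.escape(name) + r"\b", text):
                xref[name].append(program.name)
    return xref
```

Macros that map to an empty list are never called in the second directory, which is often as useful to know as where the others are used.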

Error Sniffer. Created a Web page containing links to Log files in one or more directories. Problematic lines (containing errors, warnings, etc.) were listed. This tool expedited review of Log files, an important consideration when dozens of programs may be run (and rerun, and rerun, ...) at a time.
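The core of such a tool is a scan of each Log file for problem lines. The sketch below is a Python illustration, not the original program, and it omits the Web page. It assumes that problem lines begin with `ERROR` or `WARNING` and that Log files use a `.log` extension; the function names are hypothetical.

```python
import re
from pathlib import Path
from typing import Dict, List, Tuple, Union

# Assumed pattern: lines that begin with ERROR or WARNING.
PROBLEM = re.compile(r"^(ERROR|WARNING)\b")


def scan_log(path: Union[str, Path]) -> List[Tuple[int, str]]:
    """Return (line number, text) for each problem line in one Log file."""
    hits: List[Tuple[int, str]] = []
    with open(path, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if PROBLEM.match(line):
                hits.append((number, line.rstrip("\n")))
    return hits


def scan_directory(log_dir: Union[str, Path]) -> Dict[str, List[Tuple[int, str]]]:
    """Scan every .log file in a directory, keyed by file name."""
    return {p.name: scan_log(p) for p in sorted(Path(log_dir).glob("*.log"))}
```

A real review would likely extend the pattern to selected notes as well; which notes count as problematic is a project-level decision.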

Dataset-Related. Wrote utilities to: identify single-valued variables in a data set; count observations in a data set; create lists of data set variables; and write brief, PROC CONTENTS-like output to the SAS Log.
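As an illustration of the first utility in this list, the sketch below identifies single-valued variables. It is written in Python rather than SAS, treats a data set as a list of dictionaries, and uses the hypothetical function name `single_valued_variables`.

```python
from typing import Dict, Hashable, Iterable, List, Set


def single_valued_variables(rows: Iterable[Dict[str, Hashable]]) -> List[str]:
    """Return the names of variables that take exactly one distinct value."""
    seen: Dict[str, Set[Hashable]] = {}
    for row in rows:
        for name, value in row.items():
            seen.setdefault(name, set()).add(value)
    return sorted(name for name, values in seen.items() if len(values) == 1)


if __name__ == "__main__":
    data = [
        {"STUDYID": "ABC-001", "SEX": "F", "AGE": 34},
        {"STUDYID": "ABC-001", "SEX": "M", "AGE": 51},
    ]
    print(single_valued_variables(data))  # ['STUDYID']
```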

Miscellaneous. Among the many other macros developed for the above applications are those that: count items in a list; quote list elements; compare tokens from two macro variables, then count and list common or unique items; save/restore system options; time processes; compare attributes of like-named variables in a directory and report differences.
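The token comparison can be sketched in a few lines. This is a Python illustration, not the original macro. It assumes blank-delimited lists, as is typical for macro variable values, and the function name `compare_tokens` and its `ignore_case` option are hypothetical.

```python
from typing import Dict, List


def compare_tokens(first: str, second: str, ignore_case: bool = True) -> Dict[str, List[str]]:
    """Compare two blank-delimited lists; report common and unique tokens."""
    def tokens(text: str) -> List[str]:
        items = (t.upper() if ignore_case else t for t in text.split())
        return list(dict.fromkeys(items))  # drop duplicates, keep first-seen order

    a, b = tokens(first), tokens(second)
    set_a, set_b = set(a), set(b)
    return {
        "common": [t for t in a if t in set_b],
        "only_first": [t for t in a if t not in set_b],
        "only_second": [t for t in b if t not in set_a],
    }


if __name__ == "__main__":
    result = compare_tokens("age sex race", "SEX weight age")
    print(result["common"])       # ['AGE', 'SEX']
    print(result["only_first"])   # ['RACE']
    print(result["only_second"])  # ['WEIGHT']
```

Counts of common or unique items follow directly from the length of each returned list.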

Other Projects and Activities

CodeCrafters Web Site. Designed and wrote all pages of my company's web site, www.CodeCraftersInc.com.

SAS Utilities Course. Developed Building Utilities: Putting Dictionary Tables, SQL, and the Macro Language to Work. This one-day course will present essential features of SAS tools needed to develop solid, robust utilities. It also contains a utilities Best Practices section, summarizing key points for reliable program design.

Research Triangle SAS Users Group (RTSUG). Participated in the RTSUG steering committee, setting up meetings and contributing to the design and production of the group's newsletter.

Research Triangle Park CDISC Users Network. Given the increase in the pharmaceutical industry's interest in CDISC initiatives, it seemed to some of us in the RTP area that a user group would be helpful. I was a co-founder of the group, which had its first meeting in August, 2005.

Papers. Presented numerous papers at local, regional, and international SAS user group conferences. Refer to the publications page for abstracts and complete text.

Recent new papers include:

· Controlling Macro Output, or “What Happens in the Macro, Stays in the Macro”

· The Design and Use of Metadata: Part Fine Art, Part Black Art

· Welcome to the Barnyard, or Thoughts on Etiquette in Professional Venues

· A Tour of the SAS Reporting Toolbox

· Dictionary Tables: Essential Tools for Serious Applications

· Rules for Tools: The SAS Utility Primer

· Design and Coding of Simple Utility Macros

Old reliable "Road Show" papers recently presented include:

· The SAS Debugging Primer

· Program Comprehension: A Strategy for the Bewildered

Mentoring. While not a project or activity per se, this is important on a personal level and thus warrants inclusion in this list. Invariably, once I have been at a client site for a couple of weeks, I begin to get visits from people not assigned to my projects. They'll start their comments with "I know this could be done better" or "is there a way to do ...?" and demonstrate an eagerness to be exposed to new ideas. The informal training that ensues is gratifying to me, and has benefits for the client, since the people raising the questions receive effective, highly focused, one-on-one advice. Everyone comes out ahead.

The Toolset

Key tools used in the above are:

SAS. Heavy use of DATA step, SQL and REPORT procedures, ODS, and the macro language. Read disparate file formats (Oracle, MS Access, MS Excel) and write to multiple output file types (RTF, PDF, HTML). Also, heavy use of AF, GRAPH, and INTRNET products.

Web Apps. JavaScript, HTML, CSS, XML, XSLT, XPath, and XML Schema.

Presentation Tools. Extensive use of Word (macros using VBScript), Visio, Acrobat, and PowerPoint.