The Missing Layer in Your SCE: Why AI in Pharma Needs Skills, Not Just Prompts
AI is rapidly changing how statistical programmers work. Since I joined Posit this March, it’s the conversation I keep having with pharma teams. How to ground AI in enterprise context. How to make it reproducible. How to make it auditable. How to make it defensible.
Here is an example workflow that shows what this could look like in practice in an AI-enabled Statistical Computing Environment (SCE). One prompt. One TLF (table, listing, or figure) reference from the Statistical Analysis Plan (SAP). A governed pipeline does the rest: reading the SAP, extracting a reviewable spec, pausing for statistician approval, generating code, and running it. All of it is orchestrated through skills that know enterprise context, not by another generic coding assistant.
The Problem with Generic AI Coding Tools in Pharma
Every major AI coding assistant can write R code. Ask Claude Code or GitHub Copilot to produce an rtables demographics table and it will. The code might even run.
But pharma statistical programming is not just about code that runs. It is about code that can be defended: to a regulator, to a QC reviewer, to an auditor asking questions years after database lock.
Generic AI tools have no “home”. They generate code that lands in a programmer’s clipboard with no lineage, no traceability, no locked environment. The programmer pastes it into a script, runs it, and hopes.
That is not a workflow. That is a risk.
Skills: Enterprise Knowledge Built Into the Assistant
Posit Assistant supports a skills system: modular knowledge units that live inside your project and load automatically based on what you ask.
Unlike a system prompt or a chat context, skills are:
- Version controlled and governed: they live in .posit/assistant/skills/ alongside your code
- Composable: multiple skills load together for complex tasks
- Project-scoped: the knowledge travels with the project, not with a person
For this demo, the project contains four skills:
.posit/assistant/skills/
├── sap-reader/ — reads the SAP, applies ICH E3 section-14 numbering
├── tlf-generator/ — knows your enterprise coding conventions
├── clinical-governance/ — knows audit, traceability, reproducibility rules
└── tlf-orchestrator/ — coordinates all three end to end

None of these skills contain study-specific content. They encode how to work, not what a specific study contains. Drop in any SAP from any sponsor and the skills work the same way.
The Orchestrator: One Prompt, One or Many TLFs
The tlf-orchestrator skill is the entry point. It ties everything together.
When a programmer types:
Generate T14.1.3.1.1

The orchestrator takes over. Here is what happens next, automatically.
The Workflow in Five Steps
Step 1: Open your SAP and identify what you need
Open the latest version of your SAP in Positron. You can browse the shells section to find the table or figure you want, or ask Posit Assistant directly:
What are the Priority 1 TLFs I need to generate from this SAP?

The sap-reader skill reads the SAP and applies priority logic to produce a prioritized manifest. Disposition, demographics, primary endpoint, and key safety tables always come first. The statistician reviews and confirms the scope. Nothing generates without that confirmation.
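To make the priority logic concrete, here is a minimal base R sketch of how a manifest might be ordered. It is an illustration, not how the sap-reader skill is implemented: the `category` column, the category labels, and the category assigned to each TLF ID are assumptions made for the example. Only the rule that disposition, demographics, primary endpoint, and key safety come first is taken from the text above.

```r
# Hypothetical TLF index. The IDs appear elsewhere in this post, but the
# `category` assigned to each one is an assumption for illustration only.
tlf_index <- data.frame(
  tlf_id   = c("F14.2.2.3.2", "T14.3.1.4", "T14.1.3.1.1"),
  category = c("exploratory", "key_safety", "demographics"),
  stringsAsFactors = FALSE
)

# Categories that always come first, per the description above.
priority_one <- c("disposition", "demographics", "primary_endpoint", "key_safety")

prioritize_tlfs <- function(index, first = priority_one) {
  # Priority 1 for the "always first" categories, priority 2 for the rest.
  index$priority <- ifelse(index$category %in% first, 1L, 2L)
  # Order by priority, then by TLF ID so the manifest is deterministic.
  index[order(index$priority, index$tlf_id), , drop = FALSE]
}

manifest <- prioritize_tlfs(tlf_index)
print(manifest)
```

The ordered data frame is the kind of artifact a statistician could review and confirm before anything is generated.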
Step 2: Ask Posit Assistant to generate one or many TLFs
Generate T14.1.3.1.1

or

Generate all active TLFs from run_manifest.csv

The orchestrator reads the SAP, locates the shell for the requested TLF, and extracts the specification.
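The layout of run_manifest.csv is not shown in this post, so the following is only a sketch of what "all active TLFs" could mean in code. It assumes two hypothetical columns, `tlf_id` and `active` (with `Y` marking an active row), and writes a stand-in manifest to a temporary file so the example is self-contained.

```r
# Stand-in manifest written to a temp file. The column names `tlf_id` and
# `active` are assumptions, not a documented schema.
manifest_path <- tempfile(fileext = ".csv")
writeLines(c(
  "tlf_id,active",
  "T14.1.3.1.1,Y",
  "T14.3.1.4,Y",
  "F14.2.2.3.2,N"
), manifest_path)

read_active_tlfs <- function(path) {
  # Read everything as character so flags such as "Y"/"N" are never coerced.
  manifest <- read.csv(path, colClasses = "character", strip.white = TRUE)
  missing_cols <- setdiff(c("tlf_id", "active"), names(manifest))
  if (length(missing_cols) > 0) {
    stop("Manifest is missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  # Keep only the rows flagged as active.
  manifest$tlf_id[toupper(manifest$active) == "Y"]
}

active_ids <- read_active_tlfs(manifest_path)
print(active_ids)
```

Failing loudly on a missing column, rather than silently returning nothing, keeps a malformed manifest from looking like an empty run.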
Step 3: Review the human-readable spec before code runs
Before generating a single line of R code, the assistant pauses and shows you a human-readable YAML specification. Type CONFIRM to proceed, or CANCEL to edit.
tlf_id: T14.1.3.1.1
title: "Summary of Demographic and Baseline Characteristics (mITT)"
sap_section: "6.3.1"
analysis_set:
  flag: MITTFL
  label: mITT
adam_datasets:
  primary: ADSL
variables:
  - name: AGE
    label: "Age (years)"
    type: continuous
    stats: [n, mean_sd, median, range]
  - name: SEX
    label: "Sex"
    type: categorical
    stats: [count_fraction]
output_format:
  decimal_places:
    continuous: 1
    percentage: 1
governance:
  generated_by: "posit_assistant"
  sap_reference: "SAP v1.0 Section 6.3.1"

This is your moment to apply domain expertise. Check the analysis set. Check the variables. Check the decimal places against the SAP. AI will make mistakes. This is where you catch them before they propagate into code.
A statistician can review this specification without reading R scripts. That is by design.
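Part of this review lends itself to a mechanical check: do the analysis-set flag and the variables named in the spec actually exist in the dataset? The sketch below is one way to express that in base R. It is not the implementation of any skill. The spec is written as a plain R list mirroring the YAML above so no YAML parser is needed, `missing_spec_vars()` is a hypothetical helper, and the ADSL stand-in is synthetic and deliberately lacks SEX to show a failure.

```r
# Spec fields mirroring the YAML above, written as a plain R list.
spec <- list(
  analysis_set  = list(flag = "MITTFL", label = "mITT"),
  adam_datasets = list(primary = "ADSL"),
  variables     = list(list(name = "AGE"), list(name = "SEX"))
)

# Synthetic stand-in for ADSL; SEX is left out on purpose.
adsl <- data.frame(
  USUBJID = c("S-001", "S-002", "S-003"),
  MITTFL  = c("Y", "Y", "N"),
  AGE     = c(54, 61, 47),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return spec variables (including the analysis-set
# flag) that are absent from the data.
missing_spec_vars <- function(spec, data) {
  wanted <- c(
    spec$analysis_set$flag,
    vapply(spec$variables, function(v) v$name, character(1))
  )
  setdiff(wanted, names(data))
}

missing_vars <- missing_spec_vars(spec, adsl)
if (length(missing_vars) > 0) {
  message("Spec references variables not in ADSL: ",
          paste(missing_vars, collapse = ", "))
}
```

A check like this complements the human review rather than replacing it: it can confirm that a variable exists, not that it is the right variable for the analysis.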
Step 4: Confirm and get your governed R script
Type CONFIRM.
The clinical-governance skill validates the spec against the actual ADaM data, checking that every variable exists before any code is generated. The tlf-generator skill then produces a governed R script:
# =============================================================
# GOVERNANCE HEADER — AUTO-GENERATED BY POSIT ASSISTANT
# =============================================================
# TLF ID : T14.1.3.1.1
# SAP Reference: SAP v1.0 Section 6.3.1
# YAML Spec : specs/T14.1.3.1.1.yaml
# SHA-256 : 3a7f2c...
# Analysis Set : MITTFL = "Y" (mITT)
# Generated at : 2026-06-25 14:32:01
# renv.lock : 8b4e1d...
# =============================================================

Every script is born with its lineage attached. SAP section. YAML hash. Generation timestamp. renv.lock fingerprint. The code knows where it came from.
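Because the lineage is plain `Key : value` comment lines, it can also be read back by a machine. As a rough sketch (the function name `parse_governance_header()` and the list of required fields are assumptions, not part of Posit Assistant), a reviewer or CI job could parse the header and confirm nothing is missing. The hash values are truncated in the example above and are kept truncated here.

```r
# Field lines copied from the example header; hashes stay elided as shown.
script_lines <- c(
  "# =============================================================",
  "# TLF ID : T14.1.3.1.1",
  "# SAP Reference: SAP v1.0 Section 6.3.1",
  "# YAML Spec : specs/T14.1.3.1.1.yaml",
  "# SHA-256 : 3a7f2c...",
  "# Analysis Set : MITTFL = \"Y\" (mITT)",
  "# Generated at : 2026-06-25 14:32:01",
  "# renv.lock : 8b4e1d...",
  "# ============================================================="
)

# Hypothetical helper: turn "# Key : value" comment lines into a named list.
parse_governance_header <- function(lines) {
  # Separator lines contain no colon, so they are skipped here.
  fields <- grep("^#[^:]+:", lines, value = TRUE)
  # The key is everything between "#" and the first colon.
  keys   <- trimws(sub("^#([^:]+):.*$", "\\1", fields))
  # The value is everything after the first colon (later colons are kept).
  values <- trimws(sub("^#[^:]+:", "", fields))
  stats::setNames(as.list(values), keys)
}

header <- parse_governance_header(script_lines)

# Assumed minimum set of lineage fields a script must carry.
required <- c("TLF ID", "SAP Reference", "YAML Spec", "SHA-256", "renv.lock")
missing_fields <- setdiff(required, names(header))
if (length(missing_fields) > 0) {
  stop("Governance header incomplete: ", paste(missing_fields, collapse = ", "))
}
print(header[["SAP Reference"]])
```

Splitting on the first colon only is what keeps the timestamp, which contains colons of its own, intact.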
Step 5: Execute and review your outputs
The script runs. Outputs land in output/tables/ and output/figures/ in three formats:
- .txt: legacy, diff-friendly review
- .html: internal QC review
- .pdf: CSR and sign-off
Every run is logged to output/audit_log.csv:
timestamp, tlf_id, status, r_version, user
2026-06-25 14:32:05, T14.1.3.1.1, SUCCESS, 4.4.1, shaque
2026-06-25 14:32:09, T14.3.1.4, SUCCESS, 4.4.1, shaque
2026-06-25 14:32:14, F14.2.2.3.2, SUCCESS, 4.4.1, shaque

The audit trail is the proof. Same inputs, same renv.lock, same output. Reproducible eighteen months from now.
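An append-only log with these columns takes very little code. The sketch below shows one possible approach in base R, under stated assumptions: `append_audit_row()` is a hypothetical helper, the log is written to a temporary directory instead of output/, `demo_user` is a placeholder, and fields are separated by a bare comma for simplicity. How the pipeline in the demo actually writes its log is not shown in this post.

```r
# Temp location for the sketch; a real project would use output/audit_log.csv.
log_path <- file.path(tempdir(), "audit_log.csv")
if (file.exists(log_path)) file.remove(log_path)

# Hypothetical helper: append one execution record to the audit log.
append_audit_row <- function(path, tlf_id, status, user = Sys.info()[["user"]]) {
  row <- data.frame(
    timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    tlf_id    = tlf_id,
    status    = status,
    r_version = paste(R.version$major, R.version$minor, sep = "."),
    user      = user,
    stringsAsFactors = FALSE
  )
  new_file <- !file.exists(path)
  # Write the column header only when creating the file; otherwise append.
  write.table(row, path, sep = ",", row.names = FALSE,
              col.names = new_file, append = !new_file, quote = FALSE)
  invisible(row)
}

append_audit_row(log_path, "T14.1.3.1.1", "SUCCESS", user = "demo_user")
append_audit_row(log_path, "T14.3.1.4", "SUCCESS", user = "demo_user")

audit_log <- read.csv(log_path, stringsAsFactors = FALSE)
print(audit_log)
```

Recording the R version from the running session, rather than typing it in, means the log reflects the environment that actually produced the output.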
Why This Is Different
| Capability | Generic AI Tool | Posit Assistant + Skills |
|---|---|---|
| Generates R code | ✓ | ✓ |
| Enterprise context loaded automatically | — | ✓ via skills |
| Human-reviewable spec before code | — | ✓ YAML review step |
| Code traceable to SAP section | — | ✓ governance header |
| Package environment locked | — | ✓ renv and containers |
| Audit log of every execution | — | ✓ audit_log.csv |
| Validation-ready compute environment | — | ✓ Posit SCE |
Generic AI tools generate code. Posit Assistant generates governed code inside the environment where it needs to live.
Still a Lot of Open Questions
This demo shows one possible direction. It does not solve everything.
How do you handle SAP amendments? Do you regenerate specs from scratch, or diff them? What does QC look like when AI writes the first draft? Does double programming still make sense? Where does statistical judgment live in an automated pipeline?
These are the conversations I find most interesting right now.
Watch the Demo
SAP sourced from ClinicalTrials.gov (NCT04276558). All patient data synthetic.
Get in Touch
If you are thinking about the same problems, or already building something in this space, I would love to hear how you are approaching it.