The Missing Layer in your SCE: Why AI in Pharma Needs Skills, Not Just Prompts

Written by Samiul Haque

2026-06-26

AI is rapidly changing how statistical programmers work. Since I joined Posit this March, it’s the conversation I keep having with pharma teams. How to ground AI in enterprise context. How to make it reproducible. How to make it auditable. How to make it defensible.

Here is an example workflow that shows what this could look like in practice in an AI-enabled Statistical Computing Environment (SCE). One prompt. One reference to a table, listing, or figure (TLF) from the Statistical Analysis Plan (SAP). A governed pipeline does the rest: reading the SAP, extracting a reviewable spec, pausing for statistician approval, generating code, and running it. All of it is orchestrated through skills that know enterprise context, not by another generic coding assistant.

The Problem with Generic AI Coding Tools in Pharma

Every major AI coding assistant can write R code. Ask Claude Code or GitHub Copilot to produce an rtables demographics table and it will. The code might even run.

But pharma statistical programming is not just about code that runs. It is about code that can be defended: to a regulator, to a QC reviewer, to an auditor asking questions years after database lock.

Generic AI tools have no “home”. They generate code that lands in a programmer’s clipboard with no lineage, no traceability, no locked environment. The programmer pastes it into a script, runs it, and hopes.

That is not a workflow. That is a risk.

Skills: Enterprise Knowledge Built Into the Assistant

Posit Assistant supports a skills system: modular knowledge units that live inside your project and load automatically based on what you ask.

Unlike a system prompt or a chat context, skills are:

Version controlled and governed: they live in .posit/assistant/skills/ alongside your code
Composable: multiple skills load together for complex tasks
Project-scoped: the knowledge travels with the project, not with a person

For this demo, the project contains four skills:

.posit/assistant/skills/
├── sap-reader/          — reads the SAP, applies ICH E3 section-14 numbering
├── tlf-generator/       — knows your enterprise coding conventions
├── clinical-governance/ — knows audit, traceability, reproducibility rules
└── tlf-orchestrator/    — coordinates all three end to end

None of these skills contain study-specific content. They encode how to work, not what a specific study contains. Drop in any SAP from any sponsor and the skills work the same way.

The Orchestrator: One Prompt, One or Many TLFs

The tlf-orchestrator skill is the entry point. It ties everything together.

When a programmer types:

Generate T14.1.3.1.1

the orchestrator takes over. Here is what happens next, automatically.

The Workflow in Five Steps

Step 1: Open your SAP and identify what you need

Open the latest version of your SAP in Positron. You can browse the shells section to find the table or figure you want, or ask Posit Assistant directly:

What are the Priority 1 TLFs I need to generate from this SAP?

The sap-reader skill reads the SAP and applies priority logic to produce a prioritized manifest. Disposition, demographics, primary endpoint, and key safety tables always come first. The statistician reviews and confirms the scope. Nothing is generated without that confirmation.
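The priority rule can be illustrated with a few lines of base R. This is only a sketch under stated assumptions: the function name prioritise, the column names, and the shell IDs and categories are all made up for illustration and are not taken from the sap-reader skill itself.

```r
# Categories the article says always come first.
priority_1 <- c("disposition", "demographics", "primary_endpoint", "key_safety")

# Illustrative shells only: IDs and categories are invented for this sketch.
shells <- data.frame(
  tlf_id   = c("T-003", "T-001", "F-001", "T-002"),
  category = c("exploratory", "demographics", "subgroup", "disposition"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rank shells and mark every row as not yet confirmed.
prioritise <- function(shells, first = priority_1) {
  shells$priority <- ifelse(shells$category %in% first, 1L, 2L)
  # Scope is confirmed by the statistician later; default to FALSE.
  shells$confirmed <- FALSE
  shells[order(shells$priority, shells$tlf_id, method = "radix"), ]
}

manifest <- prioritise(shells)
manifest[manifest$priority == 1L, "tlf_id"]  # "T-001" "T-002"
```

Keeping a confirmed column that starts as FALSE is one way to make the human sign-off explicit in the manifest rather than implied.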

Step 2: Ask Posit Assistant to generate one or many TLFs

Generate T14.1.3.1.1

Generate all active TLFs from run_manifest.csv

The orchestrator reads the SAP, locates the shell for the requested TLF, and extracts the specification.
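For the batch prompt, the first job is to decide which TLFs are active. The sketch below assumes run_manifest.csv has a tlf_id column and an active column holding Y or N. That layout, the active flags shown, and the function name active_tlfs are assumptions, since the manifest format is not shown here.

```r
# Write a small example manifest to a temporary file (assumed layout).
manifest_path <- file.path(tempdir(), "run_manifest.csv")
writeLines(
  c("tlf_id,active",
    "T14.1.3.1.1,Y",
    "T14.3.1.4,N",
    "F14.2.2.3.2,Y"),
  manifest_path
)

# Hypothetical helper: return the IDs flagged as active, failing loudly
# if the manifest does not have the expected columns.
active_tlfs <- function(path) {
  manifest <- read.csv(path, stringsAsFactors = FALSE)
  missing  <- setdiff(c("tlf_id", "active"), names(manifest))
  if (length(missing) > 0) {
    stop("Manifest is missing column(s): ", paste(missing, collapse = ", "))
  }
  manifest$tlf_id[toupper(manifest$active) == "Y"]
}

active_tlfs(manifest_path)  # "T14.1.3.1.1" "F14.2.2.3.2"
```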

Step 3: Review the human-readable spec before code runs

Important

Before generating a single line of R code, the assistant pauses and shows you a human-readable YAML specification. Type CONFIRM to proceed, or CANCEL to edit.

tlf_id: T14.1.3.1.1
title: "Summary of Demographic and Baseline Characteristics (mITT)"
sap_section: "6.3.1"
analysis_set:
  flag: MITTFL
  label: mITT
adam_datasets:
  primary: ADSL
  variables:
    - name: AGE
      label: "Age (years)"
      type: continuous
      stats: [n, mean_sd, median, range]
    - name: SEX
      label: "Sex"
      type: categorical
      stats: [count_fraction]
output_format:
  decimal_places:
    continuous: 1
    percentage: 1
governance:
  generated_by: "posit_assistant"
  sap_reference: "SAP v1.0 Section 6.3.1"

This is your moment to apply domain expertise. Check the analysis set. Check the variables. Check the decimal places against the SAP. AI will make mistakes. This is where you catch them before they propagate into code.

A statistician can review this specification without reading R scripts. That is by design.
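The review gate itself is simple to reason about. As a rough sketch, and not the assistant's actual implementation, a function like the hypothetical review_decision below maps the reviewer's reply to the next action and treats anything other than an explicit CONFIRM or CANCEL as no decision. Accepting lowercase input and surrounding whitespace is an assumption made here.

```r
# Hypothetical gate: code generation proceeds only on an explicit CONFIRM.
review_decision <- function(reply) {
  reply <- toupper(trimws(reply))
  switch(
    reply,
    CONFIRM = "generate",
    CANCEL  = "edit_spec",
    "ask_again"  # default: anything else is not a decision
  )
}

review_decision("confirm ")  # "generate"
review_decision("CANCEL")    # "edit_spec"
review_decision("ok")        # "ask_again"
```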

Step 4: Confirm and get your governed R script

Type CONFIRM.

The clinical-governance skill validates the spec against the actual ADaM data by checking that every variable exists before any code is generated. The tlf-generator skill then produces a governed R script:

# =============================================================
# GOVERNANCE HEADER — AUTO-GENERATED BY POSIT ASSISTANT
# =============================================================
# TLF ID       : T14.1.3.1.1
# SAP Reference: SAP v1.0 Section 6.3.1
# YAML Spec    : specs/T14.1.3.1.1.yaml
# SHA-256      : 3a7f2c...
# Analysis Set : MITTFL = "Y" (mITT)
# Generated at : 2026-06-25 14:32:01
# renv.lock    : 8b4e1d...
# =============================================================

Every script is born with its lineage attached. SAP section. YAML hash. Generation timestamp. renv.lock fingerprint. The code knows where it came from.
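Step 4 can be sketched in base R as two small functions: one that stops if the spec names a variable the data does not have, and one that assembles the header lines. Everything here is illustrative. The names check_spec_variables and build_header, the flattened spec list, the two-row stand-in for ADSL, and the placeholder hash strings are assumptions, and the hashes are passed in rather than computed.

```r
# Spec as an R list, mirroring a subset of the YAML fields from Step 3.
spec <- list(
  tlf_id        = "T14.1.3.1.1",
  sap_reference = "SAP v1.0 Section 6.3.1",
  flag          = "MITTFL",
  label         = "mITT",
  variables     = c("AGE", "SEX")
)

# Tiny synthetic stand-in for ADSL, for illustration only.
adsl <- data.frame(
  USUBJID = c("S-001", "S-002"),
  MITTFL  = c("Y", "N"),
  AGE     = c(54, 61),
  SEX     = c("F", "M")
)

# Stop before code generation if the spec names a column the data lacks.
check_spec_variables <- function(spec, data) {
  needed  <- c(spec$flag, spec$variables)
  missing <- setdiff(needed, names(data))
  if (length(missing) > 0) {
    stop("Spec refers to variables not in the data: ",
         paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# Build the header as a character vector, one element per comment line.
build_header <- function(spec, spec_hash, lock_hash, generated_at = Sys.time()) {
  rule <- paste0("# ", strrep("=", 61))
  c(
    rule,
    sprintf("# TLF ID       : %s", spec$tlf_id),
    sprintf("# SAP Reference: %s", spec$sap_reference),
    sprintf("# YAML Spec    : specs/%s.yaml", spec$tlf_id),
    sprintf("# SHA-256      : %s", spec_hash),
    sprintf("# Analysis Set : %s = \"Y\" (%s)", spec$flag, spec$label),
    sprintf("# Generated at : %s", format(generated_at, "%Y-%m-%d %H:%M:%S")),
    sprintf("# renv.lock    : %s", lock_hash),
    rule
  )
}

check_spec_variables(spec, adsl)
header <- build_header(spec,
                       spec_hash = "<sha-256 of spec>",
                       lock_hash = "<renv.lock fingerprint>")
writeLines(header)
```

Running the variable check before the header is built means a script never gets a lineage header for a spec that cannot run against the data.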

Step 5: Execute and review your outputs

The script runs. Outputs land in output/tables/ and output/figures/ in three formats:

.txt — legacy, diff-friendly review
.html — internal QC review
.pdf — CSR and sign-off

Every run is logged to output/audit_log.csv:

timestamp,           tlf_id,        status,  r_version, user
2026-06-25 14:32:05, T14.1.3.1.1,   SUCCESS, 4.4.1,     shaque
2026-06-25 14:32:09, T14.3.1.4,     SUCCESS, 4.4.1,     shaque
2026-06-25 14:32:14, F14.2.2.3.2,   SUCCESS, 4.4.1,     shaque

The audit trail is the proof. Same inputs, same renv.lock, same output. Reproducible eighteen months from now.
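An append-only log with the columns shown above takes only a few lines of base R. This is a minimal sketch, not the demo's logging code: the function name log_run is hypothetical, the log is written to a temporary directory rather than output/, and fields are written unquoted on the assumption that they contain no commas.

```r
# Hypothetical logger: append one row per run, writing the header only once.
log_run <- function(tlf_id, status, path,
                    user = Sys.info()[["user"]],
                    timestamp = Sys.time()) {
  entry <- data.frame(
    timestamp = format(timestamp, "%Y-%m-%d %H:%M:%S"),
    tlf_id    = tlf_id,
    status    = status,
    r_version = as.character(getRversion()),
    user      = user,
    stringsAsFactors = FALSE
  )
  is_new <- !file.exists(path)
  write.table(entry, path, sep = ",", row.names = FALSE,
              col.names = is_new, append = !is_new, quote = FALSE)
  invisible(entry)
}

log_path <- file.path(tempdir(), "audit_log.csv")
unlink(log_path)  # start the example from a clean file
log_run("T14.1.3.1.1", "SUCCESS", log_path)
log_run("T14.3.1.4", "SUCCESS", log_path)
read.csv(log_path, stringsAsFactors = FALSE)
```

Recording the R version on every row ties each output to the interpreter that produced it, which complements the renv.lock fingerprint in the script header.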

Why This Is Different

Capability	Generic AI Tool	Posit Assistant + Skills
Generates R code	✓	✓
Enterprise context loaded automatically	—	✓ via skills
Human-reviewable spec before code	—	✓ YAML review step
Code traceable to SAP section	—	✓ governance header
Package environment locked	—	✓ renv and containers
Audit log of every execution	—	✓ audit_log.csv
Validation-ready compute environment	—	✓ Posit SCE

Generic AI tools generate code. Posit Assistant generates governed code inside the environment where it needs to live.

Still a Lot of Open Questions

This demo shows one possible direction. It does not solve everything.

How do you handle SAP amendments? Do you regenerate specs from scratch, or diff them? What does QC look like when AI writes the first draft? Does double programming still make sense? Where does statistical judgment live in an automated pipeline?

These are the conversations I find most interesting right now.

Watch the Demo

Note

SAP sourced from ClinicalTrials.gov (NCT04276558). All patient data synthetic.

Get in Touch

If you are thinking about the same problems, or already building something in this space, I would love to hear how you are approaching it.

samiul.haque@posit.co

Samiul Haque

Senior Solutions Advisor, Posit
