Overview
Information Commons provides user-friendly tools to explore, define, and analyze cohorts of de-identified clinical data—from no-code discovery to advanced analysis.
IC Documentation Copilot (New!)
AI-powered documentation assistant for Information Commons data
The Information Commons (IC) Documentation Copilot is an AI assistant that helps UCSF researchers find, understand, and use Information Commons data, tools, and environments more efficiently.
Embedded within Versa Chat, the Copilot turns IC documentation—Slack Q&A, wiki pages, and data dictionaries—into practical, UCSF-specific answers with citations, helping researchers spend less time searching for information and more time doing analysis.
The Copilot can help with:
- Orientation across IC data assets, tools, and environments (e.g., DEID CDW vs. OMOP, PatientExploreR vs. ATLAS, RAE vs. IC FAC vs. IC AWS Secure)
- Conceptual data questions, including data provenance, clinical workflow context, and best practices for cohort definition
- Applied “how-to” guidance, such as table selection, join logic, and example query patterns (when appropriate)
Under the hood, the IC Documentation Copilot uses retrieval-augmented generation (RAG) to ground responses in curated IC sources, returning cited answers that link back to the original documentation so users can verify details and explore further.
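As a rough illustration of the retrieval-and-citation pattern described above, the sketch below ranks a few passages against a question and assembles a prompt with numbered sources. It is not the Copilot's actual implementation: the source names, passage text, bag-of-words scoring, and the `retrieve` and `build_prompt` helpers are all hypothetical, and a production system would more likely use learned embeddings and a hosted language model.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for curated documentation; names and text are invented.
DOCS = [
    {"source": "wiki/omop-overview",
     "text": "OMOP is a common data model with standardized vocabularies for observational data."},
    {"source": "wiki/cdw-overview",
     "text": "The de-identified clinical data warehouse stores encounters, diagnoses, and medications."},
    {"source": "dictionary/imaging",
     "text": "Imaging metadata describes radiology studies and modalities."},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, docs, k=2):
    """Return up to k passages with non-zero similarity to the question, best first."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(d["text"]))), d) for d in docs]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def build_prompt(question, passages):
    """Number each retrieved passage so a generated answer can cite it."""
    context = "\n".join(
        f"[{i}] ({p['source']}) {p['text']}" for i, p in enumerate(passages, 1)
    )
    return (
        "Answer using only the numbered sources and cite them.\n\n"
        f"{context}\n\nQuestion: {question}"
    )


question = "What vocabularies does OMOP use?"
passages = retrieve(question, DOCS)
prompt = build_prompt(question, passages)
```

Numbering the passages inside the prompt is what allows an answer to point back to a specific source, which is the property that lets users verify details.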
Access: The IC Documentation Copilot is available to registered IC users as Versa Assistant within Versa Chat. Users without Versa Chat access can contact [email protected]
Learn more: IC Documentation Copilot Wiki
ICoN - Coming Soon!
ICoN (Information Commons Navigator) is a conversational AI interface that lets researchers explore IC data (currently UCSF CDW and UCSF+SFDPH OMOP) using natural language. ICoN generates and executes queries, summarizes results, recommends analytic approaches, and connects to the SPOKE Knowledge Graph for biologically informed discovery, reducing the gap between research question and reproducible analysis.
Developed by Bill Santo, ARS Informatics - Research Data Services Team
UCSF BioRouter - Coming Soon!
A locally installed, secure, health science–enabled integrated development environment that brings generative AI into clinical and research workflows. BioRouter connects UCSF’s institutionally hosted ChatGPT with various data services, including IC OMOP and SPOKE knowledge graph data, enabling everything from natural-language exploration of EHR data and clinical notes to advanced, data-aware software development for academic research. Users can design and share customized workflows and extensions, making it easy to collaborate, reproduce analyses, and build reusable tools while keeping sensitive patient and research data protected.
Created by Wanjun Gu, Baranzini Lab
BRIM (Biomedical Research Information Miner) - Coming Soon!
BRIM is an institutionally hosted, web-based platform for AI-assisted chart abstraction and data curation in clinical research. It allows teams to define variables of interest, apply AI-assisted abstraction to clinical notes and structured data, review and refine outputs, and export high-quality structured datasets for downstream analysis. BRIM is designed for flexible, human-in-the-loop workflows that scale across projects while maintaining transparency and data quality.
Best for / Use cases:
Chart abstraction for clinical research • Registry development and maintenance • Cohort identification and feasibility analysis • Retrospective studies using unstructured clinical notes
Developed at Vanderbilt University and Brim Analytics; implemented at UCSF by Academic Research Services in partnership with the UCSF Cancer Center.
DiveEHR — Semantic AI Reasoning over Clinical Data - Coming Soon!
DiveEHR is an advanced, RAG-based semantic generative AI system designed to support deep reasoning over complex clinical data, including both structured data and unstructured clinical notes. It enables researchers to query large-scale clinical datasets using natural language, retrieve evidence-grounded results with source transparency, and synthesize information across modalities at scale.
Built with regulatory compliance, performance, and auditability in mind, DiveEHR combines optimized data embeddings, semantic search, and reasoning workflows to support rigorous, reproducible research. The system is designed to operate over institutionally governed clinical data and to provide clear provenance for retrieved information, enabling researchers to evaluate and validate results.
DiveEHR is being implemented to support researcher access via IC FAC, enabling high-impact use cases such as cohort discovery, trial feasibility assessment, and data-intensive clinical research workflows that require a comprehensive review of both structured records and clinical notes.
Best for / Use cases:
Cohort discovery • Trial feasibility and recruitment analysis • Retrospective studies using clinical notes • Large-scale clinical data exploration and reasoning
Developed by Rohit Vashisht and the Atul Butte Lab, with support from BCHSI, UC Health CDI2 (Center for Data-Driven Insights and Innovation), and Academic Research Services (ARS).
PatientExploreR
Web-based cohort discovery and exploration
PatientExploreR is an intuitive, web-based application for searching and exploring de-identified UCSF clinical data. Researchers can define patient cohorts using a point-and-click interface, visualize cohort characteristics with interactive plots, and download patient-level or cohort-level data for further analysis.
Key features:
- No-code cohort selection and refinement
- Interactive visualizations for rapid exploration
- Downloadable data for downstream analysis
Underlying data: PEDB, including data from UCSF Health and Fresno CHS
Learn More & Access: PatientExploreR Wiki (VPN required)
ATLAS (OHDSI)
OMOP-based cohort design and analysis
ATLAS is a web-based tool developed by the OHDSI community for designing and executing analyses on standardized observational health data in the OMOP Common Data Model. ATLAS supports reproducible research and enables collaboration across institutions using OMOP-formatted data.
Key features:
- Cohort definition using standardized vocabularies
- Population-level characterization and analysis
- Portability and reproducibility across OMOP sites
Underlying data: UCSF DeID OMOP, UCSF–SFDPH DeID OMOP
Learn More & Access: IC Tools Wiki (VPN required)
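To make the idea of an OMOP cohort definition concrete, here is a minimal sketch of a "first occurrence of a condition" cohort written as plain SQL against an in-memory SQLite database. The table and column names follow the OMOP Common Data Model (`person`, `condition_occurrence`), but the rows are invented, the concept IDs are used only as examples and should be checked against the OMOP vocabulary, and the `first_occurrence_cohort` helper is hypothetical. ATLAS itself builds cohorts through its web interface, so this only approximates the kind of logic involved.

```python
import sqlite3

# Example concept ID only; confirm the intended concept in the OMOP vocabulary.
EXAMPLE_CONCEPT_ID = 201826

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, year_of_birth INTEGER);
CREATE TABLE condition_occurrence (
    condition_occurrence_id INTEGER PRIMARY KEY,
    person_id INTEGER,
    condition_concept_id INTEGER,
    condition_start_date TEXT
);
-- Invented rows for illustration; 999999 is a placeholder for an unrelated concept.
INSERT INTO person VALUES (1, 1960), (2, 1985), (3, 1972);
INSERT INTO condition_occurrence VALUES
    (10, 1, 201826, '2019-03-01'),
    (11, 1, 201826, '2021-07-15'),
    (12, 2, 999999, '2020-01-10'),
    (13, 3, 201826, '2020-05-20');
""")


def first_occurrence_cohort(conn, concept_id):
    """Return (person_id, index_date) pairs, where index_date is the
    earliest recorded start date of the given condition concept."""
    return conn.execute(
        """
        SELECT p.person_id, MIN(co.condition_start_date) AS index_date
        FROM person AS p
        JOIN condition_occurrence AS co ON co.person_id = p.person_id
        WHERE co.condition_concept_id = ?
        GROUP BY p.person_id
        ORDER BY p.person_id
        """,
        (concept_id,),
    ).fetchall()


cohort = first_occurrence_cohort(conn, EXAMPLE_CONCEPT_ID)
print(cohort)  # [(1, '2019-03-01'), (3, '2020-05-20')]
```

Because the query refers only to standard OMOP tables and concept IDs, the same definition could in principle be run at any site that holds OMOP-formatted data, which is what makes cohort definitions portable.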
EMERSE
Clinical notes search and text exploration
EMERSE is a search and text-processing tool designed to help researchers identify patients and clinical concepts within de-identified free-text clinical notes. It includes basic natural language processing (NLP) capabilities while remaining easy to use and requiring minimal training.
Key features:
- Keyword and concept-based search of clinical notes
- Rapid identification of relevant patient cohorts
- Designed for quick, exploratory text analysis
Underlying data: UCSF De-identified Clinical Notes
Learn More & Access: IC Tools Wiki (VPN required)
MIX (Medical Image Explorer)
Imaging data exploration and cohort selection
MIX is a user-friendly image explorer available via the IC FAC environment. It enables researchers to explore de-identified radiology imaging data and perform imaging-based cohort selection in a secure environment.
Key features:
- Interactive exploration of de-identified imaging data
- Imaging-based cohort discovery
- Linked access to imaging metadata and EHR context
Underlying data: UCSF De-Identified Radiology Images included in Imaging Commons (available in the IC FAC environment)
Learn More & Access: IC Tools Wiki (VPN required)
UCSF cBioPortal
Cancer genomic data exploration and visualization
UCSF cBioPortal is a user-friendly, web-based tool for exploring and visualizing cancer genomic testing data derived from UCSF’s de-identified clinical data. It is available through the UCSF Cancer Center’s Molecular Oncology Initiative and supports interactive analysis of genomic alterations in cancer cohorts.
Key features:
- Visualization of cancer genomic testing results
- Exploration of gene-level and cohort-level alterations
- Integration with de-identified clinical context
Underlying data: UCSF DeID CDW – Cancer Genomic Testing Data
Access: Available via UCSF Cancer Center’s Molecular Oncology Initiative