CV
William (Bill) Watson
Summary
Research on autonomous, adaptive AI agents that personalize at test time in asymmetric environments, with a focus on privacy, learning from limited feedback, dynamic memory systems, and maintaining diverse, uncorrelated behaviors.
Education
- M.S.E. in Computer Science • 2019-05-01 • The Johns Hopkins University • Courses: Computer Vision, Machine Translation, Human-Robot Interaction, Deep Learning, Artificial Intelligence
- B.S. in Computer Science; Minor in Applied Mathematics & Statistics • 2018-05-01 • The Johns Hopkins University • Courses: Computer Vision, Machine Translation, Human-Robot Interaction, Deep Learning, Artificial Intelligence
Work Experience
- AI Scientist, Core Artificial Intelligence Technology • 2024-06-01 - Present • Millennium Management • Agentic systems, retrieval, memory, and evaluation for enterprise LLM platforms.
- AI Research Lead & Vice President, AI Research (Supervisor: Manuela Veloso) • 2021-12-01 - 2024-05-01 • JPMorgan Chase & Co. • LLM/agent research for legal, retrieval, and reliability systems.
- Developed HalluciBot (AAAI 2025 Oral) for pre-generation hallucination-risk prediction and routing.
- Created LAW (COLING 2025 Oral; U.S. 12,298,970) for interactive querying over legal contracts; powered by FlowMind (ICAIF 2023).
- Introduced HiddenTables (EMNLP 2023) and developed BizGraphQA (SIGIR 2023).
- Data Scientist, Ratings Data Science • 2019-06-01 - 2021-11-01 • S&P Global • Multimodal document understanding, semantic search, and NLP research for financial/risk workflows.
- Built a tabular auto-extractor (ICAIF 2020) to structure tables from image documents.
- Patented a ratings criteria semantic search & citation ranking engine (U.S. Patent 11,328,022).
- Conducted NLP research on earnings calls for the report "Leadership In Turbulent Times: Women CEOs During COVID-19" (AZBEES Gold Regional Award).
Skills
Programming
- Python
- LaTeX
- SQL
- JavaScript
- HTML/CSS
- Java
- C/C++
- OCaml
Machine Learning & Data Science
- PyTorch
- TensorFlow
- Transformers
- OpenCV
- Scikit-Learn
- Tesseract OCR
Cloud, Web, & DevOps
- MCP
- AWS
- Docker
- Redis
- FastAPI
- React
- Streamlit
- Git
- Gunicorn
- Plotly
- OpenSearch/Elasticsearch
Research Interests
- Reinforcement Learning
- Retrieval-Augmented Generation (RAG)
- Interpretability
- Embodied AI
- Agents
Publications
- Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing • 2024 • AAAI 2025 (Oral - Top 4%) • HalluciBot predicts hallucination risk before LLM generation and routes/rewrites queries for safer reasoning.
- LAW: Legal Agentic Workflows for Custody and Fund Services Contracts • 2024 • COLING 2025 (Oral - Top 15%) • Agentic workflow system for interactive querying over legal contracts.
- FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning • 2024 • ICAIF 2024 • Modular system combining sub-querying and expert swarms for financial intelligence tasks.
- HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies • 2024 • EMNLP 2023 • Privacy-preserving TableQA game/dataset; improves discovery and highlights schema/compositional weaknesses.
- FlowMind: Automatic Workflow Generation with LLMs • 2024 • ICAIF 2023 (Best Poster Runner-Up) • Automatically generates and refines agentic workflows with LLMs.
- BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains • 2023 • SIGIR 2023
- Financial Table Extraction in Image Documents • 2024 • ICAIF 2020 (Oral) • Deep learning pipeline for detecting and transcribing financial tables in images while preserving structure.
- Modeling Color Terminology Across Thousands of Languages • 2019 • EMNLP-IJCNLP 2019 • Cross-linguistic modeling of color terminology across many languages.
- QueryBandits for Hallucination Mitigation: Exploiting Semantic Features for No-Regret Rewriting • 2025 • NeurIPS 2025 LAW (Workshop) • Contextual bandits for adaptive query rewriting to mitigate hallucinations.
- MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers • 2025 • AAAI 2025 PDLM (Workshop - Oral) • Robustness analysis using automated perturbations and answer collection.
- BuDDIE: A Business Document Dataset for Multi-task Information Extraction • 2024 • COLING 2025 FinNLP+FNP+LLMFinLegal (Workshop - Oral) • Multi-task information extraction dataset for business documents.
- Directed Criteria Citation Recommendation and Ranking Through Link Prediction • 2024 • ICAIF 2020 (Extended Abstract) • Link prediction approach for recommending and ranking financial citations.
- Customer Experience Focus Can Improve Equity And Credit Performance • 2021 • S&P Global Special Reports • Report on the relationship between customer experience focus and equity/credit performance.
- Leadership In Turbulent Times: Women CEOs During COVID-19 • 2022 • S&P Global Special Reports (AZBEES Gold Regional Award) • Analysis of leadership strategies of women CEOs during COVID-19 using earnings call data.
- Development of Computer Vision and Image Processing Libraries at the National Synchrotron Light Source - II • 2016 • BNL Internship Reports 2016 (U.S. DOE, Brookhaven National Laboratory) • CV/ML library development for synchrotron imaging and scientific data processing.
- QUILL: Questions into Layout & Logic, or Learning How to Ask-Then-Draw for Guided Infographic Generation with Controllable Diffusion • 2026 • ACL 2026 (Target Submission) • Guided infographic generation from QA-derived layout and logic constraints.
- Stop Agreeing: Constrained GRPO with Answerer Ensembles for Sycophancy-Resistant Two-Stage QA • 2026 • ACL 2026 (Target Submission) • Constrained GRPO approach to reduce sycophancy and prompt-lottery variance in two-stage QA.
- Keeping TABs: Genetic Memory for Online Continual Learning with Thompson-Sampled Agentic Buffers • 2026 • ACL 2026 (Target Submission) • Thompson-sampled agentic buffers for genetic memory in online continual learning.
- No One Size Fits All: QueryBandits for Hallucination Mitigation • 2026 • ICLR 2026 (Under Review) • Online per-query rewrites that adapt on the fly to reduce hallucinations.
- What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance • 2026 • EACL 2026 (Under Review) • Measures how human-confusing linguistic features affect LLM performance.
- TASER: Table Agents for Schema-guided Extraction and Recommendation • 2025 • EACL 2026 (Under Review) • Schema-guided table agents for extraction and recommendation in structured domains.
- System and Method for Generating Order Agnostic Dynamic Citation Network • 2025 • U.S. Patent 12,443,662 • Filed June 14, 2024 • Issued October 14, 2025 • JPMorgan Chase & Co.
- Method and System for Automated Processing and Continuous Deployment of Subpoena Extraction Model • 2025 • U.S. Patent 12,437,572 • Filed November 21, 2022 • Issued October 7, 2025 • JPMorgan Chase & Co.
- Method and System for Automatic Workflow Generation by Large Language Models • 2025 • U.S. Patent 12,399,691 • Filed August 24, 2023 • Issued August 26, 2025 • JPMorgan Chase & Co.
- Method and System for Analyzing Natural Language Data by Using Domain-Specific Language Models • 2025 • U.S. Patent 12,298,970 • Filed July 3, 2023 • Issued April 30, 2025 • JPMorgan Chase & Co.
- System for Document Ranking by Phrase Importance • 2022 • U.S. Patent 11,328,022 • Filed March 17, 2020 • Issued April 20, 2022 • S&P Global
- Method and System for Information Extraction and Aggregation • 2024 • U.S. Patent App. 18/816,826 • Filed August 27, 2024 • JPMorgan Chase & Co.
- Method & System of Training an Encoder Classifier Model in Predicting Hallucination of a Machine Learning Model before Generation of a Query • 2024 • U.S. Patent App. 18/806,279 • Filed August 15, 2024 • JPMorgan Chase & Co.
- System and Method for Automatic Table Identification and Extraction in Documents • 2024 • U.S. Patent App. 18/767,115 • Filed July 9, 2024 • JPMorgan Chase & Co.
- Method and System for Improving Code Generation Quality of Large Language Model Through Code Guardrails • 2024 • U.S. Patent App. 18/675,688 • Filed May 28, 2024 • JPMorgan Chase & Co.
- System and Method for Vision-Assisted Approach for Graph Structured Extraction In Various Types of Documents • 2024 • U.S. Patent App. 18/666,318 • Filed May 16, 2024 • JPMorgan Chase & Co.
- System and Method for Implementing a Model That Predicts the Probability of Hallucination For Any Query Imposed to an LLM • 2024 • U.S. Patent App. 18/630,641 • Filed April 9, 2024 • JPMorgan Chase & Co.
- Method and System for Performing Table Question Answering Tasks While Preserving Data Privacy • 2023 • U.S. Patent App. 18/535,428 • Filed December 11, 2023 • Allowed November 21, 2025 • JPMorgan Chase & Co.
- Method and System for Evaluating Artificial Intelligence Models Via Perturbations • 2023 • U.S. Patent App. 18/396,014 • Filed December 26, 2023 • JPMorgan Chase & Co.
- Method and System for Forecasting Trading Behavior and Thematic Concepts • 2023 • U.S. Patent App. 18/377,121 • Filed October 5, 2023 • JPMorgan Chase & Co.
- Method and System for Forecasting Market Activity • 2023 • U.S. Patent App. 18/377,116 • Filed October 5, 2023 • JPMorgan Chase & Co.
- Method and System for Code Generation by Large Language Models • U.S. Patent App. 18/205,719 • Prov. Filed March 22, 2023 • Allowed December 3, 2025 • JPMorgan Chase & Co.
- Method and System for Generation of Insights from Regulatory Filings • 2023 • U.S. Patent App. 18/103,807 • Filed January 31, 2023 • JPMorgan Chase & Co.
- Method and System for Automation of Due Diligence • 2022 • U.S. Patent App. 17/658,383 • Filed April 7, 2022 • JPMorgan Chase & Co.