Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Patents
Method and System for Automation of Due Diligence
Published:
U.S. Patent App. 17/658,383 (Filed 2022-04-07).
System for Document Ranking by Phrase Importance
Published:
U.S. Patent 11,328,022 (Filed 2020-03-17; Issued 2022-04-20).
Method and System for Generation of Insights from Regulatory Filings
Published:
U.S. Patent App. 18/103,807 (Filed 2023-01-31).
Method and System for Forecasting Market Activity
Published:
U.S. Patent App. 18/377,116 (Filed 2023-10-05).
Method and System for Forecasting Trading Behavior and Thematic Concepts
Published:
U.S. Patent App. 18/377,121 (Filed 2023-10-05).
Method and System for Evaluating Artificial Intelligence Models Via Perturbations
Published:
U.S. Patent App. 18/396,014 (Filed 2023-12-26).
System and Method for Implementing a Model That Predicts the Probability of Hallucination For Any Query Imposed to an LLM
Published:
U.S. Patent App. 18/630,641 (Filed 2024-04-09).
System and Method for Vision-Assisted Approach for Graph Structured Extraction In Various Types of Documents
Published:
U.S. Patent App. 18/666,318 (Filed 2024-05-16).
Method and System for Improving Code Generation Quality of Large Language Model Through Code Guardrails
Published:
U.S. Patent App. 18/675,688 (Filed 2024-05-28).
System and Method for Automatic Table Identification and Extraction in Documents
Published:
U.S. Patent App. 18/767,115 (Filed 2024-07-09).
Method & System of Training an Encoder Classifier Model in Predicting Hallucination of a Machine Learning Model before Generation of a Query
Published:
U.S. Patent App. 18/806,279 (Filed 2024-08-15).
Method and System for Information Extraction and Aggregation
Published:
U.S. Patent App. 18/816,826 (Filed 2024-08-27).
Method and System for Analyzing Natural Language Data by Using Domain-Specific Language Models
Published:
U.S. Patent 12,298,970 (Filed 2023-07-03; Issued 2025-04-30).
Method and System for Automatic Workflow Generation by Large Language Models
Published:
U.S. Patent 12,399,691 (Filed 2023-08-24; Issued 2025-08-26).
Method and System for Automated Processing and Continuous Deployment of Subpoena Extraction Model
Published:
U.S. Patent 12,437,572 (Filed 2022-11-21; Issued 2025-10-07).
System and Method for Generating Order Agnostic Dynamic Citation Network
Published:
U.S. Patent 12,443,662 (Filed 2024-06-14; Issued 2025-10-14).
Method and System for Performing Table Question Answering Tasks While Preserving Data Privacy
Published:
U.S. Patent App. 18/535,428 (Filed 2023-12-11; Allowed 2025-11-21).
Method and System for Code Generation by Large Language Models
Published:
U.S. Patent App. 18/205,719 (Prov. Filed 2023-03-22; Allowed 2025-12-03).
Publications
Development of Computer Vision and Image Processing Libraries at the National Synchrotron Light Source - II
Published in BNL Internship Reports 2016 (U.S. DOE, Brookhaven National Laboratory), 2016
Documents CV/ML libraries built at Brookhaven Lab for synchrotron imaging and scientific data processing.
Recommended citation: William Watson and Kazimierz Gofron. (2016). "Development of Computer Vision and Image Processing Libraries at the National Synchrotron Light Source - II." BNL Internship Reports.
Download Paper
Modeling Color Terminology Across Thousands of Languages
Published in EMNLP-IJCNLP 2019, 2019
Cross-linguistic analysis of color terminology across thousands of languages, linking linguistic and perceptual variation.
Recommended citation: Arya D. McCarthy, Winston Wu, Aaron Mueller, William Watson, and David Yarowsky. (2019). "Modeling Color Terminology Across Thousands of Languages." EMNLP-IJCNLP 2019.
Download Paper
Customer Experience Focus Can Improve Equity And Credit Performance
Published in S&P Global Special Reports, 2021
Shows that prioritizing customer experience leads to better equity and credit performance outcomes.
Recommended citation: Sheryl Kingstone, Sudeep Kesh, Jeong Choi, Clayton Davis, William Watson, and Sundaram Iyer. (2021). "Customer Experience Focus Can Improve Equity And Credit Performance." S&P Global Special Reports.
Download Paper
Leadership In Turbulent Times: Women CEOs During COVID-19
Published in S&P Global Special Reports (AZBEES Gold Regional Award), 2022
Analyzes leadership strategies of women CEOs during COVID-19 using large-scale earnings call data.
Recommended citation: Daniela Brandazza, Marion Amiot, Katie Darden, William Watson, Gabriel Morin, Rose Marie Burke, Victoria Schumacher, Gaurang Dholakia, Lindsey Hall, Azadeh Nematzadeh, and Nicole Serino. (2022). "Leadership In Turbulent Times: Women CEOs During COVID-19." S&P Global Special Reports.
Download Paper
BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains
Published in SIGIR 2023, 2023
Provides a dataset enabling question answering over complex business diagrams through graph-structured inference.
Recommended citation: Petr Babkin, William Watson, Zhiqiang Ma, Lucas Cecchi, Natraj Raman, Armineh Nourbakhsh, and Sameena Shah. (2023). "BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains." SIGIR 2023.
Download Paper
Directed Criteria Citation Recommendation and Ranking Through Link Prediction
Published in ICAIF 2020 (Extended Abstract), 2020
Applies link prediction to recommend and rank financial citations through graph-based methods.
Recommended citation: William Watson and Lawrence Yong. (2020). "Directed Criteria Citation Recommendation and Ranking Through Link Prediction." ICAIF 2020 Extended Abstract.
Download Paper
BuDDIE: A Business Document Dataset for Multi-task Information Extraction
Published in COLING 2025 FinNLP+FNP+LLMFinLegal (Workshop - Oral), 2024
Introduces BuDDIE, a large-scale dataset of business documents annotated for multiple information extraction tasks.
Recommended citation: Dongsheng Wang, Ran Zmigrod, Mathieu J. Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, and Sameena Shah. (2025). "BuDDIE: A Business Document Dataset for Multi-task Information Extraction." COLING 2025 FinNLP+FNP+LLMFinLegal.
Download Paper
FlowMind: Automatic Workflow Generation with LLMs
Published in ICAIF 2023 (Best Poster Runner-Up), 2023
FlowMind leverages LLMs to automatically generate and refine agentic workflows, outperforming baseline workflow systems.
Recommended citation: Zhen Zeng, William Watson, Nicole Cho, Saba Rahimi, Shayleen Reynolds, Tucker Balch, and Manuela Veloso. (2023). "FlowMind: Automatic Workflow Generation with LLMs." ICAIF 2023.
Download Paper
Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing
Published in AAAI 2025 (Oral - Top 4% of Submissions), 2024
HalluciBot predicts hallucination risk before LLM generation, rewriting and routing queries to safer reasoning paths.
Recommended citation: William Watson, Nicole Cho, and Nishan Srishankar. (2025). "Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing." AAAI 2025.
Download Paper
Financial Table Extraction in Image Documents
Published in ICAIF 2020 (Oral), 2020
Presents a deep learning pipeline to detect, extract, and transcribe financial tables in images while preserving structure.
Recommended citation: William Watson and Bo Liu. (2020). "Financial Table Extraction in Image Documents." ICAIF 2020.
Download Paper
HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies
Published in EMNLP 2023, 2023
Introduces HiddenTables and PyQTax for privacy-preserving TableQA; highlights LLM weaknesses on schema alignment and compositional queries.
Recommended citation: William Watson, Nicole Cho, Tucker Balch, and Manuela Veloso. (2023). "HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies." EMNLP 2023.
Download Paper
FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning
Published in ICAIF 2024, 2024
Designs a modular system combining sub-querying, neural conditioning, and expert swarms to improve financial intelligence tasks.
Recommended citation: Nicole Cho, Nishan Srishankar, Lucas Cecchi, and William Watson. (2024). "FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning." ICAIF 2024.
Download Paper
LAW: Legal Agentic Workflows for Custody and Fund Services Contracts
Published in COLING 2025 (Oral - Top 15%), 2024
LAW orchestrates modular agents and domain-specific tools to automate complex legal contract workflows with high accuracy.
Recommended citation: William Watson, Nicole Cho, Nishan Srishankar, Zhen Zeng, Lucas Cecchi, Daniel Scott, Suchetha Siddagangappa, Rachneet Kaur, Tucker Balch, and Manuela Veloso. (2025). "LAW: Legal Agentic Workflows for Custody and Fund Services Contracts." COLING 2025.
Download Paper
MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers
Published in AAAI 2025 PDLM (Workshop - Oral), 2025
Designs a perturbation analysis to benchmark LLM robustness by sampling perturbed questions and answers.
Recommended citation: Nicole Cho and William Watson. (2025). "MultiQ&A: An Analysis in Measuring Robustness via Automated Crowdsourcing of Question Perturbations and Answers." AAAI 2025 PDLM.
Download Paper
TASER: Table Agents for Schema-guided Extraction and Recommendation
Submitted to EACL 2026 (Under Review), 2025
Proposes schema-guided table agents for extraction and recommendation in financial and structured data domains.
Recommended citation: Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, and Manuela Veloso. (2026). "TASER: Table Agents for Schema-guided Extraction and Recommendation." EACL 2026 (Under Review).
Download Paper
No One Size Fits All: QueryBandits for Hallucination Mitigation
Submitted to ICLR 2026 (Under Review), 2026
Online per-query rewrites adapt on the fly, show that no single rewrite strategy fits every query, and reduce hallucinations on closed-source LLMs.
Recommended citation: Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, and Manuela Veloso. (2026). "No One Size Fits All: QueryBandits for Hallucination Mitigation." ICLR 2026 (Under Review).
Download Paper
