Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Patents

Publications

Development of Computer Vision and Image Processing Libraries at the National Synchrotron Light Source - II

Published in BNL Internship Reports 2016 (U.S. DOE, Brookhaven National Laboratory), 2016

Documents computer vision and image processing libraries built at Brookhaven National Laboratory for synchrotron imaging and scientific data processing.

Recommended citation: William Watson and Kazimierz Gofron. (2016). "Development of Computer Vision and Image Processing Libraries at the National Synchrotron Light Source - II." BNL Internship Reports.

Modeling Color Terminology Across Thousands of Languages

Published in EMNLP-IJCNLP 2019, 2019

Cross-linguistic analysis of color terminology across thousands of languages, linking linguistic and perceptual variation.

Recommended citation: Arya D. McCarthy, Winston Wu, Aaron Mueller, William Watson, and David Yarowsky. (2019). "Modeling Color Terminology Across Thousands of Languages." EMNLP-IJCNLP 2019.

Customer Experience Focus Can Improve Equity And Credit Performance

Published in S&P Global Special Reports, 2021

Shows that prioritizing customer experience can improve equity and credit performance.

Recommended citation: Sheryl Kingstone, Sudeep Kesh, Jeong Choi, Clayton Davis, William Watson, and Sundaram Iyer. (2021). "Customer Experience Focus Can Improve Equity And Credit Performance." S&P Global Special Reports.

Leadership In Turbulent Times: Women CEOs During COVID-19

Published in S&P Global Special Reports (AZBEES Gold Regional Award), 2022

Analyzes leadership strategies of women CEOs during COVID-19 using large-scale earnings call data.

Recommended citation: Daniela Brandazza, Marion Amiot, Katie Darden, William Watson, Gabriel Morin, Rose Marie Burke, Victoria Schumacher, Gaurang Dholakia, Lindsey Hall, Azadeh Nematzadeh, and Nicole Serino. (2022). "Leadership In Turbulent Times: Women CEOs During COVID-19." S&P Global Special Reports.

BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains

Published in SIGIR 2023, 2023

Provides a dataset enabling question answering over complex business diagrams through graph-structured inference.

Recommended citation: Petr Babkin, William Watson, Zhiqiang Ma, Lucas Cecchi, Natraj Raman, Armineh Nourbakhsh, and Sameena Shah. (2023). "BizGraphQA: A Dataset for Image-based Inference over Graph-structured Diagrams from Business Domains." SIGIR 2023.

BuDDIE: A Business Document Dataset for Multi-task Information Extraction

Published in COLING 2025 FinNLP+FNP+LLMFinLegal (Workshop - Oral), 2024

Introduces BuDDIE, a large-scale dataset of business documents annotated for multiple information extraction tasks.

Recommended citation: Dongsheng Wang, Ran Zmigrod, Mathieu J. Sibue, Yulong Pei, Petr Babkin, Ivan Brugere, Xiaomo Liu, Nacho Navarro, Antony Papadimitriou, William Watson, Zhiqiang Ma, Armineh Nourbakhsh, and Sameena Shah. (2025). "BuDDIE: A Business Document Dataset for Multi-task Information Extraction." COLING 2025 FinNLP+FNP+LLMFinLegal.

FlowMind: Automatic Workflow Generation with LLMs

Published in ICAIF 2023 (Best Poster Runner-Up), 2023

FlowMind leverages LLMs to automatically generate and refine agentic workflows, outperforming baseline workflow systems.

Recommended citation: Zhen Zeng, William Watson, Nicole Cho, Saba Rahimi, Shayleen Reynolds, Tucker Balch, and Manuela Veloso. (2023). "FlowMind: Automatic Workflow Generation with LLMs." ICAIF 2023.

Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing

Published in AAAI 2025 (Oral - Top 4% of Submissions), 2024

HalluciBot predicts hallucination risk before LLM generation, rewriting and routing queries to safer reasoning paths.

Recommended citation: William Watson, Nicole Cho, and Nishan Srishankar. (2025). "Is There No Such Thing as a Bad Question? H4R: HalluciBot For Ratiocination, Rewriting, Ranking, and Routing." AAAI 2025.

Financial Table Extraction in Image Documents

Published in ICAIF 2020 (Oral), 2020

Presents a deep learning pipeline to detect, extract, and transcribe financial tables in images while preserving structure.

Recommended citation: William Watson and Bo Liu. (2020). "Financial Table Extraction in Image Documents." ICAIF 2020.

HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies

Published in EMNLP 2023, 2023

Introduces HiddenTables and PyQTax for privacy-preserving TableQA; highlights LLM weaknesses on schema alignment and compositional queries.

Recommended citation: William Watson, Nicole Cho, Tucker Balch, and Manuela Veloso. (2023). "HiddenTables & PyQTax: A Cooperative Game and Dataset For TableQA to Ensure Scale and Data Privacy Across a Myriad of Taxonomies." EMNLP 2023.

FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning

Published in ICAIF 2024, 2024

Designs a modular system combining sub-querying, neural conditioning, and expert swarms to improve financial intelligence tasks.

Recommended citation: Nicole Cho, Nishan Srishankar, Lucas Cecchi, and William Watson. (2024). "FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning." ICAIF 2024.

LAW: Legal Agentic Workflows for Custody and Fund Services Contracts

Published in COLING 2025 (Oral - Top 15%), 2024

LAW orchestrates modular agents and domain-specific tools to automate complex legal contract workflows with high accuracy.

Recommended citation: William Watson, Nicole Cho, Nishan Srishankar, Zhen Zeng, Lucas Cecchi, Daniel Scott, Suchetha Siddagangappa, Rachneet Kaur, Tucker Balch, and Manuela Veloso. (2025). "LAW: Legal Agentic Workflows for Custody and Fund Services Contracts." COLING 2025.

TASER: Table Agents for Schema-guided Extraction and Recommendation

Submitted to EACL 2026 (Under Review), 2025

Proposes schema-guided table agents for extraction and recommendation in financial and structured data domains.

Recommended citation: Nicole Cho, Kirsty Fielding, William Watson, Sumitra Ganesh, and Manuela Veloso. (2026). "TASER: Table Agents for Schema-guided Extraction and Recommendation." EACL 2026 (Under Review).

No One Size Fits All: QueryBandits for Hallucination Mitigation

Submitted to ICLR 2026 (Under Review), 2026

Online per-query rewrites adapt on the fly, show that no single rewrite strategy fits every query, and cut hallucinations on closed LLMs.

Recommended citation: Nicole Cho, William Watson, Alec Koppel, Sumitra Ganesh, and Manuela Veloso. (2026). "No One Size Fits All: QueryBandits for Hallucination Mitigation." ICLR 2026 (Under Review).