Amassed Insights #5: AI in Investing

Deploying AI within Hedge Funds is Starting to Make a Real Difference, plus Yipit v. M Science Going to Trial?

A chart of the ROI on implementing Claude Financial Services based on actual timelines from Bridgewater, NBIM and AIG
An Analysis of the Claude Financial Services Implementation Timeline

Increasing Adoption of AI within Investing

As I was curating the most impactful news to include in this edition covering the past 6 weeks of the alternative data industry and reflecting on my last Alternative Data Breakfast event, it became clear what the theme needed to be: "Wall Street is moving out of the pilot phase of AI into live deployments" (credit to Matt Robinson, who focuses solely on this topic in his blog, AI Street). As you may notice in the new provider section below, almost every new data vendor we add to our industry-leading directory employs generative AI models to some extent. Almost every data-driven hedge fund I've talked to has been testing and tinkering with generative AI models for at least the past year or two and has begun to find evidence of efficiency gains for narrow use cases (for now). The most common use cases I've seen include automating the mundane pieces of investment research, such as synthesizing multiple data sources into a coherent investment memo or generating documentation and code to quickly assess and analyze a new data source. This has led many to expand the scope of their AI exploration and collaborate more closely with the hyperscalers for mutual benefit. Clearly, if you're still burying your head in the sand and declining to integrate LLMs into your investment process, you're going to be left behind.

The most obvious examples of AI's potential to improve an investor's workflows come directly from some of the largest AI companies. Their playbooks have started to converge: build the best/biggest general-purpose LLMs, partner with best-of-breed data providers and data tools to ground the models in real-world data and their clients' proprietary data, and tune the models and the interface for investors' unique use cases and workflows. Perplexity was first to market with a model purpose-built for the finance space, Perplexity Finance, in which they partnered with Financial Modeling Prep for general financial market data, Unusual Whales for options data, Quartr for live transcripts, and Fiscal.ai for revenue and EPS data.

Anthropic's Claude just kicked the door in with the announcement of their Claude for Financial Services product. Is this AI’s Goldman Sachs moment? You can see more details on how the product can be used in the video below, but they've announced significant collaborations with some of the largest asset managers, including Norges Bank (NBIM), Bridgewater, DE Shaw, AIG, and Commonwealth Bank. I also came across a detailed implementation guide for Claude Financial Services, complete with an ROI analysis and a comparison to the competition. Natively within this product, they've integrated the following data providers:

And finally, OpenAI seems to want to compete with Anthropic in every way possible, and Daloopa announced their data is now integrated into ChatGPT as well.

As you'll see in most of the industry updates highlighted below, AI is driving a lot of the innovation and also causing a lot of headaches with regard to compliance. It's still very unclear what should and shouldn't be allowed to be integrated into an LLM's model weights, particularly regarding copyrighted data available on the open web, so as I've mentioned before, highly regulated firms like hedge funds need to be aware of the risks and potential consequences.

"This paper analyzes the application of Large Language Models (LLMs) in quantitative finance, focusing on:

  • Current state of LLM technology in financial applications
  • Comparative analysis of leading models.
  • Implementation frameworks for production systems.
  • Risk management and quality control considerations.

Key Findings:

  • Multi-model approaches outperform single-model solutions.
  • Production implementation requires robust quality controls.
  • Model selection should be task-specific within the investment process."
  • Age of AI: The latest on artificial intelligence in hedge fund operations by Hedgeweek
    • Q2 Hedge Fund Manager Survey
      • Fund types included in the survey:
        • Mostly North American & European funds.
        • Majority were smaller funds.
        • Wide swath of flagship strategies, with the most common being equity long/short, digital assets and multi-strat.
        • Mostly discretionary, with a skew towards traditional as opposed to tech-savvy styles.
      • Main takeaways:
        • North American funds are most advanced with AI.
        • AI has demonstrated a worthwhile ROI.
        • Half build solely in-house and half use vendors. A common, successful paradigm is customizing a model in-house while leveraging third-party LLMs.
        • Main use case is operational efficiency, but can also be used to reduce behavioural bias.
        • Main roadblocks are ease of use/integration, clear use cases and affordability.
        • Unsurprisingly, smaller funds are more nimble and able to adopt AI more readily. Larger funds are taking a more targeted, methodical approach.
        • Equity long/short strategies are the least likely to have implemented AI, while managed futures are the most likely, followed by crypto, macro and multi-strat.
  • Ep. 13 | Artificial Intelligence in the Financial Space by Hedgineer Technologies
    • My Take: This podcast was published about a year ago, but Michael Watson is one of the most enlightened people on the subject of integrating the efficiency gains of generative AI into a hedge fund.
    • Summary: Just about everything in the world of finance is data. It’s only a matter of time before the early adopters and educated users of AI outrace their competitors to the top.
      • In Episode 13 of the Hedgineer podcast, host Michael Watson interviews Rob Krzyzanowski, an expert in AI and machine learning with experience at notable firms like Avant, Spring Labs, and Citadel - and also a partner at Hedgineer.
      • Rob offers his best-in-class view on the value of AI, and specifically, how it translates into platforms that can produce tangible results for hedge funds. We also get a simplified value proposition perspective for large language models, and how they're fundamentally changing the way funds operate.
      • The conversation also provides a forward-looking view on balancing AI advancements and their application to:
        • risk management
        • stock classification
        • operating efficiencies
        • cost reductions

"These massive dumps have been announced for years, and they are always a recycled pile of credentials with a few new ones sprinkled in"

UBS reports a data leak after a cyber attack on a provider | CNN Business
Swiss banks UBS and Pictet said Wednesday that they had suffered a data leak due to a cyber attack on a provider in Switzerland, which did not compromise client information, although a report said thousands of UBS workers’ data was affected.

Data Being Requested

If this request reasonably matches with a data product you represent or are aware of, please respond.

Alternative Data Sources Generally of Interest to Quant Funds

  • Broad coverage of publicly traded companies (usually over 200-300 tickers in a dataset)
  • Entities cleaned and mapped to identifiers, such as tickers
  • Significant history (preferably 5+ years)
  • Substantial historical data included in the trial
  • Minimal lag and frequent updates
  • Point-in-time data
  • Reliable delivery infrastructure
  • Data Categories of Interest (non-exhaustive): News, Financial Market, Jobs, Transactional, Geospatial/Location, Unstructured (Entities & Open Text), Events, Risk, Government
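
For data providers who want to sanity-check a product against the checklist above before responding, here is a minimal Python sketch. The `DatasetProfile` fields and the `unmet_criteria` function are hypothetical names chosen for illustration, and the one-day lag threshold is my own assumption, since the list only says "minimal lag." The ticker and history thresholds come straight from the bullets above.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical summary of a vendor dataset under evaluation."""
    ticker_count: int         # publicly traded companies covered
    history_years: float      # length of available history
    mapped_to_tickers: bool   # entities cleaned and mapped to identifiers
    point_in_time: bool       # values stored as they were known on each date
    lag_days: float           # delay between the event and data delivery


def unmet_criteria(
    profile: DatasetProfile,
    min_tickers: int = 200,         # lower end of "over 200-300 tickers"
    min_history_years: float = 5,   # "preferably 5+ years"
    max_lag_days: float = 1.0,      # assumption: the list only says "minimal lag"
) -> list[str]:
    """Return the checklist items the dataset fails, in checklist order."""
    failures = []
    if profile.ticker_count < min_tickers:
        failures.append("broad coverage")
    if not profile.mapped_to_tickers:
        failures.append("entity mapping")
    if profile.history_years < min_history_years:
        failures.append("significant history")
    if profile.lag_days > max_lag_days:
        failures.append("minimal lag")
    if not profile.point_in_time:
        failures.append("point-in-time data")
    return failures


# Illustrative, made-up dataset: deep history but narrow and not point-in-time.
example = DatasetProfile(
    ticker_count=150,
    history_years=7,
    mapped_to_tickers=True,
    point_in_time=False,
    lag_days=0.5,
)
print(unmet_criteria(example))  # ['broad coverage', 'point-in-time data']
```

An empty list means the dataset clears every quantifiable item; trial history and delivery infrastructure are left out because they are harder to reduce to a single number.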

If you'd like us to source data that fulfills your unique requirements:

Data Providers & Products

If any of the following data providers piques your interest for any reason, respond and I'll share additional materials & directly introduce you, if necessary.

New Data Providers

Partnerships + New or Updated Data Products

  • Carbon Arc's updated Platform 2.0, which automatically generates insights from alternative datasets by leveraging a proprietary knowledge graph and generative AI:

M&A

"The enhanced platform will seamlessly integrate AI into all aspects of its user journey, including profiles, robust search capabilities, smart screeners and watchlists, recommended trade ideas, and stream summaries. The AI will wrap Stocktwits’ proprietary data to deliver tangible, actionable insights that cut through market noise and surface personalized ideas and analytics for each user exactly when they need them. Based on this new foundation, Stocktwits aims to redefine the future of investing by offering AI-powered, personalized agents that deliver timely insights to investors of all types and skill levels...
Stocktwits will release its new AI tools beginning in Q4 2025, which will also include an institutional-quality index builder with backtesting capabilities, democratizing access to portfolio-building tools that were previously only available to institutional investors."

Funding

Recent News, Blogs & Podcasts

2,376 Alternative Data News, Blogs, Podcasts & Video Feeds
A comprehensive database of the RSS feeds we’re following across the data & investing industries that collectively publish 130,000+ stories per month

"Seven-Step Framework for Tracking Tariff Impacts

  1. Monitor Tariffs and Exposure (Part I)
    Track policy announcements and effective dates using official government data and understand corporate exposure.
  2. Supply Chain Response (Part I)
    Use AIS shipping data, bill-of-lading records, and port analytics to detect sourcing shifts and import behavior.
  3. Logistics Tightness (Part I)
    Monitor freight volumes, rail activity, and manufacturing job postings for signs of stress or adaptation.
  4. Wholesale/Distributor Signals (Part II)
    Analyze inventory levels, delivery timelines, and SKU availability from online sources and B2B platforms.
  5. Retailer Adjustment (Part II)
    Track real-time price changes, discounting patterns, and SKU churn to identify margin pressure and cost pass-through.
  6. Consumer Reaction (Part II)
    Use transaction data, price sensitivity models, and substitution patterns to assess spending behavior.
  7. Company Results as a Signpost (Part II)
    Apply NLP to earnings calls and investor communications to surface tariff exposure, strategic shifts, and risk language."
  • Hedge fund Millennium valued at $14bn in minority stake sale talks by Financial Times
    • Izzy Englander’s group working with Petershill Partners as it opens up to external investors for the first time.
  • Two-Minutes Ahead of the Future by Jason DeRise
    • An odd night in a Boston hotel bar watching the Knicks-Pacers game becomes a parable about data, confidence, and the danger of being early.
    • My Take: I will consume any content that mixes my beloved Knicks and alternative data...this might be the only article I've seen do that successfully.
  • The Quantbot Episode by The Alternative Data Podcast
    • A conversation with Paul White, CEO and co-founder of Quantbot, a quantitative hedge fund, talking about the challenges of launching a new hedge fund in 2009 and how they would be different now, and where he sees the opportunities and risks in today’s environment.
Data Co-ops as an Alternative to the Centralized Digital Economy - Project Liberty
Project Liberty Institute launched a new report: “Laying the Groundwork for a Scalable Alternative to the Centralized Digital Economy.”

17 Upcoming Events, including:

6 Recent Events of Interest, including:

  • WatersTechnology's Waters Rankings 2025 happened on Jul 11, 2025.
  • Amass Insights & Lowenstein Sandler's Alternative Data Breakfast Series #2: The Impact of AI on Institutional Investing happened on Jun 10, 2025 in New York.
    • My expert panelists and I tackled maybe the most timely and buzzy topic in the asset management industry: how AI is transforming investing, particularly in the front office.
    • Along with Lowenstein Sandler LLP & Boris Liberman, I co-hosted the second installment of our Alternative Data Breakfast Series where we explored the past, present and potential future of using AI within hedge funds & alternative data. Some takeaways from each of our panelists:
    • Richard Rothenberg was the only academic on the panel (as well as a practitioner, as the founder of Global AI) and educated the crowd about the origins of AI going back several decades, when it was usually referred to as machine learning or natural language processing. He and I had a friendly disagreement on the potential for using synthetic data to augment smaller sample sizes in the future.
    • Michael Watson described how his company Hedgineer rapidly spins up full data infrastructures for new or emerging hedge funds including really clean, reconciled data linking together a security master, entity master and position master. He championed the revolutionary, newly-released Claude Code product and its countless use cases, such as automatically building comprehensive data documentation from previously unknown alternative datasets.
    • Evan Reich delved into how he leverages AI to increase efficiency when "buying all the things" for Verition Fund Management LLC. He leverages LLMs to automate some of the steps in repeatable data sourcing workflows.
    • Boris Liberman has been working with both asset managers and data providers to update their licenses and policies to reflect the new concerns that arise from employing generative AI.
    • And I explained how I've built a digital personal/executive assistant that automates and enhances labor-intensive parts of my daily work. Some tools and use cases I mentioned: Howie for scheduling, Zapier for linking together everything including meeting prep, Granola for taking meeting notes, and ChatGPT for content editing.
    • Thanks to everyone who joined and special thanks to my panelists! Let me know if you'd like the invite to the next one.
  • BattleFin's BattleFin Discovery Day New York 2025 happened on Jun 10, 2025 in New York.
    • Top Questions: BattleFin Discovery Day New York, 2025 by Jason DeRise
    • Technological innovations like generative AI & public policy shifts like Trump #47 were key points in our fireside chat about the latest guidance and best practices for alternative data onboarding & due diligence. At BattleFin last month, Boris Liberman, George Danenhauer and I discussed a number of topics that should be on the radar of the legal and compliance officers at both alternative data providers and buyers, such as:
      • recent federal deregulatory efforts leading to a continuation of state data laws, most recently in Delaware and New Jersey, and the first state AI legislation in Texas
      • best practices for hedge funds when integrating AI into their investment processes, including recordkeeping, explainability, AI-specific policies & procedures, AI-specific due diligence on vendors, and AI-specific clauses within contracts with vendors
      • best practices for data providers when using AI in their data products, including diligencing your internal AI systems, adding details to your DDQs about your usage of AI, and updating your data license agreements to reflect IP rights and protections in the age of AI
  • Amass Insights's Alternative Data Happy Hour in NYC #27 happened on Jun 3, 2025 in New York.
    • My happy hours just seem to keep growing, with 110 attendees this time!