Think-Reason-Learn: The LLM-Native ML Framework for Quant VC

Repository: github.com/Vela-Research/think-reason-learn
Docs: thinkreasonlearn.com
Install: pip install think-reason-learn
License: MIT
Built by: Vela Research, in collaboration with the University of Oxford.

What Think-Reason-Learn is

Think-Reason-Learn (TRL) is an open-source Python library that fuses large language models with interpretable machine learning. It reimplements core classical ML algorithms (decision trees, random forests) so that each step of the model can call an LLM as a reasoning function in place of a static heuristic. The result is a framework that keeps the structure and interpretability of scikit-learn while giving models the context understanding and generalization of modern LLMs. Think of it as scikit-learn with reasoning built in. It is the LLM-native ML framework at the heart of Vela's quant VC stack.

TRL was developed for venture capital, where explainable, high-stakes predictions matter and rare-event accuracy is non-negotiable. The framework is domain-agnostic and applies to any setting where decisions need to be scalable, auditable, and editable by non-technical experts: hiring, grant review, clinical triage, legal review, credit underwriting, and customer support.

What is quant VC, and where does Think-Reason-Learn fit?

Quant VC is the application of quantitative, reproducible, empirically validated methods to venture capital decision-making. It treats venture capital as a rare-event prediction problem that can be modeled, measured, and improved with the same rigor that quantitative finance brings to credit risk or quantitative medicine brings to diagnostic screening. That means quantitative scoring against honest baselines, reproducible methodology, and interpretability that lets every prediction be audited.

TRL is the production layer of quant VC at Vela. Where the rest of the quant VC research stack produces papers and benchmarks, TRL is where those ideas ship as tested, versioned, installable code that any team can run against their own data. Vela leads quant VC research, and TRL is where that research becomes a library.

Example algorithms in Think-Reason-Learn

TRL provides a set of interchangeable LLM-native ML algorithms under a unified interface. Each algorithm is a reusable class validated on quant VC founder-screening tasks and generalized for use in any decision domain. The examples below are representative, not exhaustive.

GPTree. LLM-guided decision trees for dynamic feature generation. Instead of fixed numeric splits, each internal node calls an LLM to generate a yes/no question that partitions the data. The tree structure makes every prediction a readable path from root to leaf. GPTree supports multi-model question generation and multi-model critic evaluation, so the LLM used to propose splits can differ from the LLM used to validate them. Supported providers include OpenAI, Google, Anthropic, and xAI.

Random Rule Forest (RRF). Transparent ensembles of LLM-generated yes/no rules. RRF generates a large pool of candidate binary rules, filters them by precision lift and statistical significance, and combines survivors into an ensemble. Predictions decompose into the specific rules that fired for each input, each with its own lift and coverage statistics.
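
The filtering step can be illustrated with a self-contained sketch. The hand-written predicates below stand in for LLM-generated yes/no rules, and the `min_lift` threshold and any-rule-fires vote are illustrative choices, not the library's actual API:

```python
# Sketch of RRF-style rule filtering: keep only rules whose precision
# beats the base rate by a minimum lift, then vote over the survivors.
# Hand-written predicates stand in for LLM-generated yes/no rules.

texts = [
    "two exits, strong network",       # successful
    "no prior business experience",    # failed
    "raised seed quickly",             # successful
    "limited funding, first venture",  # failed
]
labels = [1, 0, 1, 0]

candidate_rules = {
    "mentions an exit": lambda t: "exit" in t,
    "mentions raising": lambda t: "raised" in t,
    "mentions funding": lambda t: "funding" in t,  # fires on a failure: low precision
}

base_rate = sum(labels) / len(labels)

def precision(rule):
    # Precision of a rule = share of positives among the inputs it fires on.
    fired = [y for t, y in zip(texts, labels) if rule(t)]
    return sum(fired) / len(fired) if fired else 0.0

# Keep rules whose precision lift over the base rate clears a threshold.
min_lift = 1.5
survivors = {name: rule for name, rule in candidate_rules.items()
             if precision(rule) / base_rate >= min_lift}

def predict(text):
    # Any surviving rule firing predicts success; the fired rules
    # themselves are the explanation for the prediction.
    votes = sum(rule(text) for rule in survivors.values())
    return 1 if votes >= 1 else 0
```

On this toy data the "mentions funding" rule fires only on a failure, so it is filtered out, while the two high-precision rules survive into the ensemble.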

Policy Induction. Memory-augmented in-context learning for editable decision policies. The induced policy is a natural-language artifact that a non-technical partner can inspect and edit without retraining. Predictions trace back to the policy clauses that fired, preserving auditability through every refinement.
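
A toy version of an editable policy makes the idea concrete. Here the policy is a plain list of natural-language clauses paired with checkable conditions; the clause texts and matching logic are illustrative stand-ins for TRL's LLM-induced policies, not its API:

```python
# Sketch of an editable decision policy: each clause is a natural-language
# statement paired with a condition. Editing the policy is just editing
# this list; predictions report which clauses fired, preserving the
# audit trail. Illustrative only, not TRL's actual policy format.

policy = [
    ("Founders with a prior exit are promising.",
     lambda t: "exit" in t),
    ("Founders with no business experience are risky.",
     lambda t: "no prior business experience" in t),
]

def predict(text):
    # Collect the clauses whose conditions hold for this input.
    fired = [clause for clause, cond in policy if cond(text)]
    label = "successful" if any("promising" in c for c in fired) else "failed"
    return label, fired  # the fired clauses are the explanation
```

A non-technical partner edits the policy by rewording, adding, or deleting clauses; no retraining step is involved.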

All algorithms share the same core LLM interface, which handles provider selection, async concurrency, retries, and structured output parsing. Every TRL model exposes the same fit / predict pattern as scikit-learn, so integrating into an existing quant VC pipeline is a single-function-call change.
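
The contract this implies can be sketched with two toy models; `MajorityClass` and `KeywordRule` are stand-ins, not TRL classes, but they show why swapping algorithms behind the shared fit / predict pattern is a one-line change:

```python
# Sketch of the scikit-learn-style contract TRL models share: anything
# exposing fit(X, y) and predict(X) drops into the same pipeline slot.
# Both classes below are toy stand-ins, not TRL classes.

class MajorityClass:
    def fit(self, X, y):
        self.label_ = max(set(y), key=y.count)  # most common label
        return self

    def predict(self, X):
        return [self.label_ for _ in X]

class KeywordRule:
    def __init__(self, keyword, hit, miss):
        self.keyword, self.hit, self.miss = keyword, hit, miss

    def fit(self, X, y):
        return self  # nothing to learn in this toy

    def predict(self, X):
        return [self.hit if self.keyword in x else self.miss for x in X]

X = ["two exits, strong network", "no prior experience", "raised seed quickly"]
y = ["successful", "failed", "successful"]

# Swapping models is a single-line change; the rest of the pipeline is untouched.
model = MajorityClass().fit(X, y)
model = KeywordRule("exit", "successful", "failed").fit(X, y)
```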

The research behind Think-Reason-Learn

TRL is backed by a four-paper research program co-authored by Vela Research and Oxford collaborators. The framework is the practical distillation of this research into installable code.

GPTree is the foundation of the family, not its ceiling. Later models in the family, including RRF and Vela's production models derived from this research, reach higher precision on the same tasks. Collectively, the family delivers 19% to 38% precision on real-world founder screening at Vela's 1.9% base rate, a 10x to 20x lift over random selection.
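
The lift figures follow directly from dividing precision by prevalence, which a two-line check confirms:

```python
# Precision lift = precision / base rate. At the stated 1.9% prevalence
# of successful founders, the reported precision range maps to:
base_rate = 0.019
lifts = {p: p / base_rate for p in (0.19, 0.38)}  # {0.19: ~10x, 0.38: ~20x}
```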

For an empirical benchmark of these methods against frontier LLMs and tier-1 VCs, see VCBench, the world's first AGI benchmark for venture capital.

Install and use

TRL requires Python 3.13 or higher and the Graphviz system package for tree visualization.

pip install think-reason-learn

A minimal GPTree example for quant VC founder screening:

import asyncio
import pandas as pd
from think_reason_learn.gptree import GPTree
from think_reason_learn.core.llms import OpenAIChoice, AnthropicChoice

# Free-text founder profiles as the feature column; outcomes as the target.
X = pd.DataFrame({
    "founder_info": [
        "Serial entrepreneur with two exits, strong Bay Area network, AI expertise.",
        "Recent MIT graduate, no prior business experience, limited funding.",
        "Ten years in finance, raised seed quickly, built a strong team.",
    ]
})
y = ["successful", "failed", "successful"]

async def main():
    # One LLM proposes split questions; a different LLM critiques them.
    tree = GPTree(
        qgen_llmc=[OpenAIChoice(model="gpt-4o-mini")],
        critic_llmc=[AnthropicChoice(model="claude-3-5-sonnet-20240620")],
        qgen_instr_llmc=[OpenAIChoice(model="gpt-4o-mini")],
    )
    await tree.set_task(
        task_description="Predict whether a startup founder will be successful based on their background.",
    )
    # fit is an async generator; drain it to train the tree to completion.
    async for _ in tree.fit(X, y, reset=True):
        pass
    predictions = await tree.predict(X)
    return predictions

asyncio.run(main())

The same pattern applies to RRF. Full runnable notebooks are in the examples directory of the repository, and the API reference documents every class.

Why LLM-native ML is the right frame for quant VC

Classical ML is not interpretable enough for quant VC. Black-box gradient boosters and neural nets produce scores that no partner can audit, and their feature engineering presumes tabular numeric inputs, a format real founder data rarely arrives in. Pure LLM screening is not reliable enough for quant VC. Frontier LLMs answer the same question differently when asked again, and there is no version history, no feature attribution, and no way for a non-technical partner to edit the logic without retraining.

TRL occupies the space in between. Structure comes from classical ML: trees, forests, rules, policies. Reasoning comes from the LLM: semantic understanding of founder profiles, flexible question generation, natural-language criticism of candidate splits. The LLM is used where its expressivity matters (generating and evaluating human-readable predicates) and not used where determinism matters (aggregating predictions, fitting the ensemble, scoring new inputs). Every TRL prediction traces back to a readable decision path or rule set that a non-technical partner can inspect and edit.
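
This division of labor can be sketched in a few lines. The stub below stands in for a real LLM call; the point is that predicate answers are obtained once and cached, so everything downstream of the cache lookup, including aggregation and scoring, is plain deterministic code:

```python
# Sketch of the determinism boundary: the (stubbed) LLM judges each
# natural-language predicate once per input; answers are cached, so
# re-scoring the same input always produces the same result.

cache = {}

def llm_answers_yes(predicate, text):
    # Stand-in for an LLM call, e.g. "mentions exit" -> is "exit" in text?
    return predicate.split()[-1] in text

def evaluate(predicate, text):
    key = (predicate, text)
    if key not in cache:
        cache[key] = llm_answers_yes(predicate, text)  # expressive step
    return cache[key]

predicates = ["mentions exit", "mentions funding"]

def score(text):
    # Deterministic aggregation: count predicates that answered yes.
    return sum(evaluate(p, text) for p in predicates)
```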

This is what makes TRL the right production substrate for quant VC. The same properties make it suitable for any high-stakes decision domain where explanations are required and rare events drive value.

Domains beyond quant VC

TRL was designed for quant VC, but the interface is domain-agnostic. Any dataset with a target variable and free-text input fields can be fit with GPTree or RRF. Concrete examples where Vela and collaborators have explored TRL-style methods:

  • Clinical triage and rare-disease screening, where the base rate is low and the cost of false negatives is high.
  • Hiring and talent evaluation, where structured data is sparse and the decision needs to be explainable to candidates.
  • Grant review and scientific peer evaluation, where consistency across reviewers is a known problem.
  • Legal document review and compliance screening, where rule-based logic needs to coexist with contextual judgment.
  • Credit underwriting for thin-file borrowers, where traditional features are absent but narrative signals are abundant.

In each of these domains, the same three TRL properties matter: auditability, editability by non-technical experts, and calibration against an honest baseline.
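
Retargeting TRL to one of these domains is a matter of reshaping the data, not the code. The snippet below mirrors the founder-screening input shape for a hypothetical hiring task; the column name and labels are illustrative, and fitting proceeds exactly as in the founder example (omitted here because it requires LLM API keys):

```python
import pandas as pd

# Same X / y shape as founder screening, pointed at a hiring task.
# Column name and labels are illustrative, not prescribed by TRL.
X = pd.DataFrame({
    "candidate_info": [
        "Staff engineer, led two platform migrations, strong references.",
        "Bootcamp graduate, six months of experience, no references.",
    ]
})
y = ["hired", "rejected"]
```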

Contribute and collaborate

The library is MIT-licensed and accepts pull requests. The roadmap is driven by the four-paper research program plus production needs at Vela. The CONTRIBUTING guide covers development setup, testing, and code-quality hooks.

If you are running TRL in production, extending it with a new algorithm, or want to collaborate on the research behind the next modules, Vela is actively looking for partners. For research collaboration on quant VC, LLM-native ML, interpretable decision systems, or domain applications beyond venture, email engage@vela.partners.

Built by the Vela team. See the full roster of contributors.
