GPTree: LLM-Powered Decision Trees for Explainable Prediction
What is GPTree?
GPTree is a decision tree whose splits are generated and evaluated by a large language model. It was developed at Vela Partners, a quant VC firm, in collaboration with the University of Oxford. The paper introduces GPTree as a framework for explainable decision-making in domains where predictions must be both accurate and auditable. Its primary application is founder success prediction in venture capital.
GPTree combines the transparency of classical decision trees with the reasoning capability of LLMs. Traditional decision trees are inherently interpretable but cannot handle unstructured text or high-dimensional qualitative data. LLMs handle text and reasoning but operate as black boxes. GPTree resolves the trade-off by using an LLM to generate the tree's structure from a single task-specific prompt, without requiring feature engineering or prompt chaining.
How does GPTree predict startup success?
GPTree predicts startup success by building a decision tree in which each internal node is a natural-language question about a founder or company, generated and evaluated by an LLM. To score a new founder, the tree is walked top-down: at each node, the LLM answers the node's question based on the founder's profile, and the sample is routed to the yes-branch or no-branch accordingly. The leaf reached at the end of the walk determines whether the founder is classified as likely to succeed.
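The walk described above can be sketched in a few lines. The `Node` layout, the questions, and the `answer_question` stub are our own illustration, not code from the paper: in the real system each INFERENCE node is answered by an LLM call against the founder's profile.

```python
# Minimal sketch of GPTree's top-down prediction walk. Node layout,
# questions, and answer_question() are hypothetical illustrations;
# in GPTree itself, INFERENCE nodes are answered by an LLM.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    question: str                  # natural-language split question ("" on leaves)
    yes: Optional["Node"] = None   # branch taken on a "yes" answer
    no: Optional["Node"] = None    # branch taken on a "no" answer
    label: Optional[str] = None    # set only on leaves

def answer_question(question: str, profile: dict) -> bool:
    """Stand-in for the LLM call: here we just look the answer up."""
    return profile.get(question, False)

def predict(root: Node, profile: dict) -> tuple:
    """Walk root-to-leaf, recording the audit trail of question/answer pairs."""
    path, node = [], root
    while node.label is None:
        ans = answer_question(node.question, profile)
        path.append(f"{node.question} -> {'yes' if ans else 'no'}")
        node = node.yes if ans else node.no
    return node.label, path

# Toy two-level tree with hypothetical questions.
tree = Node(
    question="Has the founder previously exited a company?",
    yes=Node(question="", label="likely success"),
    no=Node(
        question="Did the founder work at a top-tier tech company?",
        yes=Node(question="", label="likely success"),
        no=Node(question="", label="unlikely success"),
    ),
)

label, trail = predict(tree, {
    "Has the founder previously exited a company?": False,
    "Did the founder work at a top-tier tech company?": True,
})
print(label)  # likely success
```

The returned `trail` is exactly the root-to-leaf explanation the article describes later: a readable sequence of questions and answers.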
Tree construction runs in five stages:
Task Contextualization. The user provides a single task-specific string (for example, “you are a VC analyst evaluating founders for success potential”). This frames all subsequent LLM calls.
Insight Generation. The LLM iterates through the training data in batches of 250 samples, summarizing common traits of successful founders. A second LLM instance synthesizes these batch summaries into a cohesive insight list, which guides question generation.
Question Candidate Generation. Using the insight list and task context, the LLM generates 0-3 candidate questions per input feature. Each question is classified as INFERENCE (answered by the LLM), CODE (answered by a generated Python lambda), or CLUSTERING (answered by category grouping).
Decision Split Optimization. For each candidate question, GPTree computes the weighted Gini impurity of the resulting data split and greedily selects the question that minimizes it, i.e., the question that produces the most homogeneous child nodes. This is the same optimization criterion used in classical decision trees.
Expert Refinement. A human expert can edit the trained tree on a validation set by collapsing nodes, rebuilding subtrees, removing trivial branches, or asking targeted questions of the samples at a specific node. This expert-in-the-loop mechanism replaces the feature engineering that classical ML requires.
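The split criterion in stage 4 is the classical CART one. A sketch, with hypothetical candidate questions and precomputed boolean answers (in GPTree the answers would come from LLM calls, generated lambdas, or clustering, depending on the question type):

```python
# Sketch of the stage-4 split criterion: pick the candidate question whose
# yes/no partition minimizes weighted Gini impurity (classical CART).
# Candidate questions and answers below are illustrative, not from the paper.

def gini(labels: list) -> float:
    """Gini impurity of a label multiset: 1 - sum over classes of p_k^2."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(k) / n) ** 2 for k in set(labels))

def weighted_gini(yes: list, no: list) -> float:
    """Impurity of the two children, weighted by their sizes."""
    n = len(yes) + len(no)
    return (len(yes) / n) * gini(yes) + (len(no) / n) * gini(no)

def best_split(candidates: dict, labels: list) -> str:
    """Return the question whose split has the lowest weighted impurity."""
    def score(q: str) -> float:
        answers = candidates[q]
        yes = [lab for a, lab in zip(answers, labels) if a]
        no = [lab for a, lab in zip(answers, labels) if not a]
        return weighted_gini(yes, no)
    return min(candidates, key=score)

# Labels: 1 = successful founder, 0 = not.
labels = [1, 1, 0, 0, 0, 1]
candidates = {
    "Did the founder attend a top-20 university?": [True, False, True, False, True, False],
    "Has the founder raised venture funding before?": [True, True, False, False, False, True],
}
print(best_split(candidates, labels))
```

Here the second question separates the labels perfectly (weighted impurity 0), so it wins the greedy selection.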
The backend model in the paper is GPT-4o-mini. Training takes roughly 10 hours and $30 in API usage for 6,000 samples on a single 8-core CPU.
How accurate is GPTree?
GPTree was evaluated on a dataset of 9,892 founder profiles from companies founded between 2010 and 2016 that raised $100K to $4M in seed funding. A founder is classified as successful if their company reaches a $500M valuation via IPO, acquisition, or large funding round. The dataset has a 9.9% success rate.
The cross-validated academic version of GPTree achieves 37.3% precision on this dataset, rising to 40.8% with expert-in-the-loop refinement. Scaled to the real-world US unicorn base rate of 1.9%, the refined figure corresponds to 7.8% precision on inception-stage unicorn identification, compared to 3.1% to 5.6% for the best human venture capitalists and 1.9% for random selection.
The best-performing production model in the paper, fine-tuned with input from Vela's VCs, reaches 17.9% precision after scaling. This is roughly 9x the 1.9% market baseline and the original evidence that LLM-powered decision trees could approach a 10x precision lift in quant VC.
GPTree was the foundational paper. The models that came after it, including Random Rule Forest, Reasoned Rule Mining, and Policy Induction, together with the production systems built on top of them, now reach 19% to 38% precision on the same scaled real-world basis. This is a 10x to 20x lift over the US unicorn base rate. The research program at Vela compounded on the architecture that GPTree introduced.
For reference, on the scaled real-world basis used throughout Vela's research:
- Indexing strategy (random selection): 1.9%
- Y Combinator: 3.2%
- Nine Tier-1 venture capital firms: 5.6%
- GPT-4o with few-shot learning: ~5.6%
- GPTree (cross-validated): 7.8%
- GPTree (best in-paper production model): 17.9%
- Current Vela production models (Think-Reason-Learn family): 19% to 38%
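The scaled figures above are consistent with a linear base-rate adjustment: multiply the in-dataset precision by the ratio of base rates. This is our reading of the scaling, not code from the paper, and the paper's exact procedure may differ:

```python
# Sketch of the base-rate scaling (our interpretation; the paper's exact
# procedure may differ). Precision is rescaled by the ratio of base rates.

def scale_precision(precision: float, dataset_rate: float, world_rate: float) -> float:
    """Linearly rescale precision from the dataset's success rate
    to the real-world base rate."""
    return precision * world_rate / dataset_rate

# GPTree with expert refinement: 40.8% precision on a dataset with a
# 9.9% success rate, rescaled to the 1.9% US unicorn base rate.
print(round(scale_precision(0.408, 0.099, 0.019), 3))  # 0.078
```

Under this reading, 40.8% on the 9.9%-base-rate dataset maps to the reported 7.8% real-world figure.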
GPTree optimizes for F0.5 rather than F1, because in venture capital the cost of funding a bad founder is much higher than the cost of missing a good one.
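The asymmetry F0.5 encodes can be made concrete. With beta = 0.5, precision receives four times the weight of recall, so of two models with identical F1, the precision-heavy one scores higher on F0.5. The numbers below are illustrative, not results from the paper:

```python
# F-beta weighs precision over recall when beta < 1: at beta = 0.5,
# precision gets 4x the weight of recall. Values below are illustrative.

def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Two models with the same F1 (~0.267) but mirrored precision/recall.
p_model = f_beta(0.40, 0.20, 0.5)  # high precision, low recall
r_model = f_beta(0.20, 0.40, 0.5)  # low precision, high recall
print(round(p_model, 3), round(r_model, 3))  # 0.333 0.222
```

F0.5 ranks the high-precision model first, which is exactly the behavior a fund wants when a false positive (funding a bad founder) costs more than a false negative.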
What makes GPTree explainable?
Every prediction GPTree makes corresponds to a specific root-to-leaf path through the tree. That path is a sequence of natural-language questions and answers that a human can read, verify, and disagree with. There is no post-hoc explanation layer and no attempt to rationalize a black-box output. The explanation is the model.
This matters for quant VC because investment decisions require justification. A partner cannot defend a rejection or an allocation with “the model said so.” GPTree produces justifications that read like an analyst's memo: this founder was classified as low-potential because they did not meet criterion A at the root, criterion B at depth 2, and criterion C at the leaf. A human reviewer can audit every step.
The expert refinement mechanism goes one step further. If an expert disagrees with a specific node's question, they can rewrite it, and the tree changes accordingly. The model is not a fixed artifact to be explained. It is a structured representation of the firm's investment logic that the firm can edit directly.
How does GPTree relate to quant VC more broadly?
Quant VC is the application of quantitative, reproducible methods to venture capital decision-making. Vela Partners is a quant VC firm whose research program is built on the thesis that interpretability and precision are not a trade-off if the underlying architecture is designed correctly.
GPTree is the foundational paper in that program. It was the first to demonstrate, with measurable results, that LLMs could generate the structure of a decision system rather than just its outputs, that human experts could edit that structure directly, and that natural-language reasoning could be made reproducible enough for high-stakes allocation decisions. Those three ideas now run through every subsequent paper at Vela.
Random Rule Forest extends the design from trees to ensembles. Reasoned Rule Mining adds Bayesian calibration and log-odds fusion. Policy Induction moves the reasoning into editable in-context policies. All four papers are implemented as modules inside Think-Reason-Learn, the open-source framework Vela built to generalize the GPTree design pattern beyond venture capital to any domain where experts make high-stakes decisions from qualitative inputs.
GPTree's original domain is quant VC. Its method generalizes to any scientific VC, quantitative finance, or expert-decision context where predictions must be both accurate and auditable.
What are GPTree's limitations?
The paper is explicit about three limitations. CODE-type nodes depend on LLM-generated Python that is not always reliable. INFERENCE nodes are non-deterministic, meaning different phrasings of similar questions can produce different results on the same input. And when information is missing from a founder's profile, the LLM can hallucinate rather than abstain. The limitations section also notes that the dataset reflects founders who voluntarily disclosed information publicly, which introduces selection bias.
These limitations are the explicit motivation for the papers that followed GPTree in Vela's research program.
Read the paper
GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees.
Sichao Xiong, Yigit Ihlamur, Fuat Alican, Aaron Ontoyin Yin.
arXiv:2411.08257, November 2024.
Full text on arXiv.
GPTree is part of the Think-Reason-Learn research program at Vela, the quant VC firm. Related papers: Random Rule Forest, Reasoned Rule Mining, Policy Induction.
Authored by members of the Vela team. See the full roster of contributors.
For research collaboration, email engage@vela.partners.