GPTree: LLM-Powered Decision Trees for Explainable Prediction
What is GPTree?
GPTree is a decision tree whose splits are generated and evaluated by a large language model. It was developed at Vela Partners, a quant VC firm, in collaboration with the University of Oxford. The paper introduces GPTree as a framework for explainable decision-making in domains where predictions must be both accurate and auditable. Its primary application is founder success prediction in venture capital.
GPTree combines the transparency of classical decision trees with the reasoning capability of LLMs. Traditional decision trees are inherently interpretable but cannot handle unstructured text or high-dimensional qualitative data. LLMs handle text and reasoning but operate as black boxes. GPTree resolves the trade-off by using an LLM to generate the tree's structure from a single task-specific prompt, without requiring feature engineering or prompt chaining.
How does GPTree predict startup success?
GPTree predicts startup success by building a decision tree in which each internal node is a natural-language question about a founder or company, generated and evaluated by an LLM. To score a new founder, the tree is walked top-down: at each node, the LLM answers the node's question based on the founder's profile, and the sample is routed to the yes-branch or no-branch accordingly. The leaf reached at the end of the walk determines whether the founder is classified as likely to succeed.
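The walk described above can be sketched in a few lines. The `Node` layout, the questions, and the `answer_question` stub are our own illustration, not code from the paper: in the real system each INFERENCE node is answered by an LLM call against the founder's profile.

```python
# Minimal sketch of GPTree's top-down prediction walk. Node layout,
# questions, and answer_question() are hypothetical illustrations;
# in GPTree itself, INFERENCE nodes are answered by an LLM.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    question: str                  # natural-language split question ("" on leaves)
    yes: Optional["Node"] = None   # branch taken on a "yes" answer
    no: Optional["Node"] = None    # branch taken on a "no" answer
    label: Optional[str] = None    # set only on leaves

def answer_question(question: str, profile: dict) -> bool:
    """Stand-in for the LLM call: here we just look the answer up."""
    return profile.get(question, False)

def predict(root: Node, profile: dict) -> tuple:
    """Walk root-to-leaf, recording the audit trail of question/answer pairs."""
    path, node = [], root
    while node.label is None:
        ans = answer_question(node.question, profile)
        path.append(f"{node.question} -> {'yes' if ans else 'no'}")
        node = node.yes if ans else node.no
    return node.label, path

# Toy two-level tree with hypothetical questions.
tree = Node(
    question="Has the founder previously exited a company?",
    yes=Node(question="", label="likely success"),
    no=Node(
        question="Did the founder work at a top-tier tech company?",
        yes=Node(question="", label="likely success"),
        no=Node(question="", label="unlikely success"),
    ),
)

label, trail = predict(tree, {
    "Has the founder previously exited a company?": False,
    "Did the founder work at a top-tier tech company?": True,
})
print(label)  # likely success
```

The returned `trail` is exactly the root-to-leaf explanation the article describes later: a readable sequence of questions and answers.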
Tree construction runs in five stages:
Task Contextualization. The user provides a single task-specific string (for example, “you are a VC analyst evaluating founders for success potential”). This frames all subsequent LLM calls.
Insight Generation. The LLM iterates through the training data in batches of 250 samples, summarizing common traits of successful founders. A second LLM instance synthesizes these batch summaries into a cohesive insight list, which guides question generation.
Question Candidate Generation. Using the insight list and task context, the LLM generates 0-3 candidate questions per input feature. Each question is classified as INFERENCE (answered by the LLM), CODE (answered by a generated Python lambda), or CLUSTERING (answered by category grouping).
Decision Split Optimization. For each candidate question, GPTree computes the weighted Gini impurity of the resulting data split and greedily selects the question that minimizes it, i.e., the question that produces the most homogeneous child nodes. This is the same optimization criterion used in classical decision trees.
Expert Refinement. A human expert can edit the trained tree on a validation set by collapsing nodes, rebuilding subtrees, removing trivial branches, or asking targeted questions of the samples at a specific node. This expert-in-the-loop mechanism replaces the feature engineering that classical ML requires.
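The split criterion in stage 4 is the classical CART one. A sketch, with hypothetical candidate questions and precomputed boolean answers (in GPTree the answers would come from LLM calls, generated lambdas, or clustering, depending on the question type):

```python
# Sketch of the stage-4 split criterion: pick the candidate question whose
# yes/no partition minimizes weighted Gini impurity (classical CART).
# Candidate questions and answers below are illustrative, not from the paper.

def gini(labels: list) -> float:
    """Gini impurity of a label multiset: 1 - sum over classes of p_k^2."""
    if not labels:
        return 0.0
    n = len(labels)
    return 1.0 - sum((labels.count(k) / n) ** 2 for k in set(labels))

def weighted_gini(yes: list, no: list) -> float:
    """Impurity of the two children, weighted by their sizes."""
    n = len(yes) + len(no)
    return (len(yes) / n) * gini(yes) + (len(no) / n) * gini(no)

def best_split(candidates: dict, labels: list) -> str:
    """Return the question whose split has the lowest weighted impurity."""
    def score(q: str) -> float:
        answers = candidates[q]
        yes = [lab for a, lab in zip(answers, labels) if a]
        no = [lab for a, lab in zip(answers, labels) if not a]
        return weighted_gini(yes, no)
    return min(candidates, key=score)

# Labels: 1 = successful founder, 0 = not.
labels = [1, 1, 0, 0, 0, 1]
candidates = {
    "Did the founder attend a top-20 university?": [True, False, True, False, True, False],
    "Has the founder raised venture funding before?": [True, True, False, False, False, True],
}
print(best_split(candidates, labels))
```

Here the second question separates the labels perfectly (weighted impurity 0), so it wins the greedy selection.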
The backend model in the paper is GPT-4o-mini. Training takes roughly 10 hours and $30 in API usage for 6,000 samples on a single 8-core CPU.
How accurate is GPTree?
GPTree was evaluated on a dataset of 9,892 founder profiles from companies founded between 2010 and 2016 that raised $100K to $4M in seed funding. A founder is classified as successful if their company reaches a $500M valuation via IPO, acquisition, or large funding round. The dataset has a 9.9% success rate.
The cross-validated academic version of GPTree achieves 37.3% precision on this dataset, rising to 40.8% with expert-in-the-loop refinement. Scaled to the real-world US unicorn base rate of 1.9%, the refined figure corresponds to 7.8% precision on inception-stage unicorn identification, compared to 3.1% to 5.6% for the best human venture capitalists and 1.9% for random selection.
The best-performing production model in the paper, fine-tuned with input from Vela's VCs, reaches 17.9% precision after scaling. This is roughly 9x the 1.9% market baseline and the original evidence that LLM-powered decision trees could approach a 10x precision lift in quant VC.
GPTree was the foundational paper. The models that came after it, including Random Rule Forest, Reasoned Rule Mining, and Policy Induction, together with the production systems built on top of them, now reach 19% to 38% precision on the same scaled real-world basis. This is a 10x to 20x lift over the US unicorn base rate. The research program at Vela compounded on the architecture that GPTree introduced.
For reference, on the scaled real-world basis used throughout Vela's research:
- Indexing strategy (random selection): 1.9%
- Y Combinator: 3.2%
- Nine Tier-1 venture capital firms: 5.6%
- GPT-4o with few-shot learning: ~5.6%
- GPTree (cross-validated): 7.8%
- GPTree (best in-paper production model): 17.9%
- Current Vela production models (Think-Reason-Learn family): 19% to 38%
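The scaled figures above are consistent with a linear base-rate adjustment: multiply the in-dataset precision by the ratio of base rates. This is our reading of the scaling, not code from the paper, and the paper's exact procedure may differ:

```python
# Sketch of the base-rate scaling (our interpretation; the paper's exact
# procedure may differ). Precision is rescaled by the ratio of base rates.

def scale_precision(precision: float, dataset_rate: float, world_rate: float) -> float:
    """Linearly rescale precision from the dataset's success rate
    to the real-world base rate."""
    return precision * world_rate / dataset_rate

# GPTree with expert refinement: 40.8% precision on a dataset with a
# 9.9% success rate, rescaled to the 1.9% US unicorn base rate.
print(round(scale_precision(0.408, 0.099, 0.019), 3))  # 0.078
```

Under this reading, 40.8% on the 9.9%-base-rate dataset maps to the reported 7.8% real-world figure.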
GPTree optimizes for F0.5 rather than F1, because in venture capital the cost of funding a bad founder is much higher than the cost of missing a good one.
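The asymmetry F0.5 encodes can be made concrete. With beta = 0.5, precision receives four times the weight of recall, so of two models with identical F1, the precision-heavy one scores higher on F0.5. The numbers below are illustrative, not results from the paper:

```python
# F-beta weighs precision over recall when beta < 1: at beta = 0.5,
# precision gets 4x the weight of recall. Values below are illustrative.

def f_beta(precision: float, recall: float, beta: float) -> float:
    """Standard F-beta score: (1 + b^2) * P * R / (b^2 * P + R)."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Two models with the same F1 (~0.267) but mirrored precision/recall.
p_model = f_beta(0.40, 0.20, 0.5)  # high precision, low recall
r_model = f_beta(0.20, 0.40, 0.5)  # low precision, high recall
print(round(p_model, 3), round(r_model, 3))  # 0.333 0.222
```

F0.5 ranks the high-precision model first, which is exactly the behavior a fund wants when a false positive (funding a bad founder) costs more than a false negative.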
What makes GPTree explainable?
Every prediction GPTree makes corresponds to a specific root-to-leaf path through the tree. That path is a sequence of natural-language questions and answers that a human can read, verify, and disagree with. There is no post-hoc explanation layer and no attempt to rationalize a black-box output. The explanation is the model.
This matters for quant VC because investment decisions require justification. A partner cannot defend a rejection or an allocation with “the model said so.” GPTree produces justifications that read like an analyst's memo: this founder was classified as low-potential because they did not meet criterion A at the root, criterion B at depth 2, and criterion C at the leaf. A human reviewer can audit every step.
The expert refinement mechanism goes one step further. If an expert disagrees with a specific node's question, they can rewrite it, and the tree changes accordingly. The model is not a fixed artifact to be explained. It is a structured representation of the firm's investment logic that the firm can edit directly.
How does GPTree relate to quant VC more broadly?
Quant VC is the application of quantitative, reproducible methods to venture capital decision-making. Vela Partners is a quant VC firm whose research program is built on the thesis that interpretability and precision are not a trade-off if the underlying architecture is designed correctly.
GPTree is the foundational paper in that program. It was the first to demonstrate, with measurable results, that LLMs could generate the structure of a decision system rather than just its outputs, that human experts could edit that structure directly, and that natural-language reasoning could be made reproducible enough for high-stakes allocation decisions. Those three ideas now run through every subsequent paper at Vela.
Random Rule Forest extends the design from trees to ensembles. Reasoned Rule Mining adds Bayesian calibration and log-odds fusion. Policy Induction moves the reasoning into editable in-context policies. All four papers are implemented as modules inside Think-Reason-Learn, the open-source framework Vela built to generalize the GPTree design pattern beyond venture capital to any domain where experts make high-stakes decisions from qualitative inputs.
GPTree's original domain is quant VC. Its method generalizes to any scientific VC, quantitative finance, or expert-decision context where predictions must be both accurate and auditable.
What are GPTree's limitations?
The paper is explicit about three limitations. CODE-type nodes depend on LLM-generated Python that is not always reliable. INFERENCE nodes are non-deterministic, meaning different phrasings of similar questions can produce different results on the same input. And when information is missing from a founder's profile, the LLM can hallucinate rather than abstain. The limitations section also notes that the dataset reflects founders who voluntarily disclosed information publicly, which introduces selection bias.
These limitations are the explicit motivation for the papers that followed GPTree in Vela's research program.
Read the paper
GPTree: Towards Explainable Decision-Making via LLM-powered Decision Trees.
Sichao Xiong, Yigit Ihlamur, Fuat Alican, Aaron Ontoyin Yin.
arXiv:2411.08257, November 2024.
Full text on arXiv.
GPTree is part of the Think-Reason-Learn research program at Vela, the quant VC firm. Related papers: Random Rule Forest, Reasoned Rule Mining, Policy Induction.
Authored by members of the Vela team. See the full roster of contributors.
For research collaboration, email engage@vela.partners.