GPT-HTree: Hierarchical Clustering and LLM Personas for Quant VC

Paper
GPT-HTree: A Decision Tree Framework Integrating Hierarchical Clustering and Large Language Models for Explainable Classification.
Authors
Te Pei (University of Oxford), Fuat Alican, Aaron Ontoyin Yin, Yigit Ihlamur (Vela Research).
Venue
arXiv preprint, January 2025.
Status
Preprint. arXiv:2501.13743.
Research program
Part of the LLM-Augmented ML line of Vela's quant VC research, which embeds LLMs as components inside classical ML pipelines rather than as standalone reasoners.

What GPT-HTree contributes to quant VC

GPT-HTree addresses a specific failure mode in quant VC: a single global decision tree has to find common success patterns across fundamentally different types of founders, and ends up oversimplifying. A serial-exit entrepreneur and an academic researcher have different predictors of success, so treating them uniformly loses signal. GPT-HTree fixes this by first segmenting founders into personas through hierarchical clustering, then fitting a localized decision tree within each persona, and finally using an LLM (GPT-4) to generate human-readable descriptions of each persona from the cluster's z-score feature profile.

On Vela's 8,800-founder dataset with 64 features, GPT-HTree surfaces eight founder personas with success rates ranging from 0.8 percent (Early Professionals) to 17.4 percent (Serial Exit Founders). The best cluster is 9x the 1.9 percent real-world baseline, and the spread between the top and bottom cluster is 22x. For quant VC, this is the difference between a screening model that flags every founder equally and one that tells the deal team exactly which sourcing channels yield the highest conditional probability of success.

What is quant VC, and where does GPT-HTree fit?

Quant VC is the application of quantitative, reproducible, empirically validated methods to venture capital decision-making. It treats venture capital as a rare-event prediction problem that can be modeled, measured, and improved with the same rigor that quantitative finance brings to credit risk or evidence-based medicine brings to diagnostic screening. That demands quantitative scoring against honest baselines, reproducible methodology, and interpretability that allows every prediction to be audited.

GPT-HTree sits in the LLM-Augmented ML strand of Vela's quant VC research, alongside LLM-AR (neural-symbolic reasoning with ProbLog), rare-event prediction with LLM feature engineering, and verifiable reasoning (LLMs as code generators). This strand wraps LLMs inside classical ML pipelines rather than making the LLM the decision-maker. The LLM's job in GPT-HTree is persona description, not prediction. Prediction happens inside standard hierarchical clustering and decision trees, which means the quantitative outputs are as reproducible as any traditional ML pipeline.

How does GPT-HTree evaluate founders?

GPT-HTree has four main stages: class-balanced resampling, hierarchical clustering, per-cluster decision trees, and LLM persona generation.

Resampling with CTGAN. Startup success is rare. A decision tree trained on the raw 1.9 percent real-world base rate collapses to predicting universal failure. GPT-HTree uses a Conditional Tabular GAN (CTGAN) to synthesize additional examples of successful founders, producing realistic rather than simply duplicated positives. After resampling, the top-VC-experience cluster separates into subclusters with success rates spanning 3.9 percent to 50.0 percent, where the pre-resampling spread was only 4.9 percent to 19.75 percent. Resampling is what makes the downstream decision trees discriminative.
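The resampling step can be sketched as follows. This is a minimal stand-in, not the paper's implementation: where GPT-HTree uses CTGAN to model inter-feature dependencies, this sketch draws synthetic positives from a per-feature Gaussian fit to the real positives, which captures only marginal distributions. The `target_ratio` parameter and function name are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def resample_positives(X, y, target_ratio=0.5):
    """Oversample rare positives with synthetic rows until they make up
    target_ratio of the data. Stand-in for CTGAN: samples a per-feature
    Gaussian fit to the real positives, so unlike CTGAN it ignores
    dependencies between features."""
    pos = X[y == 1]
    # solve (n_pos + n_new) / (n_total + n_new) = target_ratio for n_new
    n_new = int((target_ratio * len(X) - len(pos)) / (1 - target_ratio))
    if n_new <= 0:
        return X, y
    mu, sigma = pos.mean(axis=0), pos.std(axis=0) + 1e-9
    synth = rng.normal(mu, sigma, size=(n_new, X.shape[1]))
    return np.vstack([X, synth]), np.concatenate([y, np.ones(n_new, dtype=int)])

# toy data at a ~2 percent base rate, mirroring the rare-event setting
X = rng.normal(size=(1000, 4))
y = (rng.random(1000) < 0.02).astype(int)
X_bal, y_bal = resample_positives(X, y)
```

In practice the CTGAN step would replace the Gaussian sampler; the class-balancing arithmetic around it stays the same.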

Hierarchical clustering. On the resampled data, agglomerative hierarchical clustering produces eight main founder personas, each defined by a distinctive combination of features. The clustering does not require a predefined number of clusters or labeled outcomes.
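A minimal sketch of this stage using SciPy's agglomerative routines, on a toy stand-in for the founder feature matrix. The Ward linkage choice and the cut into eight flat clusters are assumptions for illustration; the paper reports eight main personas but its exact linkage settings may differ.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# toy stand-in: 300 founders x 6 standardized features
X = rng.normal(size=(300, 6))

# Ward linkage builds the full bottom-up merge tree; no cluster count
# or outcome labels are needed to construct it
Z = linkage(X, method="ward")

# cut the dendrogram into 8 flat clusters (the paper reports 8 personas)
labels = fcluster(Z, t=8, criterion="maxclust")
```

The dendrogram `Z` encodes every merge, so the number of personas is a downstream cut decision rather than an input to the clustering itself.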

Per-cluster decision trees. Within each persona, a separate decision tree is trained to classify successful versus unsuccessful founders, using Gini impurity as the split criterion. Feature importance is computed per cluster, which means the most predictive features differ across personas. For the VC Experience cluster, the top three features are previous startup funding experience as CEO (0.655), worked at consultancy (0.177), and big tech position (0.167). A global decision tree cannot surface persona-specific feature importance like this.
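The per-cluster training loop can be sketched with scikit-learn. The data here is synthetic and the depth limit is an assumption; what matters is the structure: one Gini-criterion tree per persona, each exposing its own feature-importance vector.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

# hypothetical stand-ins: X = founder features, labels = persona
# assignments from the clustering stage, y = success outcome
X = rng.normal(size=(400, 5))
labels = rng.integers(0, 4, size=400)
y = (X[:, 0] + 0.5 * rng.normal(size=400) > 1).astype(int)

trees, importances = {}, {}
for c in np.unique(labels):
    mask = labels == c
    # Gini impurity is the split criterion, matching the paper
    t = DecisionTreeClassifier(criterion="gini", max_depth=3, random_state=0)
    t.fit(X[mask], y[mask])
    trees[c] = t
    importances[c] = t.feature_importances_  # per-persona feature ranking
```

Because each tree sees only its persona's rows, the importance vectors in `importances` can rank features differently across personas, which is exactly the signal a single global tree averages away.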

LLM persona descriptions. For each cluster, the paper computes per-feature z-scores against the global founder distribution, selects the features with the largest deviations, and hands them to GPT-4 with a structured prompt. The LLM outputs a persona summary, distinguishing traits, success factors, risk factors, and recommendations. An example: a cluster with high tier-1 VC experience, high angel experience, and strong prior-startup investor quality is described as “Elite entrepreneurial founders with VC-backed success.”
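The z-score selection and prompt assembly can be sketched as below. The feature names, toy data, and prompt wording are illustrative assumptions, not the paper's exact template; only the mechanism (cluster mean versus global distribution, largest |z| deviations fed to the LLM) follows the text.

```python
import numpy as np

def persona_prompt(X, labels, feature_names, cluster, top_k=3):
    """Rank a cluster's features by the |z-score| of the cluster mean
    against the global distribution, then format the largest deviations
    as an LLM prompt. Wording is illustrative, not the paper's template."""
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-9
    z = (X[labels == cluster].mean(axis=0) - mu) / sigma
    top = np.argsort(-np.abs(z))[:top_k]
    traits = ", ".join(f"{feature_names[i]} (z={z[i]:+.1f})" for i in top)
    return (f"Describe the founder persona whose most distinctive features "
            f"are: {traits}. Cover traits, success factors, and risk factors.")

# hypothetical features: cluster 0 is high on VC and angel experience
features = ["tier1_vc_experience", "angel_experience", "prior_exit"]
X = np.array([[5.0, 4.0, 0.0], [5.0, 4.0, 0.0],
              [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
labels = np.array([0, 0, 1, 1])
prompt = persona_prompt(X, labels, features, cluster=0, top_k=2)
```

The returned string would then be sent to the LLM, which contributes only the natural-language description; no prediction flows through it.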

How accurate is GPT-HTree?

GPT-HTree was evaluated on 8,800 founders with 64 features. Success follows the standard Vela convention (unicorn status or significant exit milestones, corresponding to the 1.9 percent baseline). The paper reports success rates per persona, calibrated to the real-world base rate rather than to an inflated test prevalence:

  • Serial Exit Founders: 17.4 percent success rate (9.2x the 1.9 percent baseline).
  • IPO Experience Founders: 16.3 percent (8.6x).
  • Venture Capitalists: 13.1 percent (6.9x).
  • Tech Leaders: 3.7 percent (1.9x).
  • Industry Influencers: 2.7 percent (1.4x).
  • Professional Athletes: 1.9 percent (at baseline).
  • Career Professionals: 1.4 percent (below baseline).
  • Early Professionals: 0.8 percent (below baseline).

The headline numbers: the top cluster is approximately 9x the baseline, and the top-vs-bottom spread is 22x. Within the VC Experience main cluster, resampling additionally separated subclusters with success rates ranging from 3.9 percent to 50.0 percent, a granularity that a global model cannot produce.
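The lift figures above follow directly from the reported per-persona rates; a quick arithmetic check:

```python
# persona success rates as reported, with lift over the 1.9 percent baseline
baseline = 0.019
rates = {
    "Serial Exit Founders": 0.174,
    "IPO Experience Founders": 0.163,
    "Venture Capitalists": 0.131,
    "Tech Leaders": 0.037,
    "Industry Influencers": 0.027,
    "Professional Athletes": 0.019,
    "Career Professionals": 0.014,
    "Early Professionals": 0.008,
}
lifts = {name: rate / baseline for name, rate in rates.items()}
top_vs_bottom = rates["Serial Exit Founders"] / rates["Early Professionals"]
```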

Vela's full production quant VC stack, across the Think-Reason-Learn family and related research, reaches 19 to 38 percent precision when scaled to the 1.9 percent real-world base rate, a 10x to 20x lift over the index. GPT-HTree's 9x cluster-level lift sits inside that program as a complementary segmentation layer: it tells the deal team which founder archetypes to prioritize, while the Think-Reason-Learn reasoning models tell them which individual founders within a given archetype are most likely to succeed.

Why cluster-level personas matter for quant VC

A global decision tree fits one decision rule across the entire founder population. GPT-HTree rejects that premise and asks a different question: what decision rule applies to this specific type of founder? The answer is not uniform. Serial exit founders succeed because of their acquisition track record and network density. VC-experience founders succeed because of tier-1 backing and deal flow access. Big-tech-experience founders succeed because of execution credibility and technical depth. Collapsing these onto a single decision path loses the signal each persona actually provides.

For quant VC deployment, this matters because different sourcing channels yield different founder distributions. A channel that produces serial exit founders has a 17.4 percent conditional success rate. A channel that produces early professionals has a 0.8 percent conditional success rate. Knowing which persona each channel yields is directly actionable at the portfolio-construction level.

What makes GPT-HTree auditable for quant VC decisions

Every GPT-HTree prediction decomposes into three inspectable artifacts: a persona assignment (which cluster the founder falls into, computed from centroid distance), a decision path (which rule in that cluster's tree classified them), and an LLM-generated description of the persona in plain language. A partner who disagrees with the classification can look at which persona was assigned and why, trace the decision tree split that led to the prediction, and read the LLM description to check whether the persona label matches their own intuition. If any of those three fails inspection, the partner can override the decision. The reasoning is never hidden in an LLM's weights.
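The three-artifact decomposition suggests a simple audit record per prediction. This is a sketch of how such a record could be assembled, not a structure from the paper; the field names and the example split conditions (drawn from the VC Experience cluster's top features) are illustrative.

```python
from dataclasses import dataclass

@dataclass
class AuditRecord:
    """The three inspectable artifacts behind one prediction.
    Field names are illustrative, not from the paper."""
    persona: str           # cluster assignment (nearest centroid)
    decision_path: list    # ordered splits from that cluster's tree
    persona_summary: str   # LLM-generated plain-language description

    def explain(self) -> str:
        return (f"Persona: {self.persona}. "
                f"Rule: {' AND '.join(self.decision_path)}. "
                f"Context: {self.persona_summary}")

record = AuditRecord(
    persona="Serial Exit Founders",
    decision_path=["prev_funding_as_ceo > 0.5", "big_tech_position <= 0.5"],
    persona_summary="Elite entrepreneurial founders with VC-backed success.",
)
```

A partner overriding a classification inspects exactly these three fields: the assignment, the rule, and the description.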

How GPT-HTree fits into Vela's quant VC research program

GPT-HTree connects to Vela's broader quant VC research along four axes:

  • Same family: GPT-HTree, LLM-AR, rare-event prediction with LLM feature engineering, and verifiable reasoning all treat the LLM as a component inside a classical ML pipeline rather than as the final decision-maker.
  • Adjacent via method: GPT-HTree extends GPTree (Xiong et al., 2024) from the Think-Reason-Learn family. GPTree uses an LLM to build a single global decision tree. GPT-HTree adds hierarchical clustering so that different founder personas get different trees. Related TRL work includes Random Rule Forest, Reasoned Rule Mining, and Policy Induction.
  • Adjacent via pattern: The multi-agent Founder-GPT and SSFF papers orchestrate LLMs as reasoning agents. GPT-HTree uses the LLM only for persona interpretation.
  • Benchmarking: Descendants of GPT-HTree and related methods are evaluated on VCBench, the public benchmark for quant VC.

Limitations

The paper is explicit about what GPT-HTree does not yet resolve. False positives cluster around surface-level signals such as media presence without substantive achievements, inflated valuation histories, and shallow industry connections. False negatives tend to concentrate on domain experts in emerging sectors and first-time founders whose profiles do not match traditional success patterns. The model inherits biases from historical data, including the possibility of LinkedIn profile omissions or exaggerations, and may reinforce conventional success metrics at the expense of newer patterns. Market timing is not incorporated as a feature. The reported results were generated with GPT-4o, and newer reasoning models have not yet been evaluated. LLM hallucination during feature engineering remains a data-quality concern.

Read the paper

GPT-HTree: A Decision Tree Framework Integrating Hierarchical Clustering and Large Language Models for Explainable Classification.
Te Pei, Fuat Alican, Aaron Ontoyin Yin, Yigit Ihlamur.
arXiv preprint, January 2025.
arXiv:2501.13743.

GPT-HTree is part of the LLM-Augmented ML family of Vela's quant VC research. For related work within the same family, see LLM-AR, rare-event prediction, and verifiable reasoning. For the adjacent rule-induction line, see GPTree (the direct precursor), Random Rule Forest, Reasoned Rule Mining, and Policy Induction, all part of the Think-Reason-Learn family. For the multi-agent line, see Founder-GPT and SSFF. For the benchmark that measures progress across all three families, see VCBench.

Authored by members of the Vela team. See the full roster of contributors.

For research collaboration in quant VC, explainable ML for founder evaluation, and persona-based startup screening, email engage@vela.partners.
