Portfolio Outliers: Latent-Factor Modeling for Quant VC Portfolio Construction
What Portfolio Outliers contributes to quant VC
The worst thing that can happen to a VC fund is zero unicorns in the portfolio. Quant VC has spent the last two years getting progressively better at predicting which individual founders are likely to produce unicorns, but screening precision is not the same thing as fund-level reliability. Portfolio Outliers takes that observation seriously and builds a probabilistic model that computes P(U = 0), P(U ≤ 1), and P(U ≤ 2) for any portfolio composition, accounting for the fact that deals are not independent. Companies that share sectors, geographies, or founder types are exposed to common economic and technological shocks, and that dependence meaningfully changes the probability of joint success and joint failure.
The paper's central claim for quant VC is that the expected unicorn count alone does not adequately characterize a venture portfolio. For a 40-deal portfolio of identical 4% deals, the expected count is 1.6 unicorns whether or not the deals are correlated, yet the probability of returning zero unicorns rises from 19.6% under independence to 32.5% under the paper's full multi-factor correlation model. The expected outcome is the same; the fund-level risk is fundamentally different.
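The independence baseline can be sanity-checked in closed form: if n deals succeed independently with the same probability p, then P(U = 0) = (1 − p)^n. The sketch below is only that arithmetic; the function name is illustrative and not taken from the paper.

```python
def p_zero_independent(p: float, n: int) -> float:
    """P(U = 0) for n independent deals that each succeed with probability p."""
    return (1.0 - p) ** n


n_deals, p_deal = 40, 0.04
print(f"E[U] = {n_deals * p_deal:.1f}")
print(f"P(U = 0) under independence = {p_zero_independent(p_deal, n_deals):.3f}")
```

The result is roughly 0.195, in line with the independence figure quoted above. The correlated figure has no comparable one-line formula, which is why the paper needs a factor model and simulation.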
What is quant VC, and where does Portfolio Outliers fit?
Quant VC is the application of quantitative, reproducible, empirically validated methods to venture capital decision-making. It treats venture capital as a rare-event prediction problem that can be modeled, measured, and improved with the same rigor that quantitative finance brings to credit risk or that quantitative medicine brings to diagnostic screening. In practice, that means quantitative scoring against honest baselines, reproducible methodology, and interpretability that allows every prediction to be audited.
Portfolio Outliers sits in Vela's Portfolio Construction research thread, which is distinct from the LLM-based deal-screening work in Think-Reason-Learn and LLM-Augmented ML. Where those threads produce deal-level success probabilities, Portfolio Outliers consumes them: given a set of deal-level probabilities and the group affiliations of each deal, it computes the full distribution of unicorn counts at the fund level. This is the quant VC counterpart to credit portfolio risk modeling in banking, and the paper adapts frameworks from Vasicek (1987), Koyluoglu and Hickman (1998), and Gordy (2003) to the venture setting. Co-author Hasan Ugur Koyluoglu brings the original credit-risk lineage directly into the paper.
How does the latent-factor model work?
Each investment i is assigned a latent Gaussian return index A_i, standard normal by construction. The company becomes a unicorn if A_i exceeds a threshold c_i calibrated to the deal's standalone success probability p_i, that is, P(A_i > c_i) = p_i, or c_i = Φ⁻¹(1 − p_i) with Φ the standard normal CDF. The latent return is decomposed into common-factor exposures and an idiosyncratic shock, which preserves heterogeneous success probabilities at the deal level while introducing interpretable dependence at the portfolio level.
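A minimal sketch of the threshold mechanism for one deal follows. It assumes a single common factor with loading sqrt(rho), which is a simplification of the paper's multi-group setup. The names `unicorn_threshold` and `latent_return` and the values of `rho` and `p_i` are illustrative assumptions.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def unicorn_threshold(p: float) -> float:
    """Threshold c such that P(A > c) = p when A is standard normal."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def latent_return(common: float, idio: float, rho: float) -> float:
    """A = sqrt(rho) * common + sqrt(1 - rho) * idio, which keeps unit variance."""
    return math.sqrt(rho) * common + math.sqrt(1.0 - rho) * idio


rng = random.Random(0)
p_i, rho = 0.04, 0.12  # hypothetical deal probability and factor loading
c_i = unicorn_threshold(p_i)

trials = 200_000
hits = sum(
    latent_return(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0), rho) > c_i
    for _ in range(trials)
)
print(f"threshold = {c_i:.3f}, simulated success rate = {hits / trials:.4f}")
```

Because the latent return stays standard normal whatever the loading, the simulated success rate should sit close to p_i. The factor structure changes how deals move together, not how often any single deal succeeds.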
Factor loadings. Each company belongs to three group types: sector, geography, and founder type. The paper weights these exposures at S = 0.6, G = 0.3, and F = 0.1, reflecting the view that sectoral exposure dominates shared risk, followed by geography and founder type. These values are explicit modeling priors, not calibrated estimates, and can be stress-tested by varying them.
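One way to picture how the three weights combine is sketched below. It assumes the systematic part of a deal is the weighted sum S·X_sector + G·X_geography + F·X_founder of unit-variance group factors, rescaled to unit variance. This functional form and the correlation values are assumptions for illustration, since the summary here does not spell out the paper's exact parameterization.

```python
import math


def systematic_variance(weights, corr):
    """Variance of sum_k weights[k] * X_k for unit-variance factors with correlation matrix corr."""
    return sum(
        wa * wb * corr[a][b]
        for a, wa in enumerate(weights)
        for b, wb in enumerate(weights)
    )


weights = (0.6, 0.3, 0.1)  # S, G, F as stated in the text

# Hypothetical correlations among one deal's sector, geography, and founder-type factors.
deal_corr = [
    [1.0, 0.7, 0.5],
    [0.7, 1.0, 0.4],
    [0.5, 0.4, 1.0],
]

variance = systematic_variance(weights, deal_corr)
scale = 1.0 / math.sqrt(variance)  # rescales the weighted sum to unit variance
print(f"systematic variance = {variance:.4f}, normalizing scale = {scale:.4f}")

# Stress test: vary the prior weights and watch the systematic variance move.
for alt in [(0.6, 0.3, 0.1), (0.4, 0.4, 0.2), (1 / 3, 1 / 3, 1 / 3)]:
    print(alt, round(systematic_variance(alt, deal_corr), 4))
```

Since the weights sum to one, the variance reaches 1 only when the three group factors are perfectly correlated. Less-than-perfect correlation among a deal's own groups lowers it, and the normalizing scale compensates.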
Correlation matrix from public equity proxies. Outcome data for private venture-backed companies is not publicly available, so the paper estimates the correlation matrix Σ from monthly log returns of sector ETFs and baskets of publicly traded firms from January 2020 through December 2025. The implementation covers 11 groups: five sectors (AI, FinTech, healthcare, consumer, SaaS), four geographies (California, New York, Massachusetts, other US), and two founder types (first-time, repeat). Healthcare shows notably weak correlation with other groups, suggesting distinct economic drivers. AI and California show strong correlation, consistent with the geographic concentration of AI firms. First-time and repeat founders correlate strongly, since both are exposed to similar market dynamics.
Average pairwise correlation target. The overall scaling parameter w_0 is calibrated so that the average pairwise correlation across a representative universe of deals equals 0.12, a magnitude imported from standard credit risk modeling practice.
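A hedged sketch of the calibration follows. It assumes the latent return takes the form A_i = sqrt(w_0)·Y_i + sqrt(1 − w_0)·ε_i, with Y_i a unit-variance systematic component, so that corr(A_i, A_j) = w_0 · corr(Y_i, Y_j) for i ≠ j. Under that assumption w_0 is the target divided by the average systematic correlation. The paper's exact parameterization may differ, and the matrix below is hypothetical.

```python
from itertools import combinations

TARGET_AVG_CORR = 0.12  # average pairwise deal correlation stated in the text


def calibrate_w0(systematic_corr, target=TARGET_AVG_CORR):
    """Solve w0 * mean_{i<j} corr(Y_i, Y_j) = target for w0 (assumed model form)."""
    pairs = list(combinations(range(len(systematic_corr)), 2))
    mean_corr = sum(systematic_corr[i][j] for i, j in pairs) / len(pairs)
    w0 = target / mean_corr
    if not 0.0 < w0 <= 1.0:
        raise ValueError("target is not attainable with this systematic correlation")
    return w0


# Hypothetical systematic correlations among four representative deals.
systematic_corr = [
    [1.0, 0.8, 0.3, 0.5],
    [0.8, 1.0, 0.2, 0.6],
    [0.3, 0.2, 1.0, 0.1],
    [0.5, 0.6, 0.1, 1.0],
]

w0 = calibrate_w0(systematic_corr)
print(f"w0 = {w0:.4f}")
```

In a fuller model where w_0 enters nonlinearly, the same target could be hit with a one-dimensional root search such as bisection instead of this direct division.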
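Simulation. Portfolio outcomes are generated via Monte Carlo: correlated factor draws are produced using the Cholesky decomposition of Σ, each deal's latent return is compared with its threshold, and the resulting unicorn counts are tallied across draws to give the distribution for an arbitrary portfolio.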
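The simulation loop can be sketched in pure Python as below. Several pieces are assumptions made for illustration: the six-group toy Σ is built from made-up loadings so that it is positive definite by construction, `W0` is an arbitrary scaling value, and the way the S, G, F weights enter the latent return is one plausible reading, not a parameterization confirmed by the paper.

```python
import math
import random
from statistics import NormalDist

WEIGHTS = (0.6, 0.3, 0.1)  # S, G, F from the text
W0 = 0.3                   # hypothetical overall scaling parameter


def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                lower[i][j] = (matrix[i][j] - acc) / lower[j][j]
    return lower


def simulate_counts(deals, sigma, n_sims=10_000, seed=0):
    """deals: list of (p_i, (sector, geography, founder)) with indices into sigma."""
    rng = random.Random(seed)
    lower = cholesky(sigma)
    n_groups = len(sigma)
    prepared = []
    for p, groups in deals:
        # Variance of the weighted group-factor sum, used to rescale it to unit variance.
        var = sum(wa * wb * sigma[ga][gb]
                  for wa, ga in zip(WEIGHTS, groups)
                  for wb, gb in zip(WEIGHTS, groups))
        prepared.append((NormalDist().inv_cdf(1.0 - p), groups, 1.0 / math.sqrt(var)))
    sys_load, idio_load = math.sqrt(W0), math.sqrt(1.0 - W0)
    counts = []
    for _ in range(n_sims):
        z = [rng.gauss(0.0, 1.0) for _ in range(n_groups)]
        # Correlated group factors x = L z, so that cov(x) = sigma.
        x = [sum(lower[g][k] * z[k] for k in range(g + 1)) for g in range(n_groups)]
        unicorns = 0
        for threshold, groups, scale in prepared:
            y = scale * sum(w * x[g] for w, g in zip(WEIGHTS, groups))
            if sys_load * y + idio_load * rng.gauss(0.0, 1.0) > threshold:
                unicorns += 1
        counts.append(unicorns)
    return counts


# Toy sigma over 6 groups: AI, healthcare, California, New York, first-time, repeat.
loadings = [0.9, 0.3, 0.85, 0.7, 0.8, 0.8]  # made-up values
sigma = [[1.0 if i == j else bi * bj for j, bj in enumerate(loadings)]
         for i, bi in enumerate(loadings)]

# 40 hypothetical deals at 4% each, cycling through the group combinations.
deals = [(0.04, (i % 2, 2 + (i // 2) % 2, 4 + (i // 4) % 2)) for i in range(40)]
counts = simulate_counts(deals, sigma)
p_zero = counts.count(0) / len(counts)
mean_u = sum(counts) / len(counts)
print(f"E[U] ~ {mean_u:.2f}, P(U = 0) ~ {p_zero:.3f}")
```

The mean count should stay near the sum of the deal probabilities because every latent return remains standard normal, while P(U = 0) moves with Σ, the weights, and `W0`. Changing any of these and rerunning is the kind of stress test the model is designed to support.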
What does the model reveal about quant VC portfolios?
The paper stress-tests seven synthetic portfolios of 40 deals each, with deal-level probabilities drawn from Beta distributions calibrated to empirical first-time (mean 1.8%) and repeat (mean 2.6%) founder base rates. The findings reshape several common quant VC intuitions.
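Drawing deal-level probabilities from Beta distributions with the stated means can be sketched as below. The text gives only the means, so the `concentration` parameter (alpha + beta), which controls dispersion, is a hypothetical choice, as are the 20/20 split between founder types and the helper names.

```python
import random


def beta_params(mean, concentration):
    """Alpha and beta for a Beta distribution with the given mean and alpha + beta."""
    return mean * concentration, (1.0 - mean) * concentration


def draw_deal_probs(mean, n_deals, concentration, rng):
    """Sample n_deals success probabilities from Beta(alpha, beta)."""
    alpha, beta = beta_params(mean, concentration)
    return [rng.betavariate(alpha, beta) for _ in range(n_deals)]


rng = random.Random(42)
CONCENTRATION = 50.0  # hypothetical dispersion setting, not from the paper

first_time = draw_deal_probs(0.018, 20, CONCENTRATION, rng)  # mean 1.8%
repeat = draw_deal_probs(0.026, 20, CONCENTRATION, rng)      # mean 2.6%
portfolio_probs = first_time + repeat

print(f"deals = {len(portfolio_probs)}, E[U] for this draw = {sum(portfolio_probs):.2f}")
```

A smaller concentration spreads the draws more widely around the same mean, so a few deals carry most of the portfolio's expected unicorn count.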
Diversification reduces downside risk, at the cost of conditional upside. Portfolio B (concentrated: 100% AI, California, repeat founders) has P(U = 0) of 28.6% with E[U | U ≥ 1] of 2.58. Portfolio D (selective but diversified across sectors and geographies) has P(U = 0) of 28.0% with E[U | U ≥ 1] of 2.57. The downside and the conditional upside at the first threshold are nearly identical, but the spread widens at higher thresholds: E[U | U ≥ 3] is 4.46 for B versus 4.42 for D. This formalizes the tradeoff between reliability and magnitude: more concentrated portfolios realize larger clusters of success when they hit, while more diversified portfolios fail less often but hit smaller.
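The reported metrics are simple summaries of a simulated distribution of unicorn counts. The sketch below computes P(U ≤ k) and E[U | U ≥ k] from a list of counts. The counts are invented for the example and do not come from the paper.

```python
def prob_at_most(counts, k):
    """Empirical P(U <= k)."""
    return sum(1 for u in counts if u <= k) / len(counts)


def conditional_mean_at_least(counts, k):
    """Empirical E[U | U >= k]; NaN if no simulated outcome reaches k."""
    tail = [u for u in counts if u >= k]
    return sum(tail) / len(tail) if tail else float("nan")


# Hypothetical simulated unicorn counts for 100 runs of one portfolio.
counts = [0] * 29 + [1] * 30 + [2] * 20 + [3] * 12 + [4] * 6 + [5] * 3

print(f"P(U = 0)      = {prob_at_most(counts, 0):.2f}")
print(f"P(U <= 2)     = {prob_at_most(counts, 2):.2f}")
print(f"E[U | U >= 1] = {conditional_mean_at_least(counts, 1):.2f}")
print(f"E[U | U >= 3] = {conditional_mean_at_least(counts, 3):.2f}")
```

Comparing two portfolios on the first metric and on the conditional means at several thresholds is what separates reliability (how often the fund hits at all) from magnitude (how large the hit is when it comes).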
Equal-weight diversification is not automatically optimal. Portfolios E (equal-weight across four sectors), F (AI-heavy), and G (healthcare-heavy) have similar sector diversity metrics. Portfolio G shows the lowest P(U = 0) of the three (27.38% versus 27.45% and 27.53%), because healthcare is weakly correlated with the other sectors. Tilting toward weakly correlated groups can outperform equal-weight diversification even when the tilted group has lower standalone probabilities. Effective quant VC diversification requires knowing the correlation structure, not just counting buckets.
Correlation structurally limits the benefit of better deal-level screening. Doubling the per-deal success probability from 2% to 4% drops P(U = 0) from 44.5% to 19.6% under independence. Under the full correlation model, the same doubling drops P(U = 0) only from 54.6% to 33.4%. Precision improvements yield progressively smaller fund-level risk reductions as correlation increases. For a quant VC firm, this means deal-level precision and portfolio construction are complementary, not substitutes.
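The effect can be illustrated with a one-factor simplification in the style of the Vasicek model. Conditional on a common factor z, deals are independent with success probability Φ((sqrt(ρ)·z − c) / sqrt(1 − ρ)), where c = Φ⁻¹(1 − p), and P(U = 0) is the average of (1 − conditional probability)^n over z. This is not the paper's multi-factor model, so its output should only land in the neighborhood of the quoted figures. The value ρ = 0.12 is borrowed from the average-correlation target as an assumption.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def p_zero_one_factor(p, n, rho, grid=4001, z_max=8.0):
    """P(U = 0) for n identical deals under a single common Gaussian factor.

    Integrates (1 - p(z))^n against the standard normal density with the
    trapezoid rule on [-z_max, z_max]. Requires 0 <= rho < 1.
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - p)
    step = 2.0 * z_max / (grid - 1)
    total = 0.0
    for k in range(grid):
        z = -z_max + k * step
        cond_p = STD_NORMAL.cdf((rho ** 0.5 * z - threshold) / (1.0 - rho) ** 0.5)
        weight = 0.5 if k in (0, grid - 1) else 1.0
        total += weight * (1.0 - cond_p) ** n * STD_NORMAL.pdf(z)
    return total * step


for rho in (0.0, 0.12):
    low = p_zero_one_factor(0.02, 40, rho)
    high = p_zero_one_factor(0.04, 40, rho)
    print(f"rho = {rho:.2f}: P(U=0) at 2% = {low:.3f}, at 4% = {high:.3f}, "
          f"reduction = {low - high:.3f}")
```

With ρ = 0 the function reduces to the independence formula (1 − p)^n. With positive ρ, both probabilities are higher and the reduction from doubling p is smaller, which is the qualitative pattern described above.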
Portfolio size amplifies expected outcomes more than it reduces risk. Holding composition fixed, scaling from 5 to 40 investments grows E[U] approximately linearly, but reduces P(U = 0) with diminishing returns. Even large quant VC portfolios remain vulnerable to common shocks that can produce joint failure across deals. Size is not a substitute for dependence-aware construction.
Why correlation-aware portfolio construction matters for quant VC
A fund's risk profile is not just the aggregation of its deal-level probabilities. Two portfolios with the same expected unicorn count can have materially different probabilities of producing zero unicorns, and that gap is driven entirely by the correlation structure of the underlying deals.
This matters for LP-facing quant VC communication. A fund that says “we expect 1.6 unicorns” is describing a mean. A fund that says “we expect 1.6 unicorns with P(zero unicorns) under 30%” is describing both the mean and the fund-level tail risk. Portfolio Outliers is the machinery that turns the second statement into something auditable.
What makes Portfolio Outliers auditable for quant VC decisions
Every output of the model traces back to three interpretable inputs: deal-level success probabilities, the group affiliations of each deal, and the empirically estimated correlation matrix Σ. A quant VC partner can inspect Σ directly (the paper visualizes it), change the factor weights S, G, F, or substitute in a different correlation estimate, and rerun the Monte Carlo to see how P(U = 0) moves. Nothing is buried in an attention map or a neural net. The entire risk calculation is a Cholesky decomposition and a set of threshold exceedances, both standard and inspectable.
How Portfolio Outliers fits into Vela's quant VC research program
Portfolio Outliers is a foundational paper in Vela's Portfolio Construction research thread. It complements the deal-screening work in two other threads:
- The Think-Reason-Learn family (GPTree, Random Rule Forest, Reasoned Rule Mining, Policy Induction) produces interpretable deal-level success probabilities using LLM-native ML.
- The LLM-Augmented ML thread (LLM-AR, GPT-HTree, Rare-event prediction, Verifiable Reasoning, Learning What to Ask) uses LLMs as components inside classical ML screening pipelines.
Portfolio Outliers takes the output of those pipelines as input and answers the fund-level question: given these deal-level probabilities and their group affiliations, what is the distribution of unicorns in the portfolio? The VCBench benchmark covers the deal-selection side of the stack; Portfolio Outliers covers the composition side. A companion paper with Hasan Ugur Koyluoglu as co-author, “Deal-Level Correlation and Portfolio Construction,” extends the correlation modeling in related directions.
Limitations
The paper is explicit about several limitations:
- The model assumes each company belongs to a single sector, whereas many real companies span multiple sectors; allowing multi-sector affiliations would be more realistic but increases complexity.
- The correlation matrix is estimated from public equity proxies rather than private market valuation data, which the authors acknowledge as an imperfect substitute.
- Correlations are assumed static over time, whereas venture markets are cyclical with alternating systemic and idiosyncratic periods.
- The latent factors are assumed Gaussian, which may underestimate the probability of joint extreme clustering given the heavy tails of venture outcomes; extensions to Student-t or other heavy-tailed distributions are a natural next step.
- The factor weights S = 0.6, G = 0.3, F = 0.1 are modeling priors, not calibrated estimates, and the analysis largely abstracts from interactions between sector, geography, and founder type at the individual deal level.
Read the paper
Probabilistic Modeling of Venture Capital Portfolio Outliers.
Kensei Sakamoto, Hasan Ugur Koyluoglu, Fuat Alican, Yigit Ihlamur.
arXiv preprint arXiv:2602.07761, February 2026.
Read on arXiv.
Portfolio Outliers is the foundational paper in Vela's quant VC portfolio construction research. For the complementary deal-screening research program, see Think-Reason-Learn and the LLM-Augmented ML thread (LLM-AR, GPT-HTree, Rare-event prediction, Verifiable Reasoning, Learning What to Ask). For empirical benchmarking of Vela's screening models, see VCBench.
Authored by members of the Vela team. See the full roster of contributors.
For research collaboration on quant VC portfolio construction, correlation modeling, LP-facing fund risk analysis, or credit-risk-inspired methods for venture, email engage@vela.partners.