<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://phzwart.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://phzwart.github.io/" rel="alternate" type="text/html" /><updated>2026-02-02T22:41:37-08:00</updated><id>https://phzwart.github.io/feed.xml</id><title type="html">Peter H. Zwart</title><subtitle>Computational Biophysicist &amp; Machine Learning Researcher</subtitle><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><entry><title type="html">Future Blog Post</title><link href="https://phzwart.github.io/posts/2012/08/blog-post-4/" rel="alternate" type="text/html" title="Future Blog Post" /><published>2199-01-01T00:00:00-08:00</published><updated>2199-01-01T00:00:00-08:00</updated><id>https://phzwart.github.io/posts/2012/08/future-post</id><content type="html" xml:base="https://phzwart.github.io/posts/2012/08/blog-post-4/"><![CDATA[<p>This post will show up by default. To disable scheduling of future posts, edit <code class="language-plaintext highlighter-rouge">config.yml</code> and set <code class="language-plaintext highlighter-rouge">future: false</code>.</p>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="cool posts" /><category term="category1" /><category term="category2" /><summary type="html"><![CDATA[This post will show up by default. 
To disable scheduling of future posts, edit config.yml and set future: false.]]></summary></entry><entry><title type="html">Kumaraswamy Distributions for Conformal Prediction</title><link href="https://phzwart.github.io/posts/2025/02/kumaraswamy-conformal/" rel="alternate" type="text/html" title="Kumaraswamy Distributions for Conformal Prediction" /><published>2025-02-03T00:00:00-08:00</published><updated>2025-02-03T00:00:00-08:00</updated><id>https://phzwart.github.io/posts/2025/02/conformal-kumaraswamy</id><content type="html" xml:base="https://phzwart.github.io/posts/2025/02/kumaraswamy-conformal/"><![CDATA[<p>Exploring the analytical advantages of Kumaraswamy mixture models for conformal prediction with closed-form region statistics.</p>

<h2 id="introduction">Introduction</h2>

<p>When working with conformal prediction for binary classification, we often need to compute region masses and posteriors. The Kumaraswamy distribution offers a compelling alternative to Beta distributions because of its <strong>closed-form CDF</strong>, eliminating the need for incomplete beta functions.</p>

<h2 id="the-kumaraswamy-distribution">The Kumaraswamy Distribution</h2>

<p>The Kumaraswamy distribution on $[0,1]$ has PDF and CDF:</p>

\[f(x; a, b) = abx^{a-1}(1-x^a)^{b-1}\]

\[F(x; a, b) = 1 - (1-x^a)^b\]

<p>The closed-form CDF is the key advantage! For Beta distributions, we’d need:</p>

\[F_{\text{Beta}}(x; \alpha, \beta) = I_x(\alpha, \beta) \quad \text{(incomplete beta function)}\]
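<p>As a quick sanity check, both formulas above can be coded directly in plain Python with no special-function library. This is a minimal sketch; the helper names <code>kuma_pdf</code> and <code>kuma_cdf</code> are mine, not from any package:</p>

```python
def kuma_pdf(x, a, b):
    """Kumaraswamy density f(x; a, b) = a*b*x^(a-1)*(1-x^a)^(b-1)."""
    return a * b * x ** (a - 1) * (1 - x ** a) ** (b - 1)

def kuma_cdf(x, a, b):
    """Closed-form CDF F(x; a, b) = 1 - (1 - x^a)^b -- no incomplete beta needed."""
    return 1 - (1 - x ** a) ** b

# Sanity check: the closed-form CDF should match a midpoint Riemann sum of the PDF.
a, b = 2.0, 3.0
n = 100_000
riemann = sum(kuma_pdf((i + 0.5) / n, a, b) for i in range(n // 2)) / n  # integral over [0, 0.5]
print(kuma_cdf(0.5, a, b))                         # 0.578125
print(abs(riemann - kuma_cdf(0.5, a, b)) < 1e-4)   # True
```

<p>Note that everything reduces to powers and subtractions, which is exactly what makes the region statistics below cheap to evaluate.</p>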

<h2 id="conformal-prediction-setup">Conformal Prediction Setup</h2>

<p>Consider a binary classification problem where we use label-conditional conformal prediction. We have:</p>

<ul>
  <li>Class-conditional distributions: $f(x \mid Y=0) \sim \text{Kumaraswamy}(a_0, b_0)$ and $f(x \mid Y=1) \sim \text{Kumaraswamy}(a_1, b_1)$</li>
  <li>Prior: $c = P(Y=0)$</li>
  <li>Miscoverage levels: $\alpha_0, \alpha_1$</li>
</ul>

<p>The conformal cutpoints are computed using the quantiles:</p>

\[L = F_0^{-1}(\alpha_1) = \left(1-(1-\alpha_1)^{1/b_0}\right)^{1/a_0}\]

\[U = F_1^{-1}(1-\alpha_0) = \left(1-\alpha_0^{1/b_1}\right)^{1/a_1}\]
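<p>Because the CDF inverts in closed form, both cutpoints are one-liners. The parameter values below are illustrative placeholders, not taken from any fitted model:</p>

```python
def kuma_cdf(x, a, b):
    """Closed-form Kumaraswamy CDF: F(x; a, b) = 1 - (1 - x^a)^b."""
    return 1 - (1 - x ** a) ** b

def kuma_ppf(q, a, b):
    """Inverse CDF, also closed form: F^{-1}(q) = (1 - (1 - q)^(1/b))^(1/a)."""
    return (1 - (1 - q) ** (1 / b)) ** (1 / a)

# Illustrative shapes: class 0 concentrated near 0, class 1 near 1.
a0, b0 = 1.0, 4.0
a1, b1 = 4.0, 1.0
alpha0, alpha1 = 0.1, 0.1

L = kuma_ppf(alpha1, a0, b0)       # L = F0^{-1}(alpha1)
U = kuma_ppf(1 - alpha0, a1, b1)   # U = F1^{-1}(1 - alpha0)

# Round-trip: pushing the cutpoints back through the CDFs recovers the levels.
assert abs(kuma_cdf(L, a0, b0) - alpha1) < 1e-12
assert abs(kuma_cdf(U, a1, b1) - (1 - alpha0)) < 1e-12
```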

<h2 id="region-statistics">Region Statistics</h2>

<p>We partition $[0,1]$ into three regions based on $r_- = \min(L,U)$ and $r_+ = \max(L,U)$:</p>

<ul>
  <li>$R_0 = [0, r_-)$: Predict class 0</li>
  <li>$R_M = [r_-, r_+)$: Hedge set $\{0,1\}$ (if $L &lt; U$) or abstain $\emptyset$ (if $L &gt; U$)</li>
  <li>$R_1 = [r_+, 1]$: Predict class 1</li>
</ul>
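<p>The decision rule above is only a few lines of code. This sketch uses a hypothetical <code>predict_set</code> helper and assumes the cutpoints $L$ and $U$ have already been computed:</p>

```python
def predict_set(x, L, U):
    """Map a score x in [0,1] to a conformal prediction set given cutpoints L, U."""
    r_minus, r_plus = min(L, U), max(L, U)
    if x < r_minus:
        return {0}            # R_0: confidently class 0
    if x >= r_plus:
        return {1}            # R_1: confidently class 1
    # Middle region R_M: hedge when L < U, abstain when the cutpoints cross.
    return {0, 1} if L < U else set()

print(predict_set(0.1, 0.3, 0.6))   # {0}
print(predict_set(0.5, 0.3, 0.6))   # {0, 1}  (hedge: L < U)
print(predict_set(0.5, 0.6, 0.3))   # set()   (abstain: L > U)
```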

<p>The <strong>region mass</strong> (probability that $x$ falls in region $[a,b]$):</p>

\[m([a,b]) = c[F_0(b) - F_0(a)] + (1-c)[F_1(b) - F_1(a)]\]

<p>The <strong>in-region rate</strong> (posterior probability of class 1 given $x \in [a,b]$):</p>

\[\bar{p}([a,b]) = \frac{(1-c)[F_1(b) - F_1(a)]}{m([a,b])}\]
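<p>Both statistics reduce to CDF differences. A minimal sketch, with an illustrative configuration and hand-picked region boundaries (none of these numbers come from the post):</p>

```python
def kuma_cdf(x, a, b):
    """Closed-form Kumaraswamy CDF: F(x; a, b) = 1 - (1 - x^a)^b."""
    return 1 - (1 - x ** a) ** b

def region_mass(lo, hi, c, p0, p1):
    """m([lo,hi]) = c*[F0(hi)-F0(lo)] + (1-c)*[F1(hi)-F1(lo)]."""
    F0 = lambda x: kuma_cdf(x, *p0)
    F1 = lambda x: kuma_cdf(x, *p1)
    return c * (F0(hi) - F0(lo)) + (1 - c) * (F1(hi) - F1(lo))

def in_region_rate(lo, hi, c, p0, p1):
    """Posterior P(Y=1 | x in [lo,hi])."""
    F1 = lambda x: kuma_cdf(x, *p1)
    return (1 - c) * (F1(hi) - F1(lo)) / region_mass(lo, hi, c, p0, p1)

# Hypothetical configuration; r_minus, r_plus chosen by hand for illustration.
p0, p1, c = (1.0, 4.0), (4.0, 1.0), 0.5
r_minus, r_plus = 0.2, 0.7
masses = [region_mass(lo, hi, c, p0, p1)
          for lo, hi in [(0.0, r_minus), (r_minus, r_plus), (r_plus, 1.0)]]
assert abs(sum(masses) - 1.0) < 1e-12   # the three regions partition [0,1]
```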

<p>All computed in <strong>closed form</strong> with simple algebra!</p>

<h2 id="information-theory">Information Theory</h2>

<p>We can measure the uncertainty of predictions using binary entropy:</p>

\[H(p) = -p \log_2(p) - (1-p) \log_2(1-p)\]
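<p>In code, with the usual convention that $H(0) = H(1) = 0$ (this small helper is mine, for illustration):</p>

```python
import math

def binary_entropy(p):
    """Binary entropy H(p) in bits; H(0) = H(1) = 0 by convention (0 log 0 := 0)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(binary_entropy(0.5))    # 1.0 -- maximal uncertainty
print(binary_entropy(0.99))   # ~0.08 -- a confident region carries little entropy
```

<p>Evaluating this at the in-region rate of each region quantifies how informative that region's prediction actually is, independent of the coverage guarantee.</p>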

<p>This reveals an important insight: <strong>validity ≠ optimality</strong>.</p>

<ul>
  <li>Conformal sets with coverage guarantees can still have high entropy (uncertain predictions)</li>
  <li>Some regions may have low entropy despite being hedge sets</li>
</ul>

<h2 id="interactive-exploration">Interactive Exploration</h2>

<p>I’ve created an <a href="/tools/">interactive tool</a> where you can explore these concepts by adjusting:</p>

<ul>
  <li>Shape parameters $(a_0, b_0, a_1, b_1)$</li>
  <li>Prior probability $c$</li>
  <li>Miscoverage levels $(\alpha_0, \alpha_1)$</li>
</ul>

<p>Try it out to see how different configurations affect the region partitioning and posterior distributions!</p>

<h2 id="key-takeaways">Key Takeaways</h2>

<ol>
  <li><strong>Analytical tractability</strong>: Kumaraswamy distributions give closed-form calculations</li>
  <li><strong>No numerical integration</strong>: Everything computes exactly with elementary operations</li>
  <li><strong>Interpretable regions</strong>: Clear partitioning of feature space with meaningful posteriors</li>
  <li><strong>Coverage vs. certainty</strong>: Conformal guarantees ensure coverage but don’t guarantee low-entropy predictions</li>
</ol>

<h2 id="references">References</h2>

<p>Zwart, P.H. (2025). “Probabilistic Conformal Coverage Guarantees in Small-Data Settings.” arXiv:2509.15349</p>

<hr />

<p><em>This post accompanies my research on conformal prediction and uncertainty quantification. See the <a href="/tools/">interactive tool</a> for hands-on exploration.</em></p>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="conformal-prediction" /><category term="uncertainty-quantification" /><category term="machine-learning" /><summary type="html"><![CDATA[Exploring the analytical advantages of Kumaraswamy mixture models for conformal prediction with closed-form region statistics.]]></summary></entry><entry><title type="html">Blog Post number 4</title><link href="https://phzwart.github.io/posts/2012/08/blog-post-4/" rel="alternate" type="text/html" title="Blog Post number 4" /><published>2015-08-14T00:00:00-07:00</published><updated>2015-08-14T00:00:00-07:00</updated><id>https://phzwart.github.io/posts/2012/08/blog-post-4</id><content type="html" xml:base="https://phzwart.github.io/posts/2012/08/blog-post-4/"><![CDATA[<p>This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.</p>

<h1 id="headings-are-cool">Headings are cool</h1>

<h1 id="you-can-have-many-headings">You can have many headings</h1>

<h2 id="arent-headings-cool">Aren’t headings cool?</h2>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="cool posts" /><category term="category1" /><category term="category2" /><summary type="html"><![CDATA[This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.]]></summary></entry><entry><title type="html">Blog Post number 3</title><link href="https://phzwart.github.io/posts/2014/08/blog-post-3/" rel="alternate" type="text/html" title="Blog Post number 3" /><published>2014-08-14T00:00:00-07:00</published><updated>2014-08-14T00:00:00-07:00</updated><id>https://phzwart.github.io/posts/2014/08/blog-post-3</id><content type="html" xml:base="https://phzwart.github.io/posts/2014/08/blog-post-3/"><![CDATA[<p>This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.</p>

<h1 id="headings-are-cool">Headings are cool</h1>

<h1 id="you-can-have-many-headings">You can have many headings</h1>

<h2 id="arent-headings-cool">Aren’t headings cool?</h2>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="cool posts" /><category term="category1" /><category term="category2" /><summary type="html"><![CDATA[This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.]]></summary></entry><entry><title type="html">Blog Post number 2</title><link href="https://phzwart.github.io/posts/2013/08/blog-post-2/" rel="alternate" type="text/html" title="Blog Post number 2" /><published>2013-08-14T00:00:00-07:00</published><updated>2013-08-14T00:00:00-07:00</updated><id>https://phzwart.github.io/posts/2013/08/blog-post-2</id><content type="html" xml:base="https://phzwart.github.io/posts/2013/08/blog-post-2/"><![CDATA[<p>This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.</p>

<h1 id="headings-are-cool">Headings are cool</h1>

<h1 id="you-can-have-many-headings">You can have many headings</h1>

<h2 id="arent-headings-cool">Aren’t headings cool?</h2>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="cool posts" /><category term="category1" /><category term="category2" /><summary type="html"><![CDATA[This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.]]></summary></entry><entry><title type="html">Blog Post number 1</title><link href="https://phzwart.github.io/posts/2012/08/blog-post-1/" rel="alternate" type="text/html" title="Blog Post number 1" /><published>2012-08-14T00:00:00-07:00</published><updated>2012-08-14T00:00:00-07:00</updated><id>https://phzwart.github.io/posts/2012/08/blog-post-1</id><content type="html" xml:base="https://phzwart.github.io/posts/2012/08/blog-post-1/"><![CDATA[<p>This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.</p>

<h1 id="headings-are-cool">Headings are cool</h1>

<h1 id="you-can-have-many-headings">You can have many headings</h1>

<h2 id="arent-headings-cool">Aren’t headings cool?</h2>]]></content><author><name>Petrus H. Zwart, PhD</name><email>phzwart@lbl.gov</email></author><category term="cool posts" /><category term="category1" /><category term="category2" /><summary type="html"><![CDATA[This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.]]></summary></entry></feed>