Experiments on Image Datasets

8.1 Experiments on Image Benchmark Datasets

For the PR-Reducing flows, the final scale ratio between preserved and shrunken dimensions at a finite integration time depends on the quantity \( e^{\rho(g_* - g_i)T} \). For a fixed end integration time \( T \) and rate \( \rho \), this scaling is therefore dictated by \( g_* - g_i \), which we call the “inflation gap” (IG, Appendix B.2). As the inflation gap increases, compressed dimensions are shrunk to a greater extent, and the denoising networks must amortize score estimation over a wider range of noise scales, which is a harder learning problem. For our proposed model, compression should thus be understood in terms of both the number of preserved dimensions and the size of the inflation gap.
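To get a feel for how quickly this scale ratio grows with the inflation gap, the short sketch below evaluates \( e^{\rho(g_* - g_i)T} \) for the IG values used in this section. The values of `rho` and `T` are hypothetical placeholders chosen only for illustration; they are not the settings used in our experiments.

```python
import math


def scale_ratio(rho: float, inflation_gap: float, T: float) -> float:
    """Evaluate exp(rho * (g_* - g_i) * T), with inflation_gap = g_* - g_i."""
    return math.exp(rho * inflation_gap * T)


# Hypothetical values for rho and T (assumptions, not the paper's settings).
rho, T = 1.0, 10.0

for ig in (1.02, 1.10, 1.25, 1.35, 1.50):
    print(f"IG = {ig:.2f} -> scale ratio = {scale_ratio(rho, ig, T):.3e}")
```

Because the gap enters through an exponent, even small increases in IG at fixed `rho` and `T` widen the range of scales substantially.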

To assess how these two factors affect model performance, we performed two sets of experiments on two benchmark image datasets (CIFAR-10 [Krizhevsky, 2009] and AFHQv2 [Choi et al., 2020]; Appendix B.4.2). In the first set of experiments, we fixed \( T \), \( \rho \), and the inflation gap (\( \text{IG} = 1.02 \)) and varied only the number of preserved dimensions \( d \), from \( d=1 \) (compression to \( \approx 0.03\% \)) to \( d=3072 \) (no compression), for both datasets (see Tables 1, 2 below; values are mean \( \pm 2 \sigma \) over 3 sets of seeds, each with 50K samples for FID scores or 10K samples for round-trip MSE).
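As a small worked example of the bookkeeping above, the sketch below computes the percentage of dimensions preserved for each \( d \) and formats a mean \( \pm 2\sigma \) summary over seeds. The helper names are illustrative, the per-seed values are made-up toy numbers (not our results), and the use of the sample standard deviation is an assumption.

```python
import statistics

TOTAL_DIMS = 3072  # full dimensionality stated in the text (d = 3072, no compression)


def preserved_percent(d: int, total: int = TOTAL_DIMS) -> float:
    """Percentage of the total dimensions that are preserved."""
    return 100.0 * d / total


def mean_pm_2sigma(values):
    """Return (mean, 2 * sigma); sigma is the sample std. dev. (assumption)."""
    return statistics.fmean(values), 2.0 * statistics.stdev(values)


for d in (1, 2, 30, 62, 307, 615, 1536, 3041, 3072):
    print(f"d = {d:4d} -> {preserved_percent(d):7.3f}% of dimensions preserved")

# Toy per-seed scores, purely illustrative (NOT values from our experiments).
toy_scores = [1.0, 2.0, 3.0]
mean, two_sigma = mean_pm_2sigma(toy_scores)
print(f"{mean:.2f} ± {two_sigma:.2f}")
```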

Table 1: FID and Round-Trip MSE for AFHQv2 at Constant Inflation Gap (IG = 1.02)

Dimensions (d)	FID	Round-Trip MSE
1	12.65 ± 0.07	1.47 ± 0.07
2	11.95 ± 0.06	1.55 ± 0.21
30	13.64 ± 0.02	3.79 ± 0.13
62	14.05 ± 0.18	5.32 ± 0.18
307	15.64 ± 0.10	3.33 ± 0.13
615	14.63 ± 0.07	2.42 ± 0.18
1536	13.36 ± 0.12	0.14 ± 0.03
3041	13.97 ± 0.13	0.28 ± 0.06
3072	11.90 ± 0.08	0.38 ± 0.04

Table 2: FID and Round-Trip MSE for CIFAR-10 at Constant Inflation Gap (IG = 1.02)

Dimensions (d)	FID	Round-Trip MSE
1	20.76 ± 0.09	1.07 ± 0.10
2	21.29 ± 0.04	0.81 ± 0.11
30	23.36 ± 0.14	2.21 ± 0.08
62	23.30 ± 0.19	2.27 ± 0.24
307	28.07 ± 0.13	0.71 ± 0.02
615	24.49 ± 0.27	0.29 ± 0.03
1536	17.44 ± 0.16	0.16 ± 0.06
3041	16.60 ± 0.05	0.30 ± 0.02
3072	17.01 ± 0.10	0.22 ± 0.03

For the second set of experiments, we used the AFHQv2 dataset and fixed \( T \), \( \rho \), and \( d=2 \) while varying the inflation gap (\( \text{IG} = 1.10, 1.25, 1.35, 1.50 \); see Table 3 below, same setup as before).

Table 3: FID and Round-Trip MSE for AFHQv2 at Varying Inflation Gaps (IGs)

Dimensions (d)	IG	FID	Round-Trip MSE
2	1.02	11.95 ± 0.06	1.55 ± 0.21
2	1.10	13.98 ± 0.13	1.35 ± 0.08
2	1.25	17.84 ± 0.15	1.65 ± 0.09
2	1.35	34.68 ± 0.37	1.19 ± 0.18
2	1.50	107.64 ± 0.43	0.11 ± 0.02

Finally, we also compared the generative performance of our inflationary flows (IFs) model on CIFAR-10 against three existing injective flow baselines (Appendix B.5.2): M-Flows (Brehmer & Cranmer, 2020), Rectangular Flows (RFs; Caterini et al., 2021), and Canonical Manifold Flows (CMFs; Flouris & Konukoglu, 2023), for different numbers of preserved dimensions (\( d=30, 40, 62 \)). Table 4 below reports the best FID score (out of 3 independently generated sets of images, each with 10K samples) for each experiment. For these comparisons, we fixed \( \text{IG}=1.02 \) when training our networks for each value of \( d \).

Table 4: FID Score Comparison with Injective Flows for CIFAR-10

Dimensions (d)	IFs (IG = 1.02)	M-Flows	RFs	CMFs
30	23.3	541.2	544.0	532.6
40	24.3	535.7	481.3	444.6
62	23.2	280.9	280.8	287.9

As a general trend, increasing the number of preserved dimensions at a constant inflation gap led to improvements in generative quality (lower FID scores) and reduced MSE (Tables 1, 2). However, some of the schedules we assessed are not entirely consistent with this trend. We hypothesize that this is at least partially due to variance arising from different network initializations for each schedule, as well as differences between the two datasets explored here. As expected, increasing the inflation gap while holding the number of preserved dimensions fixed worsened generative performance (higher FID scores, Table 3). Finally, in terms of generative quality (FID), our model provides substantial gains over existing injective flow baselines (Table 4).
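To make the Table 4 comparison concrete, the sketch below computes, for each \( d \), how many times larger the best (lowest) baseline FID is than the IFs FID. The FID values are copied from Table 4; the dictionary layout and variable names are illustrative choices.

```python
# FID scores copied from Table 4 (CIFAR-10); lower is better.
table4 = {
    30: {"IFs": 23.3, "M-Flows": 541.2, "RFs": 544.0, "CMFs": 532.6},
    40: {"IFs": 24.3, "M-Flows": 535.7, "RFs": 481.3, "CMFs": 444.6},
    62: {"IFs": 23.2, "M-Flows": 280.9, "RFs": 280.8, "CMFs": 287.9},
}

ratios = {}
for d, scores in table4.items():
    # Pick the strongest baseline (lowest FID) among the non-IF models.
    baselines = {name: fid for name, fid in scores.items() if name != "IFs"}
    best_name = min(baselines, key=baselines.get)
    ratios[d] = baselines[best_name] / scores["IFs"]
    print(f"d = {d}: best baseline = {best_name} "
          f"(FID {baselines[best_name]}), ratio to IFs = {ratios[d]:.1f}x")
```

Note that these ratios compare reported best-of-3 FID scores only and say nothing about round-trip MSE.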

8.2 Animations for Image Dataset Experiments

To see animations for sample generation (FID) and round-trip (MSE) experiments under select schedules, please check the links below!

AFHQv2, Constant Inflation Gap (IG = 1.02) Experiments

CIFAR-10, Constant Inflation Gap (IG = 1.02) Experiments

AFHQv2, Varying Inflation Gaps (IG = 1.10, 1.25, 1.35, 1.50) Experiments