Information Theory for Machine Learning: Theorems, Proofs, and Python Implementations (Computational Mathematics Library)
Format: Paperback
Availability: In stock
Weight: 0.87 kg
Condition: New
Sold by: Amazon
Ships from: USA
The complete graduate-level reference for entropy, divergence, and mutual information in modern machine learning, rigorously developed from measure theory to contemporary estimators and algorithms. Topics covered:

- Measure-theoretic foundations: sigma-algebras, Radon-Nikodym, conditional expectation, change of measure.
- Core measures: entropy, cross-entropy, KL divergence, mutual information; f-divergences and Rényi divergences with variational dualities (Fenchel, Donsker-Varadhan). (A minimal Python sketch follows this list.)
- Data processing and fundamental inequalities: log-sum, Pinsker, Csiszár-Kullback-Pinsker, Fano, Le Cam, Assouad; equality conditions and sufficiency. (Pinsker is checked numerically below.)
- Gaussian tools: entropy power inequality, de Bruijn identity, Fisher information, I-MMSE, Gaussian extremality. (Gaussian differential entropy is sketched below.)
- Maximum entropy and exponential families: log-partition convexity, Bregman geometry, Pythagorean theorems.
- Fisher information and asymptotics: score, Cramér-Rao bounds, LAN, Bernstein-von Mises, asymptotic efficiency.
- Information geometry and natural gradients: Fisher-Rao metric, dual connections, mirror descent.
- Source coding and MDL: Kraft-McMillan, NML, universal coding, compression-generalization links.
- Generalization: PAC-Bayes bounds, mutual information bounds I(W;S), stability of SGD.
- Concentration via information: DV method, log-Sobolev and Poincaré inequalities, transportation T1/T2, hypercontractivity.
- Variational inference and divergence minimization: ELBO, alpha-divergences, EP, black-box VI with reparameterization.
- Estimating entropy and MI: plug-in, kNN, KDE, Kraskov, MINE, InfoNCE; minimax rates and consistency. (A Kraskov-style estimator is sketched below.)
- Rate-distortion and information bottleneck: Blahut-Arimoto, optimal encoders, sufficiency-compression trade-offs. (Blahut-Arimoto is sketched below.)
- Contrastive representation learning under augmentations: alignment vs. uniformity, identifiability, sample complexity.
- Generative modeling: VAEs, bits-back coding, beta-VAE, TCVAE; likelihood calibration and posterior collapse.
- Score matching and Stein: Fisher divergence, kernel Stein discrepancies; diffusion models as score-based SDEs with likelihood estimation.
- Optimal transport with entropic regularization: Kantorovich duality, Sinkhorn, Schrödinger bridges; OT vs. f-divergence objectives. (Sinkhorn is sketched below.)
- Distributed and federated learning under communication limits: quantization, gradient coding, lower bounds via information.
- Privacy and leakage: differential privacy, Rényi DP, moments accountant; accuracy-privacy trade-offs and inference risks.
- Active learning and Bayesian experimental design: expected information gain, submodularity, scalable estimators.
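To give a flavor of the Python implementations the title promises, here is a minimal sketch of the core measures for discrete distributions, assuming only NumPy; the function names are illustrative, not the book's API. It uses the identity I(X;Y) = KL(P_XY || P_X ⊗ P_Y).

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) in nats, with 0 log 0 := 0."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def kl(p, q):
    """KL(p || q) in nats; assumes supp(p) is contained in supp(q)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    nz = p > 0
    return np.sum(p[nz] * np.log(p[nz] / q[nz]))

def mutual_information(joint):
    """I(X;Y) = KL(P_XY || P_X x P_Y) for a discrete joint pmf (2-D array)."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    return kl(joint.ravel(), (px * py).ravel())

joint = np.array([[0.3, 0.1],
                  [0.1, 0.5]])
print(entropy(joint.sum(axis=1)))  # H(X)
print(mutual_information(joint))   # I(X;Y), always >= 0
```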
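Pinsker's inequality from the inequalities item, TV(P,Q) <= sqrt(KL(P||Q)/2) in nats, can be checked numerically on random distributions; a minimal sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(5):
    p = rng.dirichlet(np.ones(10))          # random pmf on 10 symbols
    q = rng.dirichlet(np.ones(10))
    kl_pq = np.sum(p * np.log(p / q))       # KL(p || q) in nats
    tv = 0.5 * np.sum(np.abs(p - q))        # total variation distance
    assert tv <= np.sqrt(kl_pq / 2)         # Pinsker's inequality
    print(f"TV = {tv:.4f} <= sqrt(KL/2) = {np.sqrt(kl_pq / 2):.4f}")
```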
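From the Gaussian-tools item: the differential entropy of N(0, Σ) has the closed form h = (1/2) log det(2πeΣ), which a Monte Carlo estimate of -E[log p(X)] should reproduce. A sketch assuming NumPy and a recent SciPy (one that accepts a Generator for random_state):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(1)
d = 3
A = rng.normal(size=(d, d))
Sigma = A @ A.T + d * np.eye(d)   # random symmetric positive-definite covariance

# closed form: h(N(0, Sigma)) = 0.5 * log det(2 pi e Sigma), in nats
_, logdet = np.linalg.slogdet(Sigma)
h_exact = 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

# Monte Carlo: h = -E[log p(X)] with X ~ N(0, Sigma)
mvn = multivariate_normal(mean=np.zeros(d), cov=Sigma)
x = mvn.rvs(size=200_000, random_state=rng)
h_mc = -mvn.logpdf(x).mean()

print(h_exact, h_mc)              # agree to roughly two decimals
```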
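From the estimation item: the Kraskov (KSG) estimator computes I(X;Y) from k-nearest-neighbor statistics as ψ(k) + ψ(N) − ⟨ψ(n_x+1) + ψ(n_y+1)⟩. A minimal sketch of algorithm 1 of Kraskov et al., assuming NumPy and SciPy, with the strict-inequality counts handled by shrinking the radius slightly; `ksg_mi` is an illustrative name, not the book's API:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma

def ksg_mi(x, y, k=5):
    """KSG estimate of I(X;Y) in nats; x is (n, dx), y is (n, dy)."""
    n = len(x)
    xy = np.hstack([x, y])
    # distance from each point to its k-th nearest neighbour in joint space (max-norm);
    # column 0 of query() is the point itself at distance 0
    eps = cKDTree(xy).query(xy, k=k + 1, p=np.inf)[0][:, -1]
    tx, ty = cKDTree(x), cKDTree(y)
    # strict neighbour counts within eps_i in each marginal (self excluded by the -1)
    nx = np.array([len(tx.query_ball_point(p_, e - 1e-12, p=np.inf)) - 1
                   for p_, e in zip(x, eps)])
    ny = np.array([len(ty.query_ball_point(p_, e - 1e-12, p=np.inf)) - 1
                   for p_, e in zip(y, eps)])
    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))

rng = np.random.default_rng(0)
rho = 0.8
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=2000)
print(ksg_mi(z[:, :1], z[:, 1:]),
      -0.5 * np.log(1 - rho**2))   # estimate vs. true MI ~ 0.51 nats
```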
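From the rate-distortion item: Blahut-Arimoto alternates between the optimal test channel for a fixed output marginal and the marginal that channel induces, tracing out R(D) as the slope parameter β varies. A sketch assuming NumPy, for a uniform binary source under Hamming distortion, where the known curve R(D) = log 2 − H_b(D) (in nats) gives a check:

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, n_iter=500):
    """Blahut-Arimoto at slope beta: p_x is the source pmf (m,),
    dist is the distortion matrix d(x, xhat) of shape (m, r).
    Returns (rate in nats, expected distortion)."""
    _, r = dist.shape
    q_out = np.full(r, 1.0 / r)                     # output marginal q(xhat)
    for _ in range(n_iter):
        w = q_out[None, :] * np.exp(-beta * dist)   # unnormalized test channel
        q_cond = w / w.sum(axis=1, keepdims=True)   # q(xhat | x)
        q_out = p_x @ q_cond                        # induced output marginal
    rate = np.sum(p_x[:, None] * q_cond * np.log(q_cond / q_out[None, :]))
    distortion = np.sum(p_x[:, None] * q_cond * dist)
    return rate, distortion

p = np.array([0.5, 0.5])
d_mat = 1.0 - np.eye(2)                             # Hamming distortion
for beta in (1.0, 2.0, 4.0):
    R, D = blahut_arimoto(p, d_mat, beta)
    Hb = -D * np.log(D) - (1 - D) * np.log(1 - D)
    print(f"beta={beta}: R={R:.4f}, D={D:.4f}, log2 - Hb(D)={np.log(2) - Hb:.4f}")
```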
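From the optimal-transport item: entropic regularization turns the Kantorovich problem into a matrix-scaling problem solved by Sinkhorn iterations on the Gibbs kernel K = exp(−C/ε). A minimal sketch assuming NumPy (no log-domain stabilization, so very small ε may underflow):

```python
import numpy as np

def sinkhorn(a, b, C, eps, n_iter=1000):
    """Sinkhorn iterations for entropically regularized OT.
    Minimizes <P, C> + eps * sum(P * log P) over couplings with marginals a, b."""
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # scale to match column marginal b
        u = a / (K @ v)                  # scale to match row marginal a
    return u[:, None] * K * v[None, :]   # plan P = diag(u) K diag(v)

x = np.linspace(0.0, 1.0, 5)
C = (x[:, None] - x[None, :]) ** 2       # squared-distance cost
a = np.full(5, 0.2)
b = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
P = sinkhorn(a, b, C, eps=0.01)
print(P.sum(axis=1), P.sum(axis=0))      # marginals recover a and b
print(np.sum(P * C))                     # regularized transport cost
```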