References & Further Reading

References

  1. P. J. Huber, Robust Estimation of a Location Parameter, 1964

    Founding paper of robust statistics; introduces the Huber loss and proves its minimax property over epsilon-contamination neighborhoods.

  2. P. J. Huber and E. M. Ronchetti, Robust Statistics, Wiley, 2nd ed., 2009

    Comprehensive modern reference on M-estimators, influence functions, and breakdown points.

  3. F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw, and W. A. Stahel, Robust Statistics: The Approach Based on Influence Functions, Wiley, 1986

    Definitive treatment of the influence-function approach to robustness.

  4. P. J. Rousseeuw and A. M. Leroy, Robust Regression and Outlier Detection, Wiley, 1987

    High-breakdown estimators (LMS, LTS, S-estimators) and the breakdown-point concept for regression.

  5. E. Parzen, On Estimation of a Probability Density Function and Mode, 1962

    Formalizes kernel density estimation and establishes consistency.

  6. M. Rosenblatt, Remarks on Some Nonparametric Estimates of a Density Function, 1956

    Earliest kernel density estimator; also proves that no unbiased density estimator exists.

  7. B. W. Silverman, Density Estimation for Statistics and Data Analysis, Chapman and Hall, 1986

    Standard textbook on KDE, bandwidth selection, and the rule of thumb.

  8. E. A. Nadaraya, On Estimating Regression, 1964

    Kernel regression estimator as a weighted local average.

  9. G. S. Watson, Smooth Regression Analysis, 1964

    Independently proposes the Nadaraya-Watson estimator and establishes its asymptotics.

  10. J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman and Hall, 1996

    Treatment of local polynomial regression, which corrects the boundary bias of the Nadaraya-Watson estimator.

  11. N. Aronszajn, Theory of Reproducing Kernels, 1950

    Founding paper of RKHS theory.

  12. B. Schölkopf, R. Herbrich, and A. J. Smola, A Generalized Representer Theorem, 2001

    Proves the representer theorem for any strictly increasing regularizer of the RKHS norm.

  13. B. Schölkopf and A. J. Smola, Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond, MIT Press, 2002

    Textbook on kernel methods spanning SVMs, kernel PCA, and Gaussian processes.

  14. V. N. Vapnik, Statistical Learning Theory, Wiley, 1998

    SVMs, VC dimension, and structural risk minimization.

  15. C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning, MIT Press, 2006

    Definitive text on GP regression, hyperparameter learning, and sparse approximations.

  16. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning, MIT Press, 2016

    Standard reference on neural networks, regularization, and training.

  17. J. R. Hershey, J. Le Roux, and F. Weninger, Deep Unfolding: Model-Based Inspiration of Novel Deep Architectures, 2014

    Coins the term 'deep unfolding' and formalizes the model-to-network mapping.

  18. K. Gregor and Y. LeCun, Learning Fast Approximations of Sparse Coding, 2010

    Introduces LISTA: unrolled ISTA with learned per-layer weights.

  19. V. Monga, Y. Li, and Y. C. Eldar, Algorithm Unrolling: Interpretable, Efficient Deep Learning for Signal and Image Processing, 2021

    Comprehensive tutorial on deep unfolding in signal processing.

  20. S. Haghighatshoar and G. Caire, Low-Complexity Massive MIMO Subspace Estimation and Tracking From Low-Dimensional Projections, 2018

  21. H. Sarieddeen and G. Caire, Data-Driven Recovery in RF Imaging via Unfolded Orthogonal AMP, 2021
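The unrolling references above (entries 17-19) all trace back to the LISTA construction of Gregor and LeCun (entry 18). As a rough sketch of the idea, one unrolled layer computes x_{k+1} = soft(W_e y + S x_k, theta), where the per-layer weights W_e, S and threshold theta are learned in the original; here they are shared across layers and initialized from ISTA for brevity (the sizes and values below are illustrative, not from any of the cited papers):

```python
import numpy as np

def soft_threshold(x, theta):
    # Elementwise shrinkage: the proximal operator of the l1 norm.
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def lista_forward(y, W_e, S, theta, num_layers=3):
    """Unrolled (L)ISTA pass: x_{k+1} = soft(W_e @ y + S @ x_k, theta).
    In LISTA proper, W_e, S, and theta are trainable and vary per layer;
    this sketch shares one set of parameters across all layers."""
    x = np.zeros(S.shape[0])
    for _ in range(num_layers):
        x = soft_threshold(W_e @ y + S @ x, theta)
    return x

# ISTA-inspired initialization, which LISTA starts from and then learns:
rng = np.random.default_rng(0)
A = rng.standard_normal((10, 20))        # measurement matrix, y = A x
L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the data-fit gradient
W_e = A.T / L
S = np.eye(20) - (A.T @ A) / L
x_hat = lista_forward(A @ np.ones(20), W_e, S, theta=0.1, num_layers=5)
```

Training then treats each layer's W_e, S, theta as free parameters fit end-to-end, which is what lets a few unrolled layers match many ISTA iterations.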