References & Further Reading

References

  1. F. Perez and B. E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 2007

    The foundational paper on IPython, describing the architecture of the interactive computing environment that evolved into Jupyter.

  2. T. Kluyver et al., Jupyter Notebooks — A Publishing Format for Reproducible Computational Workflows, ELPUB, 2016

    Describes the Jupyter notebook format and its role in reproducible computational research.

  3. W. McKinney, Python for Data Analysis, O'Reilly, 2017

    A comprehensive guide to pandas, written by the library's creator. Covers DataFrames, groupby, merge, and time series in depth.

  4. H. Wickham, Tidy Data, Journal of Statistical Software, 2014

    Formalizes the concept of tidy data: each variable forms a column, each observation forms a row, and each type of observational unit forms a table. Foundational for modern data analysis.

  5. M. Wouts, jupytext Documentation, 2024

    Official documentation for jupytext, covering pairing formats, configuration, and integration with JupyterLab.

Further Reading

  • Advanced Pandas techniques

    M. Harrison, Effective Pandas, 2nd ed., 2022

    Covers advanced topics like MultiIndex, method chaining, window functions, and performance optimization.

  • Notebook best practices

    J. VanderPlas, Reproducible Data Analysis in Jupyter (YouTube series)

    Practical guidelines for structuring notebooks for reproducibility, including git integration strategies.

  • Interactive widgets

    ipywidgets documentation: https://ipywidgets.readthedocs.io/

    Complete reference for building interactive notebook interfaces with sliders, buttons, and output widgets.

  • papermill for production notebooks

    Netflix tech blog: https://netflixtechblog.com/scheduling-notebooks-348e6c14cfd6

    Netflix's approach to using papermill for production data pipelines, running hundreds of notebooks daily.