The untold story of palmerpenguins

Authors

Allison Horst

Alison Hill

Kristen B. Gorman

Published

June 21, 2022

Abstract: The palmerpenguins R package provides a modern, approachable dataset containing body size measurements for three penguin species that nest on islands throughout the Palmer Archipelago, Western Antarctic Peninsula. Since palmerpenguins was released on the Comprehensive R Archive Network (CRAN) in July 2020, the package has been downloaded more than 340,000 times, has been quickly adapted for use beyond R, including in Python’s seaborn package and Google’s TensorFlow Datasets, and has become a go-to option for data science and statistics educators worldwide. In this talk, we share the untold story of the palmerpenguins package. From original data collection on rocky Antarctic shores to CRAN submission and beyond, we describe the penguins’ journey from polar research project to global teaching product. What started out as a simple methods paper for a dissertation project turned into a widely used data science product, mainly because of initial efforts to make the data publicly available and easily accessible to others. The success of the palmerpenguins R package underscores the importance of proper data archiving for unknown future applications.

Presented virtually at the useR! 2022 Conference. [Slides]