xarray: N-D labeled Arrays and Datasets in Python

Authors

  • Stephan Hoyer, Google Research, Mountain View, CA
  • Joe Hamman, Department of Civil and Environmental Engineering, University of Washington, Seattle, WA

DOI:

https://doi.org/10.5334/jors.148

Keywords:

Python, pandas, netCDF, multidimensional, data, data handling, data analysis

Abstract

xarray is an open-source project and Python package that provides a toolkit and data structures for N-dimensional labeled arrays. Our approach combines an application programming interface (API) inspired by pandas with the Common Data Model for self-described scientific data. Key features of the xarray package include label-based indexing and arithmetic, interoperability with the core scientific Python packages (e.g., pandas, NumPy, Matplotlib), out-of-core computation on datasets that don’t fit into memory, a wide range of serialization and input/output (I/O) options, and advanced multi-dimensional data manipulation tools such as group-by and resampling. As a data model and analytics toolkit, xarray has been widely adopted in the geoscience community, and it is also used more broadly for multi-dimensional data analysis in physics, machine learning, and finance.
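As an illustration of the label-based indexing, arithmetic, group-by, and resampling features named above, the following is a minimal sketch rather than an example taken from the paper. The station names, dates, and temperature values are invented for the demonstration, and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical data: 60 daily values at three made-up stations.
times = pd.date_range("2000-01-01", periods=60, freq="D")
stations = ["A", "B", "C"]
values = np.arange(60 * 3, dtype=float).reshape(60, 3)

# A DataArray attaches dimension names and coordinate labels to the raw array.
temperature = xr.DataArray(
    values,
    dims=("time", "station"),
    coords={"time": times, "station": stations},
    name="temperature",
)

# Label-based indexing: select by coordinate value, not integer position.
station_b = temperature.sel(station="B")
first_week = temperature.sel(time=slice("2000-01-01", "2000-01-07"))

# Label-based arithmetic: operands are matched by dimension name, so the
# per-station mean is broadcast along "time" without manual reshaping.
anomaly = temperature - temperature.mean(dim="time")

# Group-by: average over each calendar month along the time dimension.
monthly_mean = temperature.groupby("time.month").mean(dim="time")

# Resampling: aggregate the daily values into 7-day means.
weekly_mean = temperature.resample(time="7D").mean()

# Interoperability with pandas: convert to a Series with a MultiIndex.
series = temperature.to_series()
```

Because every operation refers to dimensions and coordinates by name, the same code would keep working if the array were stored with its axes in a different order.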

Published

2017-04-05

Section

Software Metapapers