fix DataFrame fillna for pandas 3.x compatibility #855

Merged
Swanson-Hysell merged 1 commit into master from fix_fillna_pandas_issue
May 7, 2026

@Swanson-Hysell

Member

pandas 3.0 raises TypeError when filling NaN in a numeric column with an empty string, breaking the common df.fillna("", inplace=True) pattern used throughout the MagIC writer and lab-instrument converters. Replace with df = df.astype(object).fillna("") at every call site so the result is consistent across pandas 2.x and 3.x.
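A minimal sketch of the two patterns side by side, for readers who want to reproduce the behavior. The DataFrame and its column names are hypothetical stand-ins, not taken from the MagIC writer; the legacy call is wrapped in a try block because, per the description above, it is expected to succeed on pandas 2.x and raise on pandas 3.x.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a MagIC-style table:
# one text column and one float column with a missing value.
df = pd.DataFrame({"specimen": ["sp1", "sp2"], "treat_temp": [273.0, np.nan]})

# Legacy pattern. Reported above to raise TypeError on pandas 3.x
# because "" cannot be stored in a float64 column.
legacy = df.copy()
try:
    legacy.fillna("", inplace=True)
    legacy_ok = True
except TypeError:
    legacy_ok = False

# Replacement pattern from this PR: cast to object first, then fill.
filled = df.astype(object).fillna("")

# Typical downstream use: a list of per-row dictionaries.
records = filled.to_dict("records")
print(legacy_ok, records)
```

The value of `legacy_ok` depends on the installed pandas version, while `records` should contain an empty string in place of the missing temperature on either version.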

Two chained-assignment sites in convert_2_magic.py were already silently broken under pandas 3's copy-on-write rules; the rewrite fixes those too.
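The sketch below illustrates the kind of chained call this paragraph refers to. The exact form of the two sites in convert_2_magic.py is not shown in this conversation, so the `df[col].fillna(..., inplace=True)` shape and the column names here are assumptions chosen for illustration.

```python
import warnings

import pandas as pd

# Hypothetical table; column names are illustrative only.
df = pd.DataFrame({"specimen": ["sp1", "sp2"], "description": ["fresh", None]})

# Assumed shape of a chained call: an in-place fill on a column selection.
# Under copy-on-write the selection is a temporary object, so the parent
# DataFrame may be left unchanged.
probe = df.copy()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas warns about this pattern
    probe["description"].fillna("", inplace=True)
chained_took_effect = not probe["description"].isna().any()

# Explicit assignment back to the column does not depend on the
# copy-on-write mode.
df["description"] = df["description"].astype(object).fillna("")

print(chained_took_effect, df["description"].tolist())
```

Whether `chained_took_effect` is true depends on the pandas version; the explicit assignment is the form that behaves the same on both.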

Fixes #840.

@Swanson-Hysell

Member Author

Risks

  • astype(object) casts every column to object dtype. Most call sites feed straight into to_dict('records') or a tab-file write, but any downstream code that performs numerical operations on the returned DataFrame will now see object dtype instead of a numeric dtype.
  • Two chained-assignment sites in convert_2_magic.py (lines 3602, 3718) were silent no-ops on pandas 3.x; the fix makes them actually replace NaN values, matching pandas 2.x behavior.
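To make the first risk concrete, here is a small sketch of what a downstream caller would see and one possible way to get a numeric column back. The DataFrame is hypothetical, and using `pd.to_numeric` is a suggestion for affected callers, not something this PR adds.

```python
import numpy as np
import pandas as pd

# Hypothetical table with a numeric column that has a gap.
df = pd.DataFrame({"specimen": ["sp1", "sp2"], "treat_temp": [273.0, np.nan]})
filled = df.astype(object).fillna("")

# The filled column holds a float and an empty string, so it is object
# dtype and no longer suitable for direct numerical work.
mixed_dtype = filled["treat_temp"].dtype

# One way to recover a numeric column: blanks are coerced back to NaN.
temps = pd.to_numeric(filled["treat_temp"], errors="coerce")
mean_temp = temps.mean()  # NaN is skipped by default

print(mixed_dtype, temps.dtype, mean_temp)
```

Callers that only serialize the result do not need this step.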

Review testing

  • Run the full test suite on both pandas 2.x and 3.x
  • Spot-check some lab-instrument converters end-to-end against the example data in data_files/ and diff the output against what the same call produces on pandas 2.x.
  • Run example notebooks (PmagPy_calculations.ipynb, PmagPy_MagIC.ipynb) on pandas 3.x.

@Swanson-Hysell

Member Author

Test results

Tested by running MagIC_contribution.ipynb (which exercises ipmag.upload_magic, ipmag.combine_magic, and several magic_write calls — i.e. the main fillna call sites this PR touches) end-to-end in two fresh venvs:

| Branch | pandas 2.3.3 | pandas 3.0.2 |
|---|---|---|
| master | pass | fail: LossySetitemError / TypeError: Invalid value '' for dtype 'float64' at pmagpy/ipmag.py:5785 |
| fix_fillna_pandas_issue | pass | pass |

Confirms the PR resolves the pandas-3.x regression while remaining compatible with pandas 2.x.

@Swanson-Hysell

Member Author

Spot-check: convert_2_magic with example data

Ran four converters touched by the PR end-to-end with each pandas version, using data_files/:

  • convert._2g_bin: data_files/convert_2_magic/2g_bin_magic/mn1/mn001-1a.dat
  • convert._2g_asc: data_files/convert_2_magic/2g_asc_magic/_2g_asc/DR3B.asc
  • convert.iodp_srm_lore: data_files/iodp_magic/U999A/SRM_archive_data/srmsection_17_5_2019.csv
  • convert.iodp_dscr_lore: data_files/iodp_magic/U999A/SRM_discrete_data/srmdiscrete_17_5_2019.csv

All four ran cleanly and produced identical record counts under both pandas 2.3.3 and pandas 3.0.2.

diff -r across the two output trees shows a single line difference: in _2g_bin/measurements.txt, an integer-valued temperature column stringifies as 273.0 under pandas 2.x and 273 under pandas 3.x. Functionally equivalent (both round-trip to 273 K) but driven by a behavior change in astype(object) for integer-valued floats. All other output files (specimens, samples, sites, locations, measurements across the four converters) are identical between the two pandas versions.
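For anyone repeating this comparison, a numeric-aware line comparison can hide purely cosmetic differences such as 273.0 versus 273. The helpers below are a hypothetical sketch, not part of PmagPy, and assume tab-delimited output lines; the sample field values are made up.

```python
def normalize_field(text: str) -> str:
    """Return a canonical form for numeric fields, leaving other text as-is."""
    try:
        return repr(float(text))  # "273" and "273.0" both become "273.0"
    except ValueError:
        return text


def normalize_line(line: str, sep: str = "\t") -> str:
    """Normalize every field of one delimited line."""
    fields = line.rstrip("\n").split(sep)
    return sep.join(normalize_field(field) for field in fields)


# Made-up rows that differ only in how the temperature is written.
row_pandas2 = "mn001-1a\t273.0\tLT-NO\n"
row_pandas3 = "mn001-1a\t273\tLT-NO\n"

same = normalize_line(row_pandas2) == normalize_line(row_pandas3)
print(same)
```

Note that any field `float()` can parse is treated as numeric, so identifiers that happen to look like numbers are normalized too; that is harmless for an equality check because both sides are transformed the same way.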

@Swanson-Hysell merged commit e215595 into master May 7, 2026
2 checks passed
@Swanson-Hysell deleted the fix_fillna_pandas_issue branch May 7, 2026 13:41
Development

Successfully merging this pull request may close these issues.

pmag.magic_write() fails with pandas 3.x when DataFrame contains float columns