Add Turkic patronymic detection to patronymic_name_order #198
Merged
Makes room for a second, structurally different patronymic family (Turkic). Pure rename, no behavior change — single call site, not referenced by name in docs.
Extends patronymic_name_order to handle the Azerbaijani/Central-Asian 4-token shape (Surname GivenName PatronymicRoot Marker), e.g. 'Aliyev Vusal Said oglu'. Standalone marker words, not suffixes, so detection is whole-word matched against a strict 4-token guard (#185).
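The 4-token shape described above can be sketched roughly as follows. This is a minimal standalone illustration, not the code merged in this PR: the function name `rotate_turkic` and the marker list (taken from the oglu/qizi/uly examples in the issue title) are assumptions, and the real pattern in nameparser may cover more spellings.

```python
import re
from typing import Dict, Optional

# Hypothetical marker list based on the issue title (oglu/qizi/uly).
TURKIC_MARKER = re.compile(r"(?:oglu|qizi|uly)", re.IGNORECASE)


def rotate_turkic(full_name: str) -> Optional[Dict[str, str]]:
    """Split a 'Surname GivenName PatronymicRoot Marker' name, or return None."""
    tokens = full_name.split()
    # Strict guard: exactly four tokens, and the last must be a whole marker word.
    if len(tokens) != 4 or TURKIC_MARKER.fullmatch(tokens[3]) is None:
        return None
    surname, given, root, marker = tokens
    # The patronymic root and its marker stay together as the middle name.
    return {"first": given, "middle": f"{root} {marker}", "last": surname}


print(rotate_turkic("Aliyev Vusal Said oglu"))
# {'first': 'Vusal', 'middle': 'Said oglu', 'last': 'Aliyev'}
```

Names that do not have exactly four tokens, or whose last token is not a marker word, fall through with `None`, so other parsing paths would be unaffected in this sketch.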
…cstring The Constants class-attribute docstring (the canonical Sphinx-linked API doc, referenced via :py:obj: from docs/customize.rst) only described the East-Slavic behavior, drifting from the customize.rst prose already updated for Turkic support. Adds the missing sentence and a second doctest example (#185).
Unlike its Latin sibling and the new turkic_patronymic_marker_cyrillic
pattern, this pattern had no re.I flag. The irregular-form alternatives
(ильич, кузьмич, лукич, фомич, фокич) are short enough that the
capitalized first letter falls within the matched suffix itself, so
capitalized real-world patronymics like "Ильич" failed to match and
HumanName("Иванов Иван Ильич", constants=Constants(patronymic_name_order=True))
did not rotate, while the equivalent Latin-script name did.
Also strengthens test_no_regex_collision_latin/_cyrillic with positive
sanity assertions confirming each word list actually matches its own
family's regex, so the non-collision assertions are non-vacuous.
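The case-sensitivity problem can be reproduced in isolation. The pattern below is a simplified stand-in that contains only the irregular forms listed in the commit message plus an assumed regular "ович" suffix for contrast; it is not the actual east_slavic_patronymic_cyrillic regex.

```python
import re

# Simplified stand-in: the irregular forms from the commit message.
IRREGULAR = r"(?:ильич|кузьмич|лукич|фомич|фокич)$"
# Assumed regular suffix, included only to show the contrast.
REGULAR = r"ович$"

case_sensitive = re.compile(IRREGULAR)
case_insensitive = re.compile(IRREGULAR, re.IGNORECASE)

# "Ильич" is covered entirely by the alternative, so its capital "И"
# has to match as well. Without re.I it does not.
print(bool(case_sensitive.search("Ильич")))    # False
print(bool(case_insensitive.search("Ильич")))  # True

# For a longer patronymic the capital letter sits outside the suffix,
# so even a case-sensitive pattern matches.
print(bool(re.compile(REGULAR).search("Петрович")))  # True
```

This is why the bug only affected the short irregular forms: regular patronymics keep their capital letter outside the matched suffix.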
…guard asymmetry
Addresses PR review feedback on #198:
- Add a regex-level test confirming is_turkic_patronymic_marker() only matches whole marker words, not substrings (e.g. "ogluu", "Bogluchik"), guarding against a future accidental .match()->.search() swap.
- Soften docs/customize.rst's "mirroring the strictness of the East-Slavic guard" claim, which overstated parity: the Turkic guard has no middle-token disambiguation check analogous to East-Slavic's, since marker words are a small closed set unlikely to coincide with an ordinary given name.
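The difference between whole-word and substring matching that the new test guards against can be illustrated as below. The pattern and both helper names are hypothetical; the real is_turkic_patronymic_marker() may be anchored differently.

```python
import re

# Hypothetical marker pattern based on the oglu/qizi/uly examples.
MARKER = re.compile(r"(?:oglu|qizi|uly)", re.IGNORECASE)


def is_marker(token: str) -> bool:
    """Whole-word check: the entire token must be a marker word."""
    return MARKER.fullmatch(token) is not None


def is_marker_loose(token: str) -> bool:
    """Substring check: what an accidental switch to search() would do."""
    return MARKER.search(token) is not None


for token in ("oglu", "Oglu", "ogluu", "Bogluchik"):
    print(token, is_marker(token), is_marker_loose(token))
# oglu True True
# Oglu True True
# ogluu False True
# Bogluchik False True
```

The last two rows are the cases a regex-level test would pin down: the strict check rejects them, while a substring search would silently accept them.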
…atronymic_name_order Consistency fix: this handler only ever implemented East-Slavic rotation logic (sibling to handle_turkic_patronymic_name_order()), so the generic name was misleading now that a family-specific sibling exists — the same asymmetry that is_patronymic -> is_east_slavic_patronymic already fixed for the helper method. The patronymic_name_order flag itself stays generic, since it's the public umbrella opt-in switch covering both families by design. Pure rename, zero behavior change — single call site, not referenced by name in shipped docs.
…mic_order.py Consistency with the handle_east_slavic_patronymic_name_order() and is_east_slavic_patronymic() renames, and mirrors the naming of the sibling test_turkic_patronymic_order.py. Pure file rename, no content changes — not referenced by path anywhere outside gitignored planning docs.
Summary
- Extends the patronymic_name_order flag to also detect and rotate reversed, no-comma Azerbaijani/Central-Asian Turkic formal-order names (Surname GivenName PatronymicRoot Marker, e.g. "Aliyev Vusal Said oglu" → first=Vusal, middle="Said oglu", last=Aliyev), alongside the existing East-Slavic rotation (closes #185, "Support Turkic patronymics (oglu/qizi/uly) in patronymic_name_order")
- Renames is_patronymic() → is_east_slavic_patronymic() and the patronymic / patronymic_cyrillic regex keys to east_slavic_patronymic / east_slavic_patronymic_cyrillic, now that a second, structurally different patronymic family exists (pure rename, single call site, zero behavior change)
- Fixes a case-sensitivity bug: east_slavic_patronymic_cyrillic was missing re.I, so capitalized irregular-form patronymics like "Ильич" failed to match and rotate, while the Latin equivalent ("Ilyich") worked fine

Test plan
- uv run pytest tests/ -q (1148 passed, 22 xfailed)
- uv run mypy nameparser/ clean
- uv run ruff check nameparser/ tests/ clean
- uv run sphinx-build -b html docs docs/_build -q -W clean
- HumanName("Иванов Иван Ильич", constants=Constants(patronymic_name_order=True)) now rotates correctly

🤖 Generated with Claude Code