Family hierarchies: how PhotoStructure parses names
PhotoStructure extracts “Who” tags from MWG-compatible face regions (as written by Picasa, Adobe Lightroom, digiKam, and others), Google Takeout JSON sidecars, and configurable metadata fields. By default, names are stored flat, for example Who/Albert Einstein. PhotoStructure’s built-in name parser can instead organize names into a browsable hierarchy like Who/Einstein/Albert, handling a wide variety of naming conventions and edge cases along the way.
Choosing a formatter
The tagNamesFormatter setting controls how names are structured into tags. There are three options:
as-is (default)
Names go directly under the Who root with no parsing:
| Input | Tag |
|---|---|
| Albert Einstein | Who/Albert Einstein |
| Ludwig van Beethoven | Who/Ludwig van Beethoven |
This is the default because splitting given and family names correctly across all naming conventions is hard to automate.
family/given
Parses names into a Who/FamilyName/GivenNames hierarchy:
| Input | Tag(s) |
|---|---|
| Albert Einstein | Who/Einstein/Albert |
| Ludwig van Beethoven | Who/van Beethoven/Ludwig |
| Michelle (Robinson) Obama | Who/Robinson/Michelle and Who/Obama/Michelle |
Works well for genealogy or large libraries with many people, since you can browse by family name.
family/fullname
Like family/given, but the leaf tag retains the original full name:
| Input | Tag |
|---|---|
| Albert Einstein | Who/Einstein/Albert Einstein |
You get the browseable hierarchy but the full name stays visible.
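To make the difference between the three formatters concrete, here is a minimal Python sketch. It is not PhotoStructure's implementation: `format_who_tag` and its arguments are hypothetical, and it assumes the name has already been split into family and given parts.

```python
def format_who_tag(full_name: str, family: str, given: str,
                   formatter: str = "as-is") -> str:
    """Build a Who tag path for an already-parsed name (illustrative only)."""
    if formatter == "as-is":
        # No hierarchy: the original name sits directly under the Who root.
        return f"Who/{full_name}"
    if formatter == "family/given":
        # Family name becomes the parent, given names the leaf.
        return f"Who/{family}/{given}"
    if formatter == "family/fullname":
        # Same family level, but the leaf keeps the original full name.
        return f"Who/{family}/{full_name}"
    raise ValueError(f"unknown formatter: {formatter!r}")


print(format_who_tag("Albert Einstein", "Einstein", "Albert", "family/fullname"))
# prints: Who/Einstein/Albert Einstein
```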
Testing your settings
You can preview how PhotoStructure will parse names in your files before committing to a formatter. Run the info tool with environment variable overrides, once per formatter:
tagNamesFormatter=as-is ./photostructure info \
path/to/image.jpg --filter tags
versus
tagNamesFormatter=family/given ./photostructure info \
path/to/image.jpg --filter tags
Any of the settings described on this page can be overridden via environment variables for testing.
Name order
The tagNamesOrder setting controls how PhotoStructure splits a name into given and family components when neither is explicitly marked.
givenFirst (default)
Assumes given names come first, family name last:
Albert Einstein → given: “Albert”, family: “Einstein”
In givenFirst mode, the ALL-CAPS-as-family-name heuristic is active (see step 6), and tokens following a matched family name are treated as additional family names.
surnameFirst
Assumes family name comes first, given names after:
宮崎 駿 → family: “宮崎”, given: “駿”
If your library contains a mix of both conventions, you may want to stick with as-is formatting, or pick the convention that covers the majority of your names and use tagNamesGiven and tagNamesSurnames to handle the exceptions.
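As a rough illustration of what the two orders mean for a name with no other markers, the following Python sketch splits on whitespace and picks the family name from one end. The function name `split_by_order` is hypothetical, and the real parser applies many more heuristics before it gets this far.

```python
def split_by_order(name: str, order: str = "givenFirst") -> tuple[str, str]:
    """Return (family, given) for a plain name (illustrative only)."""
    words = name.split()
    if len(words) < 2:
        # A single word has no detectable family name.
        return ("", name.strip())
    if order == "givenFirst":
        # The last word is the family name, everything before it is given.
        return (words[-1], " ".join(words[:-1]))
    if order == "surnameFirst":
        # The first word is the family name, everything after it is given.
        return (words[0], " ".join(words[1:]))
    raise ValueError(f"unknown order: {order!r}")


print(split_by_order("Albert Einstein"))          # prints: ('Einstein', 'Albert')
print(split_by_order("宮崎 駿", "surnameFirst"))   # prints: ('宮崎', '駿')
```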
How names are parsed
When you use the family/given or family/fullname formatter, PhotoStructure runs each name through these steps, in order:
1. Whitespace normalization – extra spaces are collapsed.
2. Lifespan extraction – patterns like (1930-2001) are recognized and removed from the name string.
3. Modifier extraction – suffixes like Jr., Sr., III, Junior, Senior at the end of the name are extracted.
4. Multi-word given name matching – names listed in tagNamesGiven (like “Mary Kay”) are matched as a unit instead of being split.
5. Nickname/alias extraction – text within tagNamesGivenSurrounds characters (default: [] and "") is kept with the given name. Example: Joe "Joey" Smith → given name becomes Joe "Joey".
6. ALL-CAPS detection – if tagNamesCapitalizedAsFamily is enabled (default: true) and tagNamesOrder is "givenFirst", words written in ALL CAPS are treated as family names. This is common in genealogy software and only applies in givenFirst mode since it’s a Latin-script convention. Example: John SMITH → family: “SMITH”.
7. Maiden/alternate family name extraction – text within tagNamesFamilySurrounds characters (default: ()) is treated as an additional family name. Example: Michelle (Robinson) Obama generates two tags, one under Robinson and one under Obama.
8. Lexical “Last, First” detection – if tagNamesLexical is enabled (default: true), a comma triggers “family, given” parsing. Example: Einstein, Albert → family: “Einstein”, given: “Albert”.
9. Configured family names – names listed in tagNamesSurnames are matched.
10. Surname prefix matching – prefixes from tagNamesSurnamePrefixes (like “van”, “von der”, “De la”, “Mc”) are kept attached to the following word as a compound family name. Example: Ludwig van Beethoven → family: “van Beethoven”.
11. Remaining words – any unmatched words are distributed based on tagNamesOrder: in givenFirst order, the last word becomes the family name; in surnameFirst order, the first word does.
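The first three steps can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the regular expressions, the `preprocess` name, and the returned dictionary are hypothetical, the modifier list covers only the suffixes named above, and the sketch keeps the lifespan in its result purely to show what was removed.

```python
import re

# Assumed pattern: two four-digit years joined by a hyphen, in parentheses.
LIFESPAN = re.compile(r"\(\s*(\d{4})\s*-\s*(\d{4})\s*\)")
# Assumed pattern: one of the listed suffixes at the very end of the name,
# optionally preceded by a comma.
MODIFIER = re.compile(r",?\s+(Jr\.?|Sr\.?|III|Junior|Senior)$", re.IGNORECASE)


def preprocess(name: str) -> dict:
    # Step 1: collapse runs of whitespace and trim the ends.
    text = " ".join(name.split())

    # Step 2: remove a lifespan such as "(1930-2001)".
    lifespan = None
    match = LIFESPAN.search(text)
    if match:
        lifespan = (int(match.group(1)), int(match.group(2)))
        text = " ".join((text[:match.start()] + " " + text[match.end():]).split())

    # Step 3: remove a trailing modifier such as "Jr." or "III".
    modifier = None
    match = MODIFIER.search(text)
    if match:
        modifier = match.group(1)
        text = text[:match.start()]

    return {"name": text, "lifespan": lifespan, "modifier": modifier}


print(preprocess("  John   Smith,  Jr.  (1930-2001) "))
# prints: {'name': 'John Smith', 'lifespan': (1930, 2001), 'modifier': 'Jr.'}
```

Because the modifier (and its comma) is stripped here, the later comma-based step never sees it.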
Settings reference
All settings can be configured via settings.toml or via environment variables.
tagNamesFormatter
Env: PS_TAG_NAMES_FORMATTER
Default: "as-is"
Values: "as-is", "family/given", "family/fullname"
Controls how “Who” tags are structured. See Choosing a formatter above.
tagNamesOrder
Env: PS_TAG_NAMES_ORDER
Default: "givenFirst"
Values: "givenFirst", "surnameFirst"
Deprecated aliases: "western" → "givenFirst", "eastern" → "surnameFirst"
Controls how names are interpreted during parsing (which heuristics apply, how ambiguous tokens are assigned) and rendering (display order in family/fullname mode). See Name order above.
tagNamesDefaultFamily
Env: PS_TAG_NAMES_DEFAULT_FAMILY
Default: "-"
When a name has no detectable family name, this value is used as the family component. Only applies to the family/given and family/fullname formatters.
Example: Madonna → Who/-/Madonna
Set to an empty string to omit the family level entirely.
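The fallback behavior can be pictured with this small Python sketch. The helper `tag_path` is hypothetical and only models the two cases described above: substituting the default, and dropping the family level when the default is empty.

```python
def tag_path(family: str, given: str, default_family: str = "-") -> str:
    """Join Who/family/given, applying the default family (illustrative only)."""
    # Fall back to the configured default when no family name was detected.
    family = family or default_family
    # An empty family (default set to "") drops that level entirely.
    return "/".join(part for part in ("Who", family, given) if part)


print(tag_path("", "Madonna"))                     # prints: Who/-/Madonna
print(tag_path("", "Madonna", default_family=""))  # prints: Who/Madonna
```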
tagNamesGiven
Env: PS_TAG_NAMES_GIVEN
Default: []
Multi-word given names that should be kept together instead of being split. Matching is case-insensitive.
tagNamesGiven = ["Mary Kay", "Rose Marie", "Jean-Pierre"]
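One way to picture case-insensitive matching of configured given names is the Python sketch below. It is an assumption-laden illustration, not the actual matcher: `match_configured_given` is a hypothetical helper, it only looks at the start of the name (as in givenFirst order), and trying longer entries first is a choice made for this sketch.

```python
from typing import Optional


def match_configured_given(name: str, configured: list[str]) -> Optional[tuple[str, str]]:
    """Return (given, rest) if the name starts with a configured given name."""
    lowered = name.lower()
    # Assumption: longer entries are tried first so the most specific one wins.
    for candidate in sorted(configured, key=len, reverse=True):
        needle = candidate.lower()
        # Require a word boundary so "Mary Kay" does not match "Mary Kayla".
        if lowered == needle or lowered.startswith(needle + " "):
            # Slicing by length assumes lower() did not change the string length.
            return (name[:len(candidate)], name[len(candidate):].strip())
    return None


configured = ["Mary Kay", "Rose Marie", "Jean-Pierre"]
print(match_configured_given("mary kay Smith", configured))  # prints: ('mary kay', 'Smith')
print(match_configured_given("Mary Kayla Smith", configured))  # prints: None
```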
tagNamesSurnames
Env: PS_TAG_NAMES_SURNAMES
Default: []
Multi-word family names that should be matched explicitly. Useful for compound surnames in languages that don’t use hyphens or recognized prefixes.
tagNamesSurnames = ["St Clair", "El Fassi"]
tagNamesSurnamePrefixes
Env: PS_TAG_NAMES_SURNAME_PREFIXES
Default: Common European surname prefixes including "van", "von", "De la", "Mc", "Mac", "D'", and others. See settings.toml for the full list.
Prefixes that are kept attached to the following word as a compound family name. Matched case-insensitively and sorted longest-first internally.
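The following Python sketch shows why longest-first ordering matters: without it, “von” would match before “von der” gets a chance. The helper `split_on_prefix` is hypothetical, handles only space-separated prefixes, and assumes givenFirst order; it is not the actual matcher.

```python
def split_on_prefix(name: str, prefixes: list[str]) -> tuple[str, str]:
    """Return (family, given); family is empty when no prefix matches."""
    words = name.split()
    # Compare case-insensitively, longest prefix (by word count) first,
    # so "von der" is tried before "von".
    ordered = sorted((p.lower().split() for p in prefixes), key=len, reverse=True)
    for i in range(1, len(words)):  # keep at least one word for the given name
        for prefix in ordered:
            end = i + len(prefix)
            window = [w.lower() for w in words[i:end]]
            # The prefix must be followed by at least one more word.
            if window == prefix and end < len(words):
                return (" ".join(words[i:]), " ".join(words[:i]))
    return ("", name)


prefixes = ["van", "von", "von der", "De la"]
print(split_on_prefix("Ludwig van Beethoven", prefixes))  # prints: ('van Beethoven', 'Ludwig')
print(split_on_prefix("Anna von der Heide", prefixes))    # prints: ('von der Heide', 'Anna')
```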
tagNamesFamilySurrounds
Env: PS_TAG_NAMES_FAMILY_SURROUNDS
Default: ["()"]
Character pairs that mark alternate or maiden family names. Each value is a two-character string: the opening and closing character.
Example with the default ():
Michelle LaVaughn (Robinson) Obama → Who/Robinson/Michelle LaVaughn and Who/Obama/Michelle LaVaughn
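A minimal Python sketch of how a two-character surround pair could yield two tags for one name follows. Both helpers are hypothetical; the sketch assumes givenFirst order, a non-empty remaining name, and the family/given formatter.

```python
def extract_alternate_families(name: str, surrounds=("()",)) -> tuple[str, list[str]]:
    """Pull out text wrapped in any surround pair; return (rest, alternates)."""
    alternates = []
    for pair in surrounds:
        open_ch, close_ch = pair[0], pair[1]
        while True:
            start = name.find(open_ch)
            end = name.find(close_ch, start + 1)
            if start == -1 or end == -1:
                break
            alternates.append(name[start + 1:end].strip())
            # Remove the wrapped text and tidy up the leftover whitespace.
            name = " ".join((name[:start] + " " + name[end + 1:]).split())
    return name, alternates


def who_tags(name: str) -> list[str]:
    rest, alternates = extract_alternate_families(name)
    *given, family = rest.split()  # assumes at least one word remains
    given_names = " ".join(given)
    # One tag per family name: alternates first, then the primary family name.
    return [f"Who/{fam}/{given_names}" for fam in alternates + [family]]


print(who_tags("Michelle LaVaughn (Robinson) Obama"))
# prints: ['Who/Robinson/Michelle LaVaughn', 'Who/Obama/Michelle LaVaughn']
```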
tagNamesGivenSurrounds
Env: PS_TAG_NAMES_GIVEN_SURROUNDS
Default: ["[]", "\"\""]
Character pairs that mark nicknames or aliases to append to the given name. The surround characters are retained in the output.
Example:
Joe "Joey" Smith → Who/Smith/Joe "Joey"
tagNamesCapitalizedAsFamily
Env: PS_TAG_NAMES_CAPITALIZED_AS_FAMILY
Default: true
When enabled, words written in ALL CAPS (at least 3 characters) are treated as family names. This is a common convention in genealogy software.
Example: John SMITH → Who/SMITH/John
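The heuristic can be approximated in Python as below. The helper `split_caps_family` is hypothetical, and it assumes the three-character minimum applies to each word individually, which the description above suggests but does not spell out.

```python
def split_caps_family(name: str, min_length: int = 3) -> tuple[str, str]:
    """Return (family, given), treating long ALL-CAPS words as family names."""
    family, given = [], []
    for word in name.split():
        # str.isupper() is True only when every cased character is uppercase.
        if len(word) >= min_length and word.isupper():
            family.append(word)
        else:
            given.append(word)
    return (" ".join(family), " ".join(given))


print(split_caps_family("John SMITH"))  # prints: ('SMITH', 'John')
print(split_caps_family("AJ SMITH"))    # prints: ('SMITH', 'AJ')
```

In the second call, the two-letter “AJ” is too short to count as a family name, so it stays with the given names.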
tagNamesLexical
Env: PS_TAG_NAMES_LEXICAL
Default: true
When enabled, a comma in the name triggers lexical (bibliographic) parsing: everything before the comma is the family name, everything after is the given name.
Example: Einstein, Albert → Who/Einstein/Albert
Note: trailing modifiers like “Jr.” and “Senior” are extracted before comma parsing runs, so John Smith, Jr. is parsed as family: “Smith”, given: “John”, modifier: “Jr.”. The comma doesn’t trigger lexical mode in this case.
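Comma-based parsing is simple to sketch in Python. The helper `parse_lexical` is hypothetical and assumes any trailing modifier (and the comma in front of it) has already been removed from the name.

```python
from typing import Optional


def parse_lexical(name: str) -> Optional[tuple[str, str]]:
    """Return (family, given) when the name contains a comma, else None."""
    if "," not in name:
        return None
    # Split on the first comma only: family before it, given names after it.
    family, given = name.split(",", 1)
    return (family.strip(), given.strip())


print(parse_lexical("Einstein, Albert"))  # prints: ('Einstein', 'Albert')
print(parse_lexical("John Smith"))        # prints: None
```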
Common use cases
Genealogy
The family/given formatter is a natural fit for genealogy:
tagNamesFormatter = "family/given"
The defaults for tagNamesCapitalizedAsFamily and tagNamesLexical are already true, which handles the two most common genealogy naming conventions (ALL-CAPS surnames and “Last, First” format). Use tagNamesFamilySurrounds to capture maiden names in parentheses.
International names
For libraries primarily containing East Asian names (family name first):
tagNamesFormatter = "family/given"
tagNamesOrder = "surnameFirst"
For mixed libraries where some names follow givenFirst and some surnameFirst conventions, as-is may be the safest choice unless you’re willing to maintain tagNamesGiven and tagNamesSurnames lists for the exceptions.
Google Takeout imports
Face names from Google Takeout JSON sidecars are processed through this same engine. If you’re importing a Takeout archive with many face-tagged photos, choosing family/given before import will organize all those names into a browsable hierarchy.
Large family photo libraries
For libraries with hundreds of tagged people, family/given makes the Who tag tree much more navigable – you can browse by family name first, then drill down to individuals.
Pre-populate tagNamesGiven and tagNamesSurnames for any names in your collection that don’t follow standard patterns.
Where do “Who” names come from?
PhotoStructure reads person names from three sources:
- tagFaceRegions (default: true) – Extracts names from face region metadata following the MWG face region standard, as written by Adobe Lightroom, digiKam, Picasa, and many other tools.
- tagJsonFaces (default: true) – Extracts names from the peopleNames field in Google Takeout JSON sidecar files.
- whoTags – A configurable list of metadata fields to search for person names. Defaults include People, PersonInImage, and several other standard fields.
