Abstract: Understanding the geographic reach and community structure of one’s scholarly citations is increasingly
valuable for career development, grant applications, and collaboration discovery—yet accessible tools for
answering these questions remain scarce. Existing bibliometric platforms either require costly institutional
subscriptions or expose only aggregate citation counts without granular per-author metadata.
We present CiteRadar, an open-source system that accepts a single Google Scholar user identifier and
automatically produces a structured output folder containing: the author’s complete publication list, all
retrieved citing papers with enriched author metadata, two ranked author tables (by citation frequency and
by h-index), a plain-text statistical summary, and a self-contained interactive HTML world map—all from
a single command-line invocation. CiteRadar integrates five heterogeneous data sources—Google Scholar,
OpenAlex, CrossRef, Semantic Scholar, and OpenStreetMap Nominatim—through a carefully engineered
five-stage pipeline. Key technical contributions include: (1) a Scholar meta-string parser resilient to Unicode
non-breaking-space separators, a pervasive but undocumented quirk in Scholar’s HTML that silently corrupts
venue and year fields when unhandled; (2) a two-stage author disambiguation system using stop-word-
filtered institution name similarity to guard against the well-known same-name entity-merging failure
mode in bibliometric databases, demonstrated to eliminate h-index attribution errors of up to 9× the correct
value; (3) an OpenAlex web-URL to API-URL conversion fix that raises the fraction of author records with
city-level location data from 0% to ≈60%; and (4) a logarithmically scaled interactive Folium world map
with per-city researcher popups, rendered as a fully self-contained HTML file. CiteRadar is available at
https://github.com/chenxuniu/citeradar and installable via pip install citeradar.
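
The four contributions listed above can be illustrated with a minimal Python sketch. It is not CiteRadar's actual code: every function name, the stop-word list, the assumed "authors - venue, year - publisher" layout of the Scholar meta string, the Jaccard similarity measure, and the radius constants are assumptions made for illustration only. The OpenAlex mapping assumes web-style identifiers such as https://openalex.org/I123 and the public API entity endpoints (authors, institutions, works).

```python
import math
import re

# Unicode space variants that may stand in for an ordinary space (assumed set).
NBSP_CHARS = ("\u00a0", "\u202f", "\u2007")

# Hypothetical stop words that carry little identifying signal in institution names.
STOP_WORDS = {"university", "of", "the", "institute", "and", "for", "at", "college"}

# OpenAlex ID prefix letter -> API entity endpoint.
ENTITY_BY_PREFIX = {"A": "authors", "I": "institutions", "W": "works"}


def parse_scholar_meta(meta: str) -> dict:
    """Split an assumed 'authors - venue, year - publisher' meta string."""
    for ch in NBSP_CHARS:
        meta = meta.replace(ch, " ")  # normalise non-breaking spaces first
    parts = [p.strip() for p in meta.split(" - ")]
    authors = parts[0]
    venue_year = parts[1] if len(parts) > 1 else ""
    match = re.search(r"\b(?:19|20)\d{2}\b", venue_year)
    year = int(match.group(0)) if match else None
    venue = venue_year[: match.start()].rstrip(" ,") if match else venue_year
    return {"authors": authors, "venue": venue, "year": year}


def institution_similarity(a: str, b: str) -> float:
    """Jaccard overlap of institution-name tokens after stop-word removal."""
    tokens_a = {t for t in re.findall(r"[a-z]+", a.lower()) if t not in STOP_WORDS}
    tokens_b = {t for t in re.findall(r"[a-z]+", b.lower()) if t not in STOP_WORDS}
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def openalex_api_url(web_url: str) -> str:
    """Convert an OpenAlex web URL to its API URL; return other input unchanged."""
    match = re.fullmatch(r"https?://openalex\.org/([AIW]\d+)", web_url.strip())
    if match is None:
        return web_url
    entity_id = match.group(1)
    return f"https://api.openalex.org/{ENTITY_BY_PREFIX[entity_id[0]]}/{entity_id}"


def marker_radius(count: int, base: float = 4.0, scale: float = 3.0) -> float:
    """Logarithmically scaled map-marker radius for a city with `count` researchers."""
    return base + scale * math.log1p(count)


if __name__ == "__main__":
    raw = "A Author, B Writer\u00a0- Journal of Tests, 2021\u00a0- example.org"
    print(parse_scholar_meta(raw))
    print(institution_similarity("University of Tokyo", "University of Toronto"))
    print(openalex_api_url("https://openalex.org/I136199984"))
    print(marker_radius(25))
```

Without the normalisation step in parse_scholar_meta, splitting on " - " would not separate the fields in the example string, because the character before each hyphen is a non-breaking space. A radius computed by marker_radius could be passed to a Folium CircleMarker, so that cities with many citing researchers do not visually swamp the rest of the map.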