Project: Alexander Kustov Academic Website
Overview
Jekyll 4.3 academic website using the AcademicPages remote theme (Minimal Mistakes fork). Hosted on GitHub Pages. Multilingual: 12 languages (EN, ES, FR, PT, DE, IT, PL, RU, JP, KO, TR, AR).
Architecture
Languages & Directories
| Code | Language | Directory | Notes |
|------|----------|-----------|-------|
| en | English | _pages/ | Default, no prefix |
| es | Spanish | es/ | Latin American conventions |
| fr | French | fr/ | Metropolitan French |
| pt | Portuguese | pt/ | Brazilian Portuguese (pt-BR) |
| de | German | de/ | Hochdeutsch |
| it | Italian | it/ | |
| pl | Polish | pl/ | |
| ru | Russian | ru/ | Cyrillic script |
| jp | Japanese | jp/ | CJK; site uses “jp” internally, hreflang maps to “ja” for SEO |
| ko | Korean | ko/ | CJK |
| tr | Turkish | tr/ | |
| ar | Arabic | ar/ | RTL script; dir="rtl" set on <html> |
Key Directories
- _layouts/ – Jekyll layouts (see Layouts section below)
- _includes/ – Partials (see Key Includes section below)
- _sass/ – SCSS (vendor/susy grid, Minimal Mistakes theme, Breakpoint, Font Awesome 5, Magnific Popup, Academicons)
- _data/ – Data files (navigation.yml, ui-text.yml, authors.yml, carousel.yml, newsletter_posts.yml)
- _translations/source/ – English source markdown for newsletter posts (13 files)
- _translations/agent_{1,2,3}/{lang}/ – Translation agent outputs (working files)
- {lang}/newsletter/ – Final translated newsletter posts (live on site)
- assets/css/ – Custom CSS: custom.css (dark mode, RTL, layout), newsletter-post.css, academicons.css, collapse.css, main.scss
- assets/js/ – JavaScript: main.min.js (bundles jQuery 1.12.4 + plugins), _main.js (source), collapse.js, plugins/ (jQuery plugins), vendor/ (jQuery source)
- talkmap/ – Leaflet-based talk location maps (inactive, talkmap_link: false in config)
- markdown_generator/ – Jupyter notebooks & Python scripts for TSV-to-markdown conversion (legacy tooling)
Layouts
- default – Base layout (extends compress); includes masthead, footer, dark mode toggle, text-expand
- compress – HTML compression wrapper layout
- single – Standard page layout with sidebar
- archive – Collection/list pages
- archive-taxonomy – Category/tag archive pages
- newsletter-post – Substack-inspired minimal reading layout (680px centered, serif body, hero image, footnotes, JSON-LD Article schema)
- splash – Full-width layout without sidebar
- talk – Specialized layout for talks/presentations
Key Includes
- masthead.html – Top navigation with language dropdown (reads site.languages from _config.yml)
- language-switcher.html – Inline language links (reads site.languages from _config.yml)
- hreflang.html – SEO alternate language tags; maps jp to ja for hreflang
- seo.html – Open Graph, Twitter cards, canonical URLs, og:locale per language
- author-profile.html – Sidebar author section with social links
- events-sidebar.html – Book tour/events sidebar (activated via show_events: true)
- text-expand.html – Vanilla JS [expand]/[/expand] block expander
- head/custom.html – Favicons, MathJax 3.x, dark mode flash prevention CSS, external CSS links
- analytics.html + analytics-providers/custom.html – Google Analytics (gtag, measurement ID: G-J78N1YFWN8)
- footer.html – Copyright, theme attribution, AI translation disclosure for non-EN pages
Dark Mode System
- Toggle button: #dark-mode-toggle in masthead with sun/moon icons
- Storage: localStorage.setItem('darkMode', true/false) for persistence across pages
- Flash prevention: Inline CSS in head/custom.html with html.dark-mode-pending class applied before body renders
- JS implementation: In _layouts/default.html – reads localStorage, applies body.dark-mode class, toggles on click
- Styles: assets/css/custom.css – 400+ lines covering all components (masthead, sidebar, tables, accordions, newsletter cards, blockquotes, code blocks, events sidebar)
- Color palette: Background #1a1a2e, text #d4d4dc, links #6eaadc, masthead #16162a
Configuration Reference (_config.yml)
- languages: – Array of {code, label} objects defining supported languages and display order (used by masthead, language-switcher)
- analytics.provider: "custom" – Uses custom gtag.js include
- analytics.google.measurement_id: "G-J78N1YFWN8" – GA4 measurement ID
- talkmap_link: false – Disables talkmap feature on talks page
- compress_html: – HTML compression (clippings: all, ignored in development)
- future: true – Allows future-dated posts
- whitelist: – Plugins allowed in --safe mode (for GitHub Pages compatibility)
- Collections defined: teaching, publications, portfolio, talks (mostly empty, inherited from theme)
Newsletter System
- English newsletter cards (_pages/newsletter.md) link to Substack URLs
- Non-English newsletter cards ({lang}/newsletter.md) link to local translated pages at /{lang}/newsletter/{slug}/
- Each translated post uses layout: newsletter-post with front matter: title, subtitle, date, permalink, lang, ref, original_url, image
- Hero images served from Substack CDN (substackcdn.com)
- Footnotes use kramdown native syntax: [^1] with [^1]: text at end
Build & Test
# Build site
bundle exec jekyll build
# Serve locally
bundle exec jekyll serve --port 4000 --no-watch
# Build takes ~20-30 seconds with all translations
# Rebuild minified JS (requires npm/node)
npm run build:js
# This runs: uglifyjs jquery + plugins + _main.js → main.min.js
Requirements
- Ruby (with bundler) – Gemfile specifies jekyll ~> 4.3, includes wdm gem for Windows file watching
- Node.js (optional) – Only needed for rebuilding main.min.js via npm run build:js
- _config.dev.yml – Development overrides (localhost URL, analytics disabled); use with bundle exec jekyll serve --config _config.yml,_config.dev.yml
Translation Best Practices
Critical: UTF-8 Diacritics
The #1 issue encountered: Translation agents sometimes output ASCII-safe characters instead of proper UTF-8. This MUST be caught and rejected.
Language-specific checks:
- Spanish: Must have accents – “académicos” NOT “academicos”, “también” NOT “tambien”, “más” NOT “mas”
- French: Must have accents – “réfugiés” NOT “refugies”, “vérités” NOT “verites”, “écrire” NOT “ecrire”
- German: Must have umlauts/eszett – “über” NOT “ueber”, “für” NOT “fuer”, “müssen” NOT “muessen”, “Straße” NOT “Strasse”
- Italian: Must have accents – è, é, à, ò, ù, ì
- Polish: Must have diacritics – ą, ć, ę, ł, ń, ó, ś, ź, ż
- Portuguese: Must have diacritics – ã, õ, ç, á, é, í, ó, ú, â, ê, ô
- Russian: Standard UTF-8 Cyrillic
- Japanese: Standard UTF-8 kanji/hiragana/katakana
- Korean: Standard UTF-8 Hangul
- Turkish: Must have special chars – ğ, ı, ö, ş, ü, ç, İ
- Arabic: Standard UTF-8 Arabic script (RTL)
Quick validation command:
# Check German for ASCII umlauts (should return 0 matches for good files)
grep -c "ueber\|fuer\|muessen\|Laender\|aeuf\|oeff" de/newsletter/*.md
# Check French for missing accents
grep -c "verite\|refugie\|qualifie\|universite" fr/newsletter/*.md
# Check Spanish for missing accents
grep -c " mas \| tambien\|academico\|inmigracion " es/newsletter/*.md
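The grep checks above can also be sketched in Python for use inside a script. This is a minimal sketch, not part of the repository: the function name find_ascii_substitutes and the pattern table are assumptions, built only from the example words listed in this section. A hit is a candidate for review rather than proof of an error (for instance, "qualifie" also matches the legitimate French infinitive "qualifier").

```python
import re
from pathlib import Path

# Hypothetical pattern table built from the examples in this section.
# Each regex matches an ASCII-safe spelling that should carry a diacritic.
ASCII_SUBSTITUTES = {
    "de": re.compile(r"ueber|fuer|muessen|Laender|Strasse"),
    "fr": re.compile(r"verite|refugie|qualifie|universite"),
    "es": re.compile(r"\bmas\b|\btambien\b|academico|inmigracion"),
}


def find_ascii_substitutes(text: str, lang: str) -> list[str]:
    """Return every suspicious ASCII-only spelling found in text for lang."""
    pattern = ASCII_SUBSTITUTES.get(lang)
    if pattern is None:
        return []  # this sketch defines no check for the language
    return pattern.findall(text)


if __name__ == "__main__":
    # Scan the live German posts; any printed line needs a manual look.
    for path in sorted(Path("de/newsletter").glob("*.md")):
        hits = find_ascii_substitutes(path.read_text(encoding="utf-8"), "de")
        if hits:
            print(f"{path}: {hits}")
```

Extending the table to Polish, Portuguese, or Turkish would need word lists that this document does not provide.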
Translation Workflow (3-Agent Voting)
MANDATORY for ALL translation work — including small front page edits. Never skip this workflow even for single-sentence changes.
- Extract source content from Substack to _translations/source/
- Launch 3 independent translation agents per language (agent_1, agent_2, agent_3)
- Each agent translates all 13 posts independently
- Deliberation agent per language compares all 3 and picks best per post
- Final files written to {lang}/newsletter/
For front page / about page edits: The same 3-agent voting applies. Launch 3 agents per language, each producing an independent translation of the changed text. A deliberation agent picks the best version. This catches awkward phrasing that a single-pass translation misses.
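Before the deliberation step it can help to confirm that all three drafts exist. The sketch below shows one way to do that with the directory layout described in Key Directories. The helper names are hypothetical, and the {slug}.md file name inside each agent folder is an assumption, since this document only says those folders hold working files.

```python
from pathlib import Path

AGENTS = (1, 2, 3)  # the workflow launches three independent agents


def candidate_paths(root: Path, lang: str, slug: str) -> list[Path]:
    """Paths where the three agent drafts for one post are expected.

    Assumption: each agent writes {slug}.md inside its language folder.
    """
    return [
        root / "_translations" / f"agent_{n}" / lang / f"{slug}.md"
        for n in AGENTS
    ]


def missing_candidates(root: Path, lang: str, slug: str) -> list[Path]:
    """Drafts that do not exist yet; deliberation should wait until empty."""
    return [p for p in candidate_paths(root, lang, slug) if not p.is_file()]
```

Calling missing_candidates(Path("."), "de", "welcome-to-popular-by-design") from the repository root would list any German draft of that post that has not been written yet.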
Translation Quality Rules
- Natural fluency over literal accuracy: If a direct translation sounds awkward or stilted in the target language, use a simpler, more natural synonym. The translated text should read as if originally written in that language.
- Loanwords and cognates: If a concept is commonly expressed using an English loanword in the target language (e.g., “фокус” in Russian for “focus”, “フォーカス” in Japanese), prefer the loanword over a clunky native equivalent.
- Avoid bureaucratic/academic jargon: Prefer everyday equivalents. E.g., in Russian: “демократическую политику” (democratic politics) is better than “демократическую выработку политики” (democratic policy-making process); “поддержать” (support) is better than “принять” (accept) when the meaning is about endorsing rather than receiving.
- Read the sentence aloud: If a translated sentence would sound unnatural spoken aloud to an educated native speaker, rephrase it.
- Sentence structure: Don’t mirror English syntax when the target language has different natural word order. Restructure sentences to flow naturally.
- Consistency check: After translating, compare the translated paragraph against the English source and ask: “Does this convey the same meaning with the same tone, without any phrase that would make a native speaker pause?”
- First sentence of bio pages is LOCKED: The opening sentence of each translated front page (index.md) uses a deliberately simplified form — “professor of migration at the University of Notre Dame” in the native language. Do NOT replace this with a literal translation of the English source (which mentions the Keough School). The simplified form is intentional. Only change it if the user explicitly asks. The pattern is:
[Name] [is a] professor of migration at [University of Notre Dame in native language].
- ES: “Alexander Kustov es profesor de migración en la Universidad de Notre Dame.”
- FR: “Alexander Kustov est professeur de migration à l’Université de Notre Dame.”
- PT: “Alexander Kustov é professor de migração na Universidade de Notre Dame.”
- DE: “Alexander Kustov ist Professor für Migration an der Universität Notre Dame.”
- IT: “Alexander Kustov è professore di migrazioni presso la University of Notre Dame.”
- PL: “Alexander Kustov jest profesorem migracji na Uniwersytecie Notre Dame.”
- RU: “Александр Кустов — профессор миграции в Университете Нотр-Дам.”
- JP: “アレクサンダー・クストフはノートルダム大学の移民研究の教授である。”
- KO: “알렉산더 쿠스토프는 노트르담 대학교의 이민 연구 교수이다.”
- TR: “Alexander Kustov, Notre Dame Üniversitesi’nde göç alanında profesördür.”
- AR: “ألكسندر كوستوف أستاذ مشارك في كلية كيو للشؤون العالمية بجامعة نوتردام.”
Translation Rules
- Translate title, subtitle, and body; keep all hyperlinks as English URLs
- Translate footnote content but keep markers [^1], [^2]
- Translate image alt text
- Idioms: substantive equivalent, NOT literal translation
- Name handling: “Alexander Kustov” in Latin script for all languages except Russian (“Aleksandr Kustov”) and Japanese (“Aleksanda Kusutofu” in katakana)
- Preserve all markdown formatting exactly
- Do NOT add or remove content
- Do NOT translate proper nouns unless established translations exist
Publications & Media Translation Format
When translating article/piece titles in publications.md and media.md, use two separate <a> tags — one for the translated title, one for the English original in parentheses:
CORRECT (two links):
<a href="URL">Translated Title</a> (<a href="URL">English Title</a>)
WRONG (single link — causes BiDi rendering issues, especially in Arabic RTL):
<a href="URL">Translated Title (English Title)</a>
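The two-link format can be generated and checked with a small helper. This is a sketch under stated assumptions: two_link_title and has_single_link_title are hypothetical names, the inputs are assumed to be plain text rather than pre-escaped HTML, and the detection regex is a heuristic that will also flag a title that legitimately ends in a parenthetical.

```python
import re
from html import escape


def two_link_title(url: str, translated: str, english: str) -> str:
    """Build the two-anchor form: translated title, then English in parentheses."""
    href = escape(url, quote=True)
    return (
        f'<a href="{href}">{escape(translated)}</a> '
        f'(<a href="{href}">{escape(english)}</a>)'
    )


# Heuristic: anchor text that ends with a parenthesised part, such as
# <a href="...">Translated (English)</a>, is treated as the single-link form.
SINGLE_LINK = re.compile(r"<a\b[^>]*>[^<]*\([^<)]*\)\s*</a>")


def has_single_link_title(html: str) -> bool:
    """True if html appears to contain the single-link (WRONG) form."""
    return SINGLE_LINK.search(html) is not None
```

Running has_single_link_title over each translated publications.md and media.md gives a quick first pass; Arabic pages still need a visual check.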
For publications:
- Translate article titles; keep author names, journal names, volume/page numbers in English
- Translate filter button labels and group labels
- Translate abstract text and resource labels (“Final Draft” → “Versión Final”, etc.)
- Keep DOIs, URLs, and all data-* attributes unchanged
For media:
- Translate card titles with English original in parentheses (two <a> tags as above)
- Translate filter labels (Topic/Format) and filter button text
- Translate format tags (Interview, Op-ed, Analysis, etc.)
- Keep outlet names, dates, favicon URLs unchanged
Arabic (RTL) Specific Rules
Arabic pages use dir="rtl" on <html>. Special CSS in assets/css/custom.css handles:
- Sidebar: Flipped to right side (profile pic, book cover on right)
- Events sidebar: Flipped to LEFT side with left: 0; right: auto. Content area gets margin-left: 230px via :has(.events-sidebar) to avoid overlap
- Publication citations: direction: ltr; unicode-bidi: isolate to prevent BiDi scrambling of mixed English/Arabic text
- Abstracts: Kept in native RTL
- Media cards: Title in RTL, metadata isolated in LTR
- Article-accordion (about page): Citations isolated as LTR, abstracts RTL, expand icon on left
- Book page: Cover floats left, descriptions right-aligned, review borders on right side
- Navigation/masthead: RTL direction
Arabic content preferences (these apply to Arabic only, NOT other languages):
- Publications page: Arabic title on its own line above, then the regular English citation below (same format as the English publications page). The <span class="ar-title"> sits outside/before the <span class="pub-citation">.
- About/front page: No “select articles” section. The Arabic about page has only the bio text, no article accordion.
- Book page & press links: Use Arabic transliterations for organization names (e.g., “أمازون” not “Amazon”, “فورين أفيرز” not “Foreign Affairs”) to avoid BiDi misalignment. Reduce English title/publisher font size.
- Ongoing research page: Arabic section headings (right-aligned), but paper titles and author names stay in English (left-aligned).
- English content alignment: All English publication content (citations, resources, media mentions) is LEFT-aligned on Arabic pages, per W3C RTL guidelines and Arab academic journal conventions (foreign refs are left-aligned, Arabic refs are right-aligned). Arabic text (titles, abstracts, headings) remains right-aligned.
- Name, position, and events are NOT translated into Arabic (or any language).
When adding new content to Arabic pages:
- Use Arabic transliterations for organization/publication names wherever possible to minimize BiDi issues
- For mixed Arabic/English text in headings, use dir="rtl" on the container and dir="ltr" on English <a> tags
- Put email addresses on a separate line (<br>) to avoid BiDi mixing with Arabic text
- Test rendering of mixed-direction text: periods, question marks, and parentheses can get misplaced
- Keep the expand/collapse icon on the left side (CSS handles this via summary::after position swap)
- ALWAYS visually verify Arabic pages before committing; BiDi issues are not visible in source code
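The email rule above lends itself to a rough source-level check. The sketch below is an illustration only: the function name is hypothetical, the Arabic test covers just the basic Unicode block U+0600 to U+06FF, and the email pattern is a loose heuristic for both plain and obfuscated addresses. It cannot replace visual verification.

```python
import re

ARABIC = re.compile(r"[\u0600-\u06FF]")
# Loose match for plain or obfuscated addresses, e.g. "name [at] nd [dot] edu".
EMAIL = re.compile(r"\S+\s*(?:@|\[at\])\s*\S+")
# Rendered lines end at a newline in the source or at an explicit <br>.
BREAK = re.compile(r"<br\s*/?>|\n")


def segments_mixing_arabic_and_email(text: str) -> list[str]:
    """Return segments that contain both Arabic script and an email address."""
    return [
        seg.strip()
        for seg in BREAK.split(text)
        if ARABIC.search(seg) and EMAIL.search(seg)
    ]
```

An empty result means every email address sits in its own segment, separated from Arabic text by a newline or a <br>.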
Front Matter Template for Newsletter Posts
---
layout: newsletter-post
title: "TRANSLATED TITLE"
subtitle: "TRANSLATED SUBTITLE"
date: YYYY-MM-DD
permalink: /{lang}/newsletter/{slug}/
lang: {lang}
ref: newsletter-{slug}
original_url: https://alexanderkustov.substack.com/p/{slug}
image: https://substackcdn.com/image/fetch/...
author_profile: false
---
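A quick way to catch an incomplete front matter block is to parse it and list the missing keys. The sketch below is a simplified stand-in for a real YAML parser and handles only the flat, single-line key: value pairs used in the template above. The function names are hypothetical, and the required-key list combines layout with the fields named in the Newsletter System section.

```python
REQUIRED_KEYS = (
    "layout", "title", "subtitle", "date", "permalink",
    "lang", "ref", "original_url", "image",
)


def parse_front_matter(text: str) -> dict[str, str]:
    """Parse flat key: value pairs between the opening and closing --- lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of front matter; the post body follows
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    return fields


def missing_keys(text: str) -> list[str]:
    """Required keys that are absent or empty in the front matter."""
    fields = parse_front_matter(text)
    return [key for key in REQUIRED_KEYS if not fields.get(key)]
```

Splitting on the first colon only keeps URL values such as original_url intact.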
CJK Read Time
The newsletter-post layout has special handling for Japanese read time using number_of_words: "cjk" with a 500 chars/min reading speed (vs 160 words/min for non-CJK).
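The arithmetic behind the two reading speeds can be sketched as follows. This illustrates the calculation and is not the layout's Liquid code. Counting non-whitespace characters for CJK text, rounding up, and the one-minute floor are all assumptions of this sketch.

```python
import math

CJK_CHARS_PER_MIN = 500  # reading speed stated for CJK text
WORDS_PER_MIN = 160      # reading speed stated for non-CJK text


def read_time_minutes(text: str, cjk: bool) -> int:
    """Estimated reading time in whole minutes (rounded up, at least 1)."""
    if cjk:
        # CJK text has no spaces between words, so count characters instead.
        units = sum(1 for ch in text if not ch.isspace())
        rate = CJK_CHARS_PER_MIN
    else:
        units = len(text.split())
        rate = WORDS_PER_MIN
    return max(1, math.ceil(units / rate))
```

Under these assumptions a 3,200-word English post comes out at 20 minutes (3200 / 160).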
Common Pitfalls
- Diacritics loss – Always verify UTF-8 characters after translation. Some agents output ASCII-safe substitutes.
- Jekyll _ directories – Directories starting with _ are not processed as pages by default (except predefined ones like _posts). The _translations/ directory is intentionally excluded.
- Substack CDN images – Hero images use Substack CDN URLs. These may break if Substack changes their CDN structure.
- Chrome auto-translate – When testing non-English pages locally, Chrome may auto-translate them back to English, making it appear that translations are broken when they’re actually fine. Check the actual HTML source.
- Language switcher – Dropdown in masthead and inline switcher both read from site.languages in _config.yml (centralized). Each page must have both lang and ref fields in front matter for the switcher to appear.
- hreflang mapping – The site’s _includes/hreflang.html maps jp to ja for proper SEO. Generic: iterates all pages with matching ref.
- Bio style – All non-English bios use third person (“Alexander Kustov is…”); English uses first person (“I am…”).
- Book title format – Use “double parentheses” style: Title (Translated Title) (Publisher, Year).
- AI translation disclosure – Footer includes disclosure for non-EN pages. Newsletter posts also have individual disclosure.
- BiDi rendering – Arabic (RTL) pages require two separate <a> tags for translated+English titles. Putting both in one <a> tag causes BiDi scrambling of punctuation, spaces, and reading order.
- Front matter lang and ref – EVERY page (including English source pages) MUST have lang: and ref: in front matter, or the language switcher won’t appear. English pages use lang: en.
- Events sidebar – Only shown on pages with show_events: true in front matter (currently only about/index pages).
- Navigation order – Media tab comes before Ongoing Research in all language nav blocks.
- Pre-commit verification – ALWAYS preview RTL/Arabic pages visually (screenshot or local serve) before committing. BiDi issues are invisible in source code and can only be caught by rendering the page.
- Email in RTL context – English email addresses (akustov [at] nd [dot] edu) should be on a separate line in Arabic pages to prevent BiDi mixing with surrounding Arabic text.
- Book page RTL – Book cover image floats LEFT (not right) on Arabic pages. Review text borders go on the RIGHT side. Table alignments are flipped.
Seamless Translation Update Workflow
When the user adds or modifies content on the English website and wants translations updated:
IMPORTANT: Always use the 3-Agent Voting workflow (see above) for any translation work, including small text changes on the front page. A single-pass translation frequently produces awkward phrasing that native speakers would notice.
Step 1: Identify Changes
- Compare the updated English file with its translated counterparts
- Identify what’s new, modified, or removed
Step 2: Update All 11 Languages
- Apply the same change to all {lang}/ versions
- For new entries (e.g., new publication, new media card):
  - Translate the title and any translatable text
  - Use two separate <a> tags: <a href="URL">Translated Title</a> (<a href="URL">English Title</a>)
  - Keep author names, journal names, URLs, data-* attributes unchanged
  - Copy the exact same HTML structure as the English original
Step 3: Verify
- Run diacritics checks for FR, DE, ES, PL, PT (the validation commands above cover DE, FR, and ES; check PL and PT against the character lists in Language-specific checks)
- For Arabic: verify two-<a>-tag structure is used (NOT single-link)
- Build site: bundle exec jekyll build
- Check line counts match across languages (media files should all be ~same length)
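The line-count comparison in the last step can be scripted. The sketch below is one possible approach: the function names are hypothetical, and the 10% tolerance around the median is an assumed threshold that this document does not specify.

```python
from pathlib import Path
from statistics import median

# The 11 non-English language directories listed in Languages & Directories.
LANGS = ["es", "fr", "pt", "de", "it", "pl", "ru", "jp", "ko", "tr", "ar"]


def line_counts(root: Path, page: str) -> dict[str, int]:
    """Line count of {lang}/{page} for every language directory that has it."""
    counts = {}
    for lang in LANGS:
        path = root / lang / page
        if path.is_file():
            counts[lang] = len(path.read_text(encoding="utf-8").splitlines())
    return counts


def outliers(counts: dict[str, int], tolerance: float = 0.1) -> list[str]:
    """Languages whose count differs from the median by more than tolerance."""
    if not counts:
        return []
    mid = median(counts.values())
    return sorted(
        lang for lang, n in counts.items() if abs(n - mid) > tolerance * mid
    )
```

Calling outliers(line_counts(Path("."), "media.md")) from the repository root would list the languages whose media page looks too short or too long compared with the others.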
Step 4: Update Navigation (if new pages added)
- Add entries to ALL 12 main-* blocks in _data/navigation.yml
- Add lang: and ref: to English source file front matter
- Add show_events: true if the page should show events sidebar
Quick Reference: Pages Per Language
Each non-EN language directory ({lang}/) should contain:
- index.md – Homepage/about (with show_events: true)
- book.md – Book page
- newsletter.md – Newsletter index
- newsletter/{slug}.md – 13 individual newsletter posts
- publications.md – Published research
- media.md – Media engagement
- ongoing-research.md – Ongoing research
- cv.md – CV page
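The checklist above can be turned into a small audit of one language directory. This is a minimal sketch with a hypothetical function name. It checks only that the files exist and that the newsletter folder holds 13 posts, not that their content is correct.

```python
from pathlib import Path

REQUIRED_PAGES = [
    "index.md", "book.md", "newsletter.md", "publications.md",
    "media.md", "ongoing-research.md", "cv.md",
]
EXPECTED_POSTS = 13  # individual newsletter posts per language


def audit_language_dir(lang_dir: Path) -> list[str]:
    """Return human-readable problems found in one {lang}/ directory."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_PAGES
        if not (lang_dir / name).is_file()
    ]
    posts_dir = lang_dir / "newsletter"
    posts = list(posts_dir.glob("*.md")) if posts_dir.is_dir() else []
    if len(posts) != EXPECTED_POSTS:
        problems.append(
            f"expected {EXPECTED_POSTS} newsletter posts, found {len(posts)}"
        )
    return problems
```

An empty list means the directory matches the checklist.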
File Counts
- 13 newsletter source posts in _translations/source/
- 143 translated newsletter posts (13 posts × 11 languages) in {lang}/newsletter/
- 11 translated newsletter index pages in {lang}/newsletter.md
- 12 homepage variants (EN + 11 translations)
- 12 book page variants
- 11 publications pages, 11 media pages, 11 ongoing-research pages, 11 CV pages
- Navigation entries in _data/navigation.yml for all 12 languages
Substack Posts (13 total)
| Slug | Title | ~Words |
|------|-------|--------|
| academics-need-to-wake-up-on-ai-part | Academics Need to Wake Up on AI, Part II | 3,500 |
| academics-need-to-wake-up-on-ai | Academics Need to Wake Up on AI | 4,000 |
| western-countries-do-not-need-immigration | Western Countries Do Not “Need” Immigration | 3,200 |
| student-migration-is-popularuntil | What’s the Matter with Foreign Students? | 3,000 |
| reflections-on-the-uncomfortable | Reflections on “The Uncomfortable Truths” | 3,500 |
| the-uncomfortable-truths-about-immigration | The Uncomfortable Truths About Immigration | 8,500 |
| immigration-is-not-a-thing-that-has | Immigration Is Not One Thing That Has Effects | 3,200 |
| why-japan-is-so-uncanny-uncannily | Why Japan Is So Uncanny… Uncannily Normal | 3,500 |
| the-immigration-substack-universe | The Immigration Substack Universe | 2,500 |
| do-people-like-refugees-more-than | Do People Like Refugees more than Economic Immigrants? | 3,000 |
| why-dont-you-house-them-yourself | “Why Don’t You House Them Yourself?” | 3,000 |
| why-skilled-migration-is-popular | Why Skilled Migration Is Popular | 3,000 |
| welcome-to-popular-by-design | Welcome to “Popular by Design” | 800 |
