melune — a patient tutor for spoken language

about

What melune is.

The unit of practice is a conversation. You sit down for fifteen minutes, speak with the tutor in your target language, and end the session. The tutor listens patiently, asks follow-ups, and remembers what you've said in previous sessions — your job, your family, the trip you took last month, the show you're watching.

Afterwards, a written replay opens. Grammatical corrections are attached inline to each of your turns, each with a short explanation of why. Pronunciation notes use IPA and sit alongside the audio you actually produced. New vocabulary the tutor used gets queued for review later.

Over time, melune builds a picture of where you tend to slip — which case you keep getting wrong, which vowel you keep collapsing — and the next day's session is shaped around that. The recommendation comes with its reasoning visible: which categories it's targeting and why it picked them.
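As a rough illustration of shaping a session around a weakness profile, here is a minimal sketch in Python. It assumes each past correction has been tagged with a category label, then picks the most frequent categories and attaches a plain-language reason to each. The names `Target` and `pick_targets` and the category labels are hypothetical, and simple frequency counting is an assumption, not a description of how melune actually ranks weaknesses.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Target:
    category: str  # hypothetical label, e.g. "dative case"
    errors: int    # how many recent corrections fell in this category
    reason: str    # the explanation shown alongside the plan


def pick_targets(recent_errors: list[str], max_targets: int = 2) -> list[Target]:
    """Pick the most frequent error categories and say why each was chosen."""
    counts = Counter(recent_errors)
    total = sum(counts.values())
    targets = []
    for category, n in counts.most_common(max_targets):
        share = round(100 * n / total)
        reason = f"{category}: {n} of {total} recent corrections ({share}%)"
        targets.append(Target(category, n, reason))
    return targets


# Made-up correction history, for illustration only.
errors = [
    "dative case", "vowel length", "dative case",
    "word order", "dative case", "vowel length",
]
plan = pick_targets(errors)
for target in plan:
    print(target.reason)
```

With the made-up history above, the plan targets "dative case" and "vowel length", and each target carries the count that justified it. Keeping the reason next to the target is what lets the recommendation show its work.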

approach

Three things I care about.

These shape almost every design decision in the app. They're more useful than a feature list because they explain what gets left out, too.

i.

Conversation first

Spoken practice is the central activity. Drills, vocabulary, and reading exist to support what you do when you're actually talking.

ii.

Show the work

Corrections explain why, not just what. The proficiency estimate shows its confidence. When the curriculum picks today's session, the reasoning is right there.

iii.

Calm, not loud

No streaks, no XP, no badges, no mascot. Stats are for reflection, not pressure. Long-form feedback is treated like prose, not chat.

where it is

What's working today.

Honest snapshot of where the build is. Items marked working are wired end-to-end in internal alpha; in progress means the engine is done but the UI is being rebuilt; planned means designed but not yet started.

● Working
Live voice conversation
Real-time spoken dialogue with the tutor in your target language. Per-turn audio is captured for replay, and you can pause and resume the mic without ending the session.
● Working
Written session replay
After each session, a typeset transcript opens with grammatical corrections, phoneme-level pronunciation notes, and a short summary of themes and highlights.
● Working
Adaptive placement test
A three-phase test (written items, spoken items, free conversation) that adjusts difficulty as it goes. Produces a proficiency estimate with a confidence band and a per-category map of weak spots.
● Working
Adaptive curriculum
"Today's session" is computed from your current weakness profile. Each plan shows its targets, an i+1 stretch goal, and the reasoning behind the choice.
● Working
Per-language memory
The tutor maintains a running, editable understanding of who you are and uses it to personalize sessions. You can see exactly what's stored about you and remove or edit anything.
● Working
Vocabulary deck
Cards auto-populate from real tutor turns above your level. An "I knew this" bulk action clears the unconfirmed queue; the review loop runs on FSRS scheduling. The card editor is being redesigned.
● Working
Drills
Five to ten focused items in a row, each targeting one grammar or pronunciation feature drawn from your weakness profile. Voice-graded, with immediate per-item feedback.
● Working
Listening & reading modes
At-level comprehensible-input monologues, sentence-by-sentence shadowing with pronunciation feedback, and a paste-any-text reader with tap-for-translation and adjustable speech rate.
○ Planned
Pragmatics as a first-class target
Speech acts, register, honorifics, and modal particles get their own learning track, shown on a slower cadence than grammar because the research suggests pragmatic competence accrues through sustained attention rather than weekly wins.
○ Planned
Managed connection
An optional subscription that handles the API connection for you, so you don't have to manage your own key. Conversation minutes will be capped rather than unlimited, because spaced practice beats marathon practice for durable gains.
○ Planned
More languages
The data model, taxonomy, and analysis prompts are already language-agnostic. German and Korean are the current tracks; Japanese, Mandarin, and Spanish are next.
○ Planned
iOS companion
Read-only at first: browse past replays, review the deck, listen to comprehensible input on the go. Live conversation stays on the desktop until it can be done properly on a phone.
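To make the vocabulary deck item above a little more concrete, here is a minimal Python sketch of the queueing step only. It assumes every word in a tutor turn comes with a difficulty level on the same scale as the learner's level. The `Deck` class, the integer scale, and the word levels are all hypothetical. It also assumes that words cleared by the bulk action are remembered as known so they are not queued again. FSRS scheduling is deliberately left out.

```python
from dataclasses import dataclass, field


@dataclass
class Deck:
    learner_level: int  # hypothetical integer scale, higher is harder
    unconfirmed: list[str] = field(default_factory=list)
    known: set[str] = field(default_factory=set)

    def ingest_turn(self, words: dict[str, int]) -> None:
        """Queue words from one tutor turn that sit above the learner's level."""
        for word, level in words.items():
            already_seen = word in self.known or word in self.unconfirmed
            if level > self.learner_level and not already_seen:
                self.unconfirmed.append(word)

    def i_knew_these(self) -> int:
        """Bulk action: mark the whole queue as known and clear it."""
        cleared = len(self.unconfirmed)
        self.known.update(self.unconfirmed)
        self.unconfirmed.clear()
        return cleared


# Word levels below are invented for the example.
deck = Deck(learner_level=2)
deck.ingest_turn({"Haus": 1, "allerdings": 4, "Umstand": 3})
print(deck.unconfirmed)
```

Only the words rated above the learner's level reach the queue, so the deck grows from what the tutor actually said and not from a fixed word list.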

method

Where the research shows.

A few decisions worth flagging — places where the literature pushed back against what would have been easier or more profitable to ship.

Pronunciation feedback is reserved for scripted practice. Shadowing, drills, and read-aloud get phoneme-level analysis with IPA. Free conversation, scenarios, and curriculum sessions don't. The L2 pronunciation literature is consistent that instruction transfers to controlled production but not to spontaneous speech, and forced aligners degrade on spontaneous audio. So melune spends pronunciation effort where it actually pays off, and keeps conversation focused on grammar, vocabulary, pragmatics, and fluency.
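The split described above amounts to a simple routing rule. The Python sketch below expresses it as a lookup from activity type to whether phoneme-level analysis runs. The `Activity` enum and the function name are hypothetical, and the real app may well structure this differently.

```python
from enum import Enum


class Activity(Enum):
    SHADOWING = "shadowing"
    DRILL = "drill"
    READ_ALOUD = "read-aloud"
    FREE_CONVERSATION = "free conversation"
    SCENARIO = "scenario"
    CURRICULUM_SESSION = "curriculum session"


# Scripted activities, where the learner produces controlled speech.
SCRIPTED = {Activity.SHADOWING, Activity.DRILL, Activity.READ_ALOUD}


def gets_phoneme_analysis(activity: Activity) -> bool:
    """Return True only for scripted practice; spontaneous speech is skipped."""
    return activity in SCRIPTED


for activity in Activity:
    label = "phoneme-level notes" if gets_phoneme_analysis(activity) else "no pronunciation notes"
    print(f"{activity.value}: {label}")
```

Keeping the rule in one place means the decision is made once per activity type, which matches the reasoning above: the choice depends on whether the speech is controlled, not on how any individual session went.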

Sessions are capped near fifteen minutes. Distributed practice beats marathon practice for durable gains across syntax, morphology, fluency, and pronunciation. Four to six short sessions per week is the research-aligned shape. The session cap is treated as a learning feature, not a platform limit — and the same logic will shape any future subscription tier.

The level label moves slowly. An internal proficiency estimate updates after every session, because that's what the curriculum needs to plan tomorrow. But the level you see is a slower, hedged display — with a confidence band and the evidence behind it shown — so a single noisy session doesn't shove your level around.
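One way to get a label that resists a single noisy session is to move it only when the whole confidence band agrees. The Python sketch below assumes per-session estimates on a continuous scale where the integer part is the level, and uses the standard error of the recent mean as the band. The `LevelDisplay` class, the window size, and the three-session minimum are all hypothetical choices made for illustration, not melune's actual estimator.

```python
from math import floor, sqrt
from statistics import mean, stdev


class LevelDisplay:
    def __init__(self, shown_level: int, window: int = 5, min_sessions: int = 3):
        self.shown_level = shown_level    # the slow label the learner sees
        self.window = window              # how many recent sessions count
        self.min_sessions = min_sessions  # evidence needed before the band narrows
        self.estimates: list[float] = []  # fast internal per-session estimates

    def record(self, session_estimate: float) -> tuple[float, float]:
        """Store one session's estimate and return the (low, high) band."""
        self.estimates.append(session_estimate)
        recent = self.estimates[-self.window:]
        center = mean(recent)
        if len(recent) >= self.min_sessions:
            half_width = stdev(recent) / sqrt(len(recent))  # standard error
        else:
            half_width = 1.0  # too little evidence: keep the band wide
        low, high = center - half_width, center + half_width
        # Move the label only when the entire band sits inside one other level.
        settled = floor(low) == floor(high)
        if settled and floor(low) != self.shown_level:
            self.shown_level = floor(low)
        return low, high


noisy = LevelDisplay(shown_level=2)
for estimate in [2.4, 2.5, 2.6, 3.9]:  # one outlier session at the end
    noisy.record(estimate)
print(noisy.shown_level)

steady = LevelDisplay(shown_level=2)
for estimate in [3.3, 3.4, 3.5]:  # consistent evidence for a higher level
    low, high = steady.record(estimate)
print(steady.shown_level)
```

In the first run the outlier widens the band across two levels, so the label stays put. In the second, three consistent sessions place the whole band inside level 3 and the label moves. The internal estimates are still recorded after every session, so planning is not slowed down by the cautious display.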

practicalities

A few things worth knowing.

macOS first. melune is built natively in SwiftUI for the desktop. Sessions happen at a real keyboard with real audio hardware, which is where careful practice tends to happen anyway. An iOS companion will follow.

Bring your own API key, for now. The first builds connect directly to Google's Gemini API using a key you provide. At fifteen minutes of conversation a day plus a few drill or shadowing sessions a week, that runs to roughly $11–14 a month at current API prices. A managed-connection subscription will follow, so you won't have to handle the key yourself.

Two languages at the start. German and Korean are the initial target languages — chosen because they exercise different parts of the architecture (case morphology in one, speech levels and an honorific system in the other). More languages come once those two feel right.