System Architecture Overview
SisterShield is a monolithic Next.js 14 application using the App Router. All server and client code lives in a single repository. The architecture follows a layered design: browser client, Next.js server (pages + API routes), data layer (Prisma ORM + PostgreSQL with pgvector), AI services (LLM providers, embeddings, image generation), a RAG pipeline for evidence-based content, a Knowledge Graph for curriculum mapping, and a Twine pipeline for interactive story building.
System Architecture
RAG + Knowledge Graph Data Flow
The following diagram shows how educational documents flow through the RAG pipeline into the knowledge base, and how that knowledge is used during story generation.
Component Architecture
The UI is organized into three layers.
Layout Components
Located in src/components/layout/, these provide the application shell:
- Header — top navigation bar with logo, nav links, locale switcher, Quick Exit, and Get Help buttons.
- Sidebar — dashboard navigation (role-aware, showing different links for Students vs Teachers).
- Footer — site footer with links and safety resources.
- LocaleSwitcher — dropdown to toggle between Korean and English.
Page Components
Located in src/app/[locale]/, these are Next.js route-based components:
- (auth)/login and (auth)/register: authentication pages.
- (dashboard)/dashboard: role-based home screen.
- (dashboard)/courses/*: course listing, detail, edit, generate, and play pages.
- (dashboard)/submissions/*: submission listing, detail, creation, and review pages.
- (dashboard)/progress: student progress overview.
- (dashboard)/settings: user settings.
- page.tsx (root): public landing page with hero section and evidence.
- pilot/: pilot program request page.
Shared UI Components
Located in src/components/:
- ui/: shadcn/ui primitives (Button, Dialog, Tabs, Toast, Select, Progress, etc.).
- twine/TwinePlayer.tsx: iframe-based Twine story renderer with postMessage communication.
- safety/QuickExit.tsx: emergency exit button (Escape key, clears session, redirects to weather.com).
- safety/GetHelp.tsx: dialog with Korean and international crisis resources.
- courses/CourseCard.tsx: card component for course listings.
- courses/ImagePromptPanel.tsx: UI for AI image generation workflow.
- hero/: landing page components (evidence carousel, stats).
Data Flow Overview
Request Lifecycle
- Browser sends request to Next.js server.
- Middleware detects locale from URL path (/en-US/dashboard or /ko-KR/dashboard) and loads translations.
- NextAuth middleware verifies JWT token from cookie.
- Server component or API route executes, querying PostgreSQL through Prisma.
- Response rendered (RSC for pages, JSON for API routes).
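The locale-detection step of the lifecycle above can be sketched as a small, framework-free function. This is only an illustration: the names detectLocale, LocaleMatch, and FALLBACK_LOCALE are hypothetical, the fallback to en-US is an assumption (the source does not state a default locale), and the real middleware may rely on an i18n library instead.

```typescript
// Locales as they appear in URL paths (taken from the lifecycle description).
const SUPPORTED_LOCALES = ["en-US", "ko-KR"] as const;
type AppLocale = (typeof SUPPORTED_LOCALES)[number];

// Assumption: the default locale is illustrative; the source does not name one.
const FALLBACK_LOCALE: AppLocale = "en-US";

interface LocaleMatch {
  locale: AppLocale;
  // Path with the locale prefix removed, e.g. "/dashboard".
  restPath: string;
  // True when the URL carried a supported locale prefix.
  explicit: boolean;
}

function detectLocale(pathname: string): LocaleMatch {
  // Drop empty segments produced by leading or trailing slashes.
  const segments = pathname.split("/").filter((s) => s.length > 0);
  const first = segments[0];
  const matched = SUPPORTED_LOCALES.find((l) => l === first);
  if (matched !== undefined) {
    return {
      locale: matched,
      restPath: "/" + segments.slice(1).join("/"),
      explicit: true,
    };
  }
  // No recognized prefix: keep the whole path and use the fallback locale.
  return {
    locale: FALLBACK_LOCALE,
    restPath: "/" + segments.join("/"),
    explicit: false,
  };
}
```

A middleware built on this could redirect to a prefixed URL whenever explicit is false, then load the translations for the resolved locale.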
Twine Pipeline
The Twine pipeline converts interactive story content into playable, tracked course HTML:
- Parse: JSDOM extracts tw-storydata/tw-passagedata from Twine HTML; detects format (Harlowe, SugarCube).
- Validate: BFS traversal detects dead links, duplicate passages, and orphan passages.
- Compile: Tweego CLI compiles Twee 3 source to Harlowe-3 HTML (or custom renderer as fallback).
- Inject Tracking: MutationObserver script for passage change detection, progress calculation, quiz scoring, and postMessage communication with the parent frame.
- Build: Orchestrates compile + inject, stores build HTML at builds/{courseId}/v{version}/index.html and Twee source at sources/{courseId}/v{version}/source.twee.
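To make the Validate step concrete, here is a minimal sketch of a BFS-based check over already-parsed passages. The StoryPassage and ValidationReport shapes and the validatePassages name are hypothetical, and the sketch assumes that an "orphan" means a passage unreachable from the start passage; the project's validator may define these differently.

```typescript
interface StoryPassage {
  title: string;
  links: string[]; // titles of the passages this passage links to
}

interface ValidationReport {
  deadLinks: { from: string; to: string }[];
  duplicates: string[];
  orphans: string[];
}

function validatePassages(passages: StoryPassage[], startTitle: string): ValidationReport {
  // Index passages by title; the first occurrence wins, later ones are duplicates.
  const byTitle = new Map<string, StoryPassage>();
  const duplicates = new Set<string>();
  for (const p of passages) {
    if (byTitle.has(p.title)) duplicates.add(p.title);
    else byTitle.set(p.title, p);
  }

  // A dead link points at a title that no passage defines.
  const deadLinks: { from: string; to: string }[] = [];
  byTitle.forEach((p) => {
    for (const target of p.links) {
      if (!byTitle.has(target)) deadLinks.push({ from: p.title, to: target });
    }
  });

  // Breadth-first traversal from the start passage marks everything reachable.
  const visited = new Set<string>();
  const queue: string[] = [];
  if (byTitle.has(startTitle)) {
    visited.add(startTitle);
    queue.push(startTitle);
  }
  while (queue.length > 0) {
    const current = queue.shift();
    if (current === undefined) break;
    const passage = byTitle.get(current);
    if (passage === undefined) continue;
    for (const target of passage.links) {
      if (byTitle.has(target) && !visited.has(target)) {
        visited.add(target);
        queue.push(target);
      }
    }
  }

  // Assumption: orphans are passages the traversal never reached.
  const orphans = Array.from(byTitle.keys()).filter((t) => !visited.has(t));
  return { deadLinks, duplicates: Array.from(duplicates), orphans };
}
```

A report with three empty arrays would indicate a story that is safe to pass on to the Compile step.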
RAG Pipeline
The RAG pipeline provides evidence-based context for AI-generated stories:
- Ingest: Scans RAG/Data/ for PDFs and DOCX files; extracts text via pdf-parse / mammoth; detects category and source organization.
- Chunk: Splits documents into 500-800 token segments with 100-token overlap; respects section headers and paragraph boundaries.
- Embed: Generates 1536-dimensional vectors via OpenAI text-embedding-3-small in batches of 100.
- Store: Upserts RagDocument and RagChunk records into PostgreSQL with pgvector; builds IVFFlat index.
- Search: Hybrid retrieval combining cosine similarity, keyword matching, and Reciprocal Rank Fusion (RRF); quality scoring boosts high-value chunks.
- Format: Groups top-K chunks by document, assigns citation keys [S1], [S2], etc., and injects as EVIDENCE_CONTEXT into the LLM prompt.
- Cite: Extracts [SOURCE:Sn] markers from generated Twee, maps to RagChunk/RagDocument metadata, and stores RagCitation records.
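The fusion part of the Search step can be illustrated with the standard Reciprocal Rank Fusion formula, where each chunk scores the sum of 1 / (k + rank) over every ranking it appears in. The sketch below is an assumption about how the two result lists might be merged: the reciprocalRankFusion name is hypothetical, k = 60 is the commonly used constant rather than a value confirmed by the source, and the quality-score boost is left out.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// rankings: one ordered list of chunk ids per retriever, best match first.
function reciprocalRankFusion(rankings: string[][], k: number = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Ranks are 1-based, so the top result contributes 1 / (k + 1).
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical chunk ids from a vector search and a keyword search.
const vectorHits = ["c1", "c2", "c3"];
const keywordHits = ["c3", "c1", "c4"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
```

Chunks that appear in both lists (c1 and c3 here) rise above chunks found by only one retriever, which is the main reason to fuse ranks rather than raw similarity scores.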
Knowledge Graph
The Knowledge Graph organizes the RAG knowledge base into a structured curriculum taxonomy:
- KnowledgeConcept: Hierarchical tree of concepts (categories: risk-type, prevention-strategy, legal-framework, coping-skill) with bilingual names.
- Concept Tagger: Automatically tags RagChunk records with matching concepts via keyword matching and embedding similarity.
- RagConceptTag: Many-to-many links between chunks and concepts, enabling concept-based retrieval filtering and curriculum gap analysis.
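A rough sketch of how a tagger could combine the two signals named above (keyword matching and embedding similarity) follows. Everything here is illustrative: ConceptSketch, tagChunk, the rule that either signal is sufficient, and the 0.8 similarity threshold are assumptions, not details taken from the project.

```typescript
interface ConceptSketch {
  id: string;
  keywords: string[];
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat zero vectors as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns the ids of concepts that match the chunk by keyword OR by embedding.
// Assumption: minSimilarity = 0.8 is an illustrative threshold.
function tagChunk(
  text: string,
  embedding: number[],
  concepts: ConceptSketch[],
  minSimilarity: number = 0.8
): string[] {
  const lower = text.toLowerCase();
  return concepts
    .filter((c) => {
      const keywordHit = c.keywords.some((kw) => lower.includes(kw.toLowerCase()));
      const similar = cosineSimilarity(embedding, c.embedding) >= minSimilarity;
      return keywordHit || similar;
    })
    .map((c) => c.id);
}
```

Each returned id would correspond to one RagConceptTag row linking the chunk to a concept.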
LLM Integration
The src/lib/llm/client.ts module abstracts LLM calls behind a callLLM() function. The LLM_PROVIDER environment variable switches between anthropic (Claude Sonnet) and openai (GPT-4o). Key LLM-powered features:
- Story Generation — Generates Twee 3 interactive stories with RAG-sourced evidence, following a structured prompt with protagonist design, dangerous-choice architecture, quiz structure, and resource passages.
- Translation — Three modes: single-field, batch metadata, and full Twee source translation (preserving passage structure and link targets).
- Error Fixing — Auto-repairs dead links, duplicate passages, and orphans using RAG-retrieved reference patterns from approved stories.
- Image Prompts — Generates DALL-E 3 prompts with a unified style directive and fixed character roster; all characters depicted as 18+ with safety constraints.
Image generation always uses OpenAI DALL-E 3 regardless of the text LLM provider.
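The provider switch behind callLLM() might look roughly like the sketch below. Only the LLM_PROVIDER values (anthropic, openai) come from the text above; the LlmRequest shape, resolveProvider, createCallLLM, and the choice to throw on an unknown value are assumptions made for illustration. The backends are injected as plain functions so the sketch runs without any SDK or network access.

```typescript
type LlmProviderName = "anthropic" | "openai";

// Hypothetical request shape; the real module may accept different fields.
interface LlmRequest {
  system: string;
  prompt: string;
}

type LlmBackend = (request: LlmRequest) => Promise<string>;

// Validates the raw environment value.
// Assumption: unknown or missing values are rejected rather than defaulted.
function resolveProvider(raw: string | undefined): LlmProviderName {
  if (raw === "anthropic" || raw === "openai") return raw;
  throw new Error(`Unsupported LLM_PROVIDER: ${String(raw)}`);
}

// Builds a callLLM-style function bound to the configured provider.
function createCallLLM(
  backends: Record<LlmProviderName, LlmBackend>,
  providerEnv: string | undefined
): LlmBackend {
  const provider = resolveProvider(providerEnv);
  return (request: LlmRequest) => backends[provider](request);
}
```

In the application, providerEnv would be read from process.env.LLM_PROVIDER, and each backend would wrap the corresponding vendor SDK call. Callers then depend only on the returned function, not on a specific provider.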
Database Models
The Prisma schema defines the following core models:
| Model | Purpose |
|---|---|
| User | Registered users with role (STUDENT, TEACHER, ADMIN) |
| Account | OAuth account linking (NextAuth adapter) |
| Session | Database-backed sessions (NextAuth adapter) |
| Submission | Twine file uploads with review workflow |
| Course | Published courses with bilingual metadata |
| CourseVersion | Versioned course builds (HTML with tracking) |
| Progress | Per-user, per-course progress tracking |
| HeroEvidenceItem | Landing page evidence quotes |
| PilotRequest | Pilot program interest submissions |
| TeacherAccessLog | Audit trail for teacher actions |
| RagDocument | Ingested source documents (PDF, DOCX) with category and status |
| RagChunk | Document chunks with 1536-d embeddings for vector search |
| KnowledgeConcept | Hierarchical concept taxonomy with bilingual names |
| RagConceptTag | Many-to-many links between chunks and concepts |
| RagCitation | Course-to-chunk citation tracking for source attribution |
Internationalization Data Flow
All user-facing database fields use a JSON i18n pattern:
{ "en-US": "English text", "ko-KR": "Korean text" }
The Prisma Locale enum uses underscores (en_US, ko_KR), while the application uses hyphens (en-US, ko-KR). Helper functions dbLocaleToI18n() and i18nLocaleToDb() in src/lib/i18n/config.ts handle the conversion.
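As a sketch of what these helpers might do, the conversion reduces to swapping the separator. The function names dbLocaleToI18n and i18nLocaleToDb come from the text above, but the bodies, the DbLocale, I18nLocale, and I18nText types, and the pickText helper with its en-US fallback are assumptions for illustration.

```typescript
type DbLocale = "en_US" | "ko_KR";
type I18nLocale = "en-US" | "ko-KR";

// Prisma enum value -> application locale (underscore to hyphen).
function dbLocaleToI18n(locale: DbLocale): I18nLocale {
  return locale.replace("_", "-") as I18nLocale;
}

// Application locale -> Prisma enum value (hyphen to underscore).
function i18nLocaleToDb(locale: I18nLocale): DbLocale {
  return locale.replace("-", "_") as DbLocale;
}

// Shape of a JSON i18n field; a translation may be missing for a locale.
type I18nText = Partial<Record<I18nLocale, string>>;

// Hypothetical reader: returns the requested translation, then the fallback,
// then an empty string. The en-US fallback is an assumption.
function pickText(
  field: I18nText,
  locale: I18nLocale,
  fallback: I18nLocale = "en-US"
): string {
  return field[locale] ?? field[fallback] ?? "";
}
```

For example, a course title stored as { "en-US": "Hello" } would still render for a ko-KR user through the fallback instead of showing an empty label.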
File Storage
Twine HTML files (uploaded and built), Twee sources, and generated images are stored on the local filesystem. The storage layer in src/lib/storage/ provides an abstraction interface designed for future migration to S3 or MinIO. Files are served through /api/files/[...path].
| Path | Contents |
|---|---|
| uploads/ | Raw Twine HTML submissions |
| builds/{courseId}/v{version}/ | Compiled course HTML with tracking |
| sources/{courseId}/v{version}/ | Twee 3 source files |
| images/{courseId}/ | DALL-E 3 generated artwork |
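The path layout in the table, together with the idea of a swappable storage backend, can be sketched as follows. The path templates mirror the ones documented above; the FileStorage interface, the InMemoryStorage class, the helper names, and the use of a numeric version are hypothetical and only show how callers could stay independent of the local filesystem, S3, or MinIO.

```typescript
// Path builders that mirror the documented layout.
function buildHtmlPath(courseId: string, version: number): string {
  return `builds/${courseId}/v${version}/index.html`;
}

function tweeSourcePath(courseId: string, version: number): string {
  return `sources/${courseId}/v${version}/source.twee`;
}

// Hypothetical storage contract; the real interface in src/lib/storage/ may differ.
interface FileStorage {
  put(path: string, contents: string): Promise<void>;
  get(path: string): Promise<string | undefined>;
}

// In-memory stand-in used here so the sketch runs without touching disk.
class InMemoryStorage implements FileStorage {
  private files = new Map<string, string>();

  put(path: string, contents: string): Promise<void> {
    this.files.set(path, contents);
    return Promise.resolve();
  }

  get(path: string): Promise<string | undefined> {
    return Promise.resolve(this.files.get(path));
  }
}

// Stores both artifacts of one course version through the abstraction.
function storeBuild(
  storage: FileStorage,
  courseId: string,
  version: number,
  html: string,
  twee: string
): Promise<void> {
  return storage
    .put(buildHtmlPath(courseId, version), html)
    .then(() => storage.put(tweeSourcePath(courseId, version), twee));
}
```

Because storeBuild only sees the FileStorage interface, replacing the local filesystem with an object store would mean adding another implementation of the same two methods.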