Project

CEFR.ai

Language difficulty modelling

Builds interpretable CEFR-level predictors from lexical, syntactic, and semantic features.
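As a rough illustration of what "interpretable" can mean here, the sketch below computes a few surface features and shows each feature's contribution to a linear score. The feature names, the `extract_features` and `explain` helpers, and any weights passed in are hypothetical and are not taken from the project. Semantic features are left out for brevity.

```python
import re


def extract_features(text: str) -> dict[str, float]:
    """Compute a few simple surface features for one passage (illustrative only)."""
    # Naive sentence split on terminal punctuation; real pipelines use a tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not words or not sentences:
        return {"mean_word_len": 0.0, "type_token_ratio": 0.0, "mean_sent_len": 0.0}
    return {
        # Lexical: average word length in characters.
        "mean_word_len": sum(len(w) for w in words) / len(words),
        # Lexical: distinct words divided by total words.
        "type_token_ratio": len(set(words)) / len(words),
        # Syntactic proxy: average number of words per sentence.
        "mean_sent_len": len(words) / len(sentences),
    }


def explain(features: dict[str, float], weights: dict[str, float]) -> dict[str, float]:
    """Return each feature's additive contribution to a linear difficulty score."""
    return {name: weights[name] * value for name, value in features.items()}
```

With a linear model, the per-feature contributions returned by `explain` sum to the score, which is what makes the prediction easy to inspect.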

Status

Model tuning

Dataset size

96k labeled passages

Last update

2026-03-08

Objective

Ship adaptive language products with measurable difficulty calibration and feedback loops.

Revenue Metrics

Monthly revenue

$1,280.00

Direct spend

$18.00

Total spend

$44.08

Net monthly

$1,235.92
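The net figure matches monthly revenue minus total spend. A minimal check is sketched below, assuming that total spend already includes direct spend. The page does not state this, and the `other_spend` name is hypothetical.

```python
from decimal import Decimal

# Figures copied from the metrics above.
monthly_revenue = Decimal("1280.00")
direct_spend = Decimal("18.00")
total_spend = Decimal("44.08")  # assumption: already includes direct spend

# Net monthly = revenue minus total spend.
net_monthly = monthly_revenue - total_spend

# Hypothetical remainder of spend that is not direct (under the assumption above).
other_spend = total_spend - direct_spend

print(net_monthly)  # 1235.92
print(other_spend)  # 26.08
```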

Datasets

CEFR-Aligned Text Benchmark

Multi-domain passages with CEFR labels, readability metrics, and metadata.

96,340 records | Daily refresh | Tuning labels | 2026-03-08

Recent Experiments

2026-03-08

Prompt-augmented CEFR classification on noisy learner text

Reduced borderline A2/B1 misclassification by 12%.
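The entry does not say how the borderline misclassification rate is defined or whether the 12% is relative or absolute. One possible reading is sketched below; the function names and the metric definition are assumptions, not the experiment's actual code.

```python
def borderline_confusion_rate(gold, pred, pair=("A2", "B1")):
    """Share of items with a gold label in `pair` that were predicted as the other label in `pair`."""
    in_pair = [(g, p) for g, p in zip(gold, pred) if g in pair]
    if not in_pair:
        return 0.0
    confused = sum(1 for g, p in in_pair if p in pair and p != g)
    return confused / len(in_pair)


def relative_reduction(before, after):
    """Relative drop in an error rate, e.g. 0.12 for a 12% relative reduction."""
    if before == 0:
        raise ValueError("baseline rate must be non-zero")
    return (before - after) / before
```

Under this reading, the reported change would be `relative_reduction(rate_before, rate_after)` computed on the same evaluation set for both runs.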


2026-03-03

Lexical rarity feature ablation for CEFR bands

Lexical rarity had a high impact on B2/C1 separation, so it was kept in the feature stack.
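A leave-one-out ablation of this kind can be sketched as follows. The `ablate` helper and the `evaluate` callback are hypothetical; `evaluate` stands in for whatever train-and-score routine the project uses and is assumed to return a single metric where higher is better.

```python
from typing import Callable, Sequence


def ablate(
    features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> dict[str, float]:
    """Drop each feature in turn and report how much the metric falls without it."""
    baseline = evaluate(features)
    deltas = {}
    for name in features:
        kept = [f for f in features if f != name]
        # Positive delta: the metric got worse without this feature.
        deltas[name] = baseline - evaluate(kept)
    return deltas
```

A feature with a large positive delta on the metric of interest (for example, B2/C1 separation) is a candidate to keep. Deltas near zero suggest the feature is redundant given the others.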