EduBench

External benchmark for comprehensive educational assessment

v1.0.0

Overview

EduBench is an external evaluation methodology that provides a broad assessment across 6 educational task types.

⚠️ Note: EduBench is an external methodology and is not representative of InceptBench's evaluation approach. It provides complementary metrics but follows different evaluation principles.

Key Features

  • Task Coverage: 6 types (QA | EC | IP | AG | QG | TMG)
  • Speed: ~8s per question
  • Methodology: External (not Incept-aligned)
  • Scoring: 0-10 (broad assessment)

Scoring Scale:

  • ●●●●●●●●●● (9-10): Excellent
  • ●●●●●●●●○○ (7-8): Good
  • ●●●●●○○○○○ (5-6): Fair
  • ●●○○○○○○○○ (3-4): Poor
  • ○○○○○○○○○○ (0-2): Failing
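
The scale lists integer ranges, while scores such as 8.2 are fractional. Here is a minimal sketch of one way to map a score to a band, assuming each band begins at its lower bound (so 8.2 counts as Good). The function name `score_band` and that boundary rule are assumptions for illustration, not part of EduBench:

```python
def score_band(score: float) -> str:
    """Map a 0-10 EduBench score to its band label.

    Assumption: a fractional score belongs to the band whose lower
    bound it has reached (e.g. 8.2 -> Good, 9.0 -> Excellent).
    """
    if not 0 <= score <= 10:
        raise ValueError(f"score must be within 0-10, got {score}")
    # Lower bounds taken from the scoring scale, highest band first.
    bands = [(9, "Excellent"), (7, "Good"), (5, "Fair"), (3, "Poor"), (0, "Failing")]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: 0 is the lowest bound")
```

If the service rounds or truncates before banding, the thresholds would need adjusting to match.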

Output

Simplified mode:

{"external_edubench": {"overall_score": 8.2}}

Full mode:

{
  "external_edubench": {
    "overall_score": 8.2,
    "task_scores": {
      "qa": 9.0,
      "ec": 8.5,
      "ip": 7.8,
      "ag": 8.0,
      "qg": 8.5,
      "tmg": 7.5
    }
  }
}
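
Here is a short sketch of reading either output shape, assuming the response body is exactly the JSON shown above. The helper name `summarize_edubench` is hypothetical, and `task_scores` is treated as optional because simplified mode omits it:

```python
import json


def summarize_edubench(response_text: str) -> dict:
    """Return the overall score and, in full mode, the weakest task."""
    result = json.loads(response_text)["external_edubench"]
    summary = {"overall_score": result["overall_score"], "weakest_task": None}
    task_scores = result.get("task_scores")  # absent in simplified mode
    if task_scores:
        # Lowest-scoring task type, e.g. to decide what to review first.
        summary["weakest_task"] = min(task_scores, key=task_scores.get)
    return summary


full = (
    '{"external_edubench": {"overall_score": 8.2, "task_scores": '
    '{"qa": 9.0, "ec": 8.5, "ip": 7.8, "ag": 8.0, "qg": 8.5, "tmg": 7.5}}}'
)
print(summarize_edubench(full))  # {'overall_score': 8.2, 'weakest_task': 'tmg'}
```

In this sample the mean of the six task scores is about 8.22, which is consistent with the reported 8.2, though the source does not state how `overall_score` is aggregated.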

Best Use Cases

  • External Comparison: Compare content against external educational standards
  • Research & Development: Academic assessment of content generation approaches
  • Multi-dimensional view: Understand quality across 6 different task types
  • Complementary metrics: Use alongside InceptBench evaluators for additional perspective

Note: For production QA, prioritize pedagogy-grounded InceptBench evaluators.

Usage

curl -X POST "https://api.inceptapi.com/evaluate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer INCEPT_API_KEY" \
  -d '{
    "evaluators_to_run": ["external_edubench"],
    "generated_questions": [...]
  }'
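
For callers using Python rather than curl, the same request can be sketched with the standard library. The helper name `build_edubench_request` is hypothetical. The contents of `generated_questions` are elided in the example above, so they are passed through unchanged and are assumed to follow the Generated Question Schema:

```python
import json
import urllib.request

API_URL = "https://api.inceptapi.com/evaluate"


def build_edubench_request(api_key: str, generated_questions: list) -> urllib.request.Request:
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "evaluators_to_run": ["external_edubench"],
        "generated_questions": generated_questions,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(request, timeout=...)`. Given the stated ~8s per question, a timeout that grows with the number of questions is probably a sensible starting point.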

Performance: ~8s per question | Languages: Multiple

Input Schema: Generated Question Schema → | Request Schema →

Recommended: For production QA, use pedagogy-grounded evaluators: TI Question QA | Math Content | Answer Verification | Reading QC