EduBench
External benchmark for comprehensive educational assessment
v1.0.0
Overview
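EduBench is an external evaluation methodology that assesses content across six educational task types: question answering (QA), error correction (EC), idea provision (IP), automatic grading (AG), question generation (QG), and teaching material generation (TMG).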
⚠️ Note: EduBench is an external methodology and is not representative of InceptBench's evaluation approach. It provides complementary metrics but follows different evaluation principles.
Key Features
- Task Coverage: QA | EC | IP | AG | QG | TMG
- Speed: ~8s per question
- Methodology: Not Incept-aligned
- Scoring: Broad assessment
Scoring Scale:
- ●●●●●●●●●● (9-10): Excellent
- ●●●●●●●●○○ (7-8): Good
- ●●●●●●○○○○ (5-6): Fair
- ●●●●○○○○○○ (3-4): Poor
- ●●○○○○○○○○ (0-2): Failing
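The scale above lists whole-number ranges, while the sample scores are fractional (for example 8.2). A minimal sketch of mapping a score to its band, assuming each range's lower bound acts as a cutoff; the name `score_band` and the handling of values such as 8.5 are illustrative assumptions, not documented EduBench behavior:

```python
# Lower bound of each band, taken from the scoring scale.
# Treating the lower bound as a cutoff for fractional scores is an assumption.
BANDS = [
    (9.0, "Excellent"),
    (7.0, "Good"),
    (5.0, "Fair"),
    (3.0, "Poor"),
    (0.0, "Failing"),
]


def score_band(score: float) -> str:
    """Return the band label for a score on the 0-10 scale."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score must be between 0 and 10, got {score}")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    raise AssertionError("unreachable: 0.0 is the lowest cutoff")


print(score_band(8.2))  # Good
```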
Output
Simplified:
{"external_edubench": {"overall_score": 8.2}}
Full mode:
{
  "external_edubench": {
    "overall_score": 8.2,
    "task_scores": {
      "qa": 9.0,
      "ec": 8.5,
      "ip": 7.8,
      "ag": 8.0,
      "qg": 8.5,
      "tmg": 7.5
    }
  }
}
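One way to use the full-mode output is to find the task type with the lowest score. The sketch below works on the sample payload shown above; the helper name `weakest_task` is hypothetical, and it assumes that a simplified-mode response simply omits `task_scores`:

```python
import json
from typing import Optional, Tuple

# Sample full-mode payload copied from the output shown above.
FULL_MODE_SAMPLE = """
{
  "external_edubench": {
    "overall_score": 8.2,
    "task_scores": {
      "qa": 9.0, "ec": 8.5, "ip": 7.8,
      "ag": 8.0, "qg": 8.5, "tmg": 7.5
    }
  }
}
"""


def weakest_task(response: dict) -> Optional[Tuple[str, float]]:
    """Return (task, score) for the lowest-scoring task, or None if absent."""
    task_scores = response["external_edubench"].get("task_scores")
    if not task_scores:
        # Assumed shape of simplified mode: only overall_score is present.
        return None
    task = min(task_scores, key=task_scores.get)
    return task, task_scores[task]


print(weakest_task(json.loads(FULL_MODE_SAMPLE)))  # ('tmg', 7.5)
```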
Best Use Cases
- External Comparison: Compare content against external educational standards
- Research & Development: Academic assessment of content generation approaches
- Multi-dimensional View: Understand quality across six different task types
- Complementary Metrics: Use alongside InceptBench evaluators for additional perspective
Note: For production QA, prioritize pedagogy-grounded InceptBench evaluators.
Usage
curl -X POST "https://api.inceptapi.com/evaluate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer INCEPT_API_KEY" \
  -d '{
    "evaluators_to_run": ["external_edubench"],
    "generated_questions": [...]
  }'
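The same request can be sketched in Python using only the standard library. The endpoint, headers, and `evaluators_to_run` value come from the curl call above; the function names `build_request` and `send`, the `INCEPT_API_KEY` environment variable, and the timeout value are assumptions for illustration. The contents of `generated_questions` are left to the caller because the question schema is not shown here.

```python
import json
import os
import urllib.request

API_URL = "https://api.inceptapi.com/evaluate"


def build_request(generated_questions: list, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the EduBench evaluator."""
    payload = {
        "evaluators_to_run": ["external_edubench"],
        # Items must follow the Generated Question Schema (not shown here).
        "generated_questions": generated_questions,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout_s: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    # The timeout is an assumed value; at ~8s per question, larger batches
    # may need more.
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical usage; reading the key from the environment is an assumption:
# result = send(build_request(my_questions, os.environ["INCEPT_API_KEY"]))
```

Keeping request construction separate from sending makes it possible to inspect the payload and headers without calling the API.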
Performance: ~8s per question | Languages: Multiple
Input Schema: Generated Question Schema | Request Schema
Recommended: For production QA, use pedagogy-grounded evaluators: TI Question QA | Math Content | Answer Verification | Reading QC