InceptBench Evaluators

InceptBench

InceptBench is an educational content evaluation framework grounded in Incept pedagogy. It provides a single unified evaluator that automatically selects evaluation methods based on the characteristics of your content.

How It Works

Provide your educational content along with optional routing parameters (subject, grade, type), and InceptBench determines the evaluation approach. You do not choose which internal evaluation methods run; selection is automatic.

Content Types Supported

Questions: MCQ (Multiple Choice) and Fill-in questions
Text Content: Educational passages, explanations, and text materials
Articles: Complete educational documents with markdown formatting, mixed media, and embedded questions (NEW in v1.4.0)
Visual Content: Images accompanying questions or standalone educational images

All content types are evaluated for pedagogical value, accuracy, grade alignment, and Direct Instruction compliance. Images are automatically detected and evaluated when image_url is provided.

Quick Start

Evaluate educational content using the InceptBench API endpoint:

# Basic evaluation - automatic routing
curl -X POST "https://api.inceptapi.com/evaluate" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer INCEPT_API_KEY" \
  -d @qs.json

# With subject and grade for better routing
curl -X POST "https://api.inceptapi.com/evaluate?subject=math&grade=6-8" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer INCEPT_API_KEY" \
  -d @qs.json

# Full detailed results
curl -X POST "https://api.inceptapi.com/evaluate?subject=ela&verbose=true" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer INCEPT_API_KEY" \
  -d @qs.json

Replace INCEPT_API_KEY with your actual API key and qs.json with your input file path.
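
For callers who prefer a script to curl, the same request can be assembled with the Python standard library. This is a minimal sketch: `build_request` and `evaluate` are hypothetical helper names, and the header and query-string layout simply mirror the curl commands above; they do not come from a documented client library.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://api.inceptapi.com/evaluate"  # endpoint from the curl examples


def build_request(payload, api_key, **routing):
    """Build (but do not send) a POST request mirroring the curl examples.

    `routing` holds optional query parameters such as subject, grade,
    type, or verbose; parameters set to None are left out.
    """
    params = {}
    for key, value in routing.items():
        if value is None:
            continue
        # Booleans are written as lowercase "true"/"false", as in the curl URL.
        params[key] = str(value).lower() if isinstance(value, bool) else str(value)
    url = API_URL
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def evaluate(payload, api_key, **routing):
    """Send the request and return the decoded JSON response."""
    request = build_request(payload, api_key, **routing)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

A call such as `evaluate(payload, "INCEPT_API_KEY", subject="math", grade="6-8")` would then correspond to the second curl command, with `payload` being the parsed contents of qs.json.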

Routing Parameters

Provide these optional parameters to help InceptBench select the most appropriate evaluation methods:

subject: Subject area (math, ela, science, social-studies, general)
grade: Grade level (e.g., "K", "3", "6-8", "9-12")
type: Content type (mcq, fill-in, short-answer, essay, text-content, passage)
verbose: Set to true for full detailed evaluation (default: simplified scores only)

Evaluation Methods

InceptBench uses multiple specialized evaluation methods under the hood, automatically selected based on your content type and routing parameters. You don’t need to specify which methods to use—the system handles this automatically.

TI Question QA

Internal quality assessment across 10 dimensions

Correctness & grade alignment
DI compliance & pedagogical value
Format & instruction adherence
Language quality & clarity
Detailed recommendations

Math Content Evaluator

Comprehensive content quality across 9 criteria

Curriculum alignment validation
Cognitive demand assessment
Accuracy & rigor checks
Pedagogical design evaluation
Clarity & accessibility scoring

Answer Verification

Fast, independent answer correctness validation

Lightning-fast verification
Independent AI validation
Confidence scoring (0-10)
Reasoning explanations
Works across all subjects

Reading Question QC

Specialized MCQ quality and distractor analysis

Distractor quality analysis
Grammatical consistency checks
Answer plausibility scoring
Question clarity assessment
Standards alignment verification

Text Content Evaluator

Pedagogical assessment for educational text

Correctness & factual accuracy
Grade alignment & clarity
DI compliance evaluation
Pedagogical value scoring
Accept/revise/reject recommendations

How Evaluation Methods Are Selected

The evaluator automatically routes your content to the appropriate evaluation methods:

Math Questions → Quality assessment + Answer verification + Math content evaluation
ELA/Reading Questions → Quality assessment + Answer verification + Reading QC + Distractor analysis
Text Content → Text content evaluation + Subject-specific assessment
Articles → Holistic article evaluation + Component evaluation (text, questions, images) (NEW in v1.4.0)
Content with Images → DI image quality evaluation (automatically enabled)
General Content → Core quality assessment + Content-appropriate specialized evaluation

These methods run in parallel for fast evaluation, and their scores are combined into a final quality score (0-1 scale).
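
The combination rule is not documented here. In the sample response under Output Format, though, the final_score values for q1 and q2 coincide with the unweighted mean of their numeric evaluator scores, as the check below shows. Treat this as an observation about those two samples and not as a specification: a simple mean over the listed scores does not appear to reproduce the article1 value, and answer_verification contributes a boolean, not a score. The helper name `mean_matches` is hypothetical.

```python
from statistics import mean

# Numeric evaluator scores copied from the sample response for q1 and q2.
samples = {
    "q1": {"scores": [0.911, 0.8, 1.0], "final_score": 0.904},
    "q2": {"scores": [0.933, 0.778, 0.778], "final_score": 0.830},
}


def mean_matches(scores, final_score, tolerance=0.001):
    """True if the unweighted mean agrees with final_score within tolerance."""
    return abs(mean(scores) - final_score) <= tolerance


for item_id, sample in samples.items():
    print(item_id, round(mean(sample["scores"]), 3), mean_matches(**sample))
```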

New in v1.4.0: Article holistic evaluation assessing complete educational documents as unified pedagogical experiences.

New in v1.3.0: Automatic image detection and evaluation using Direct Instruction rubric-based scoring.

Technical Reference

For developers who need technical details about the evaluation methods:

Quality Assessment: 10-dimension pedagogical quality scoring (correctness, grade alignment, DI compliance, etc.)
Answer Verification: AI-powered correctness validation with confidence scoring
Reading QC: MCQ distractor quality analysis and passage alignment checks
Math Content Evaluation: Curriculum alignment, cognitive demand, and rigor assessment across 9 criteria
Text Content Evaluation: Pedagogical assessment of explanatory text across 8 dimensions
Article Holistic Evaluation (NEW v1.4.0): Unified pedagogical experience assessment across 10 dimensions (pedagogical coherence, content organization, scaffolding quality, engagement, mixed-media integration, learning objectives clarity, grade appropriateness, completeness, cognitive load management, instructional clarity)
Image Quality DI Evaluation (NEW v1.3.0): DI rubric-based pedagogical image quality assessment (0-100 scale, auto-enabled for images)
Math Image Judge (NEW v1.3.0): Vision-based image quality checking using Claude (PASS/FAIL)

Image Evaluation Features:

Automatic detection when image_url is present
Context-aware evaluation (accompaniment vs standalone modes)
DI rubric scoring with weighted criteria (visual clarity, pedagogical value, age-appropriateness, canonical representation)
Hard-fail gates for inappropriate content or answer leakage
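
A weighted rubric with hard-fail gates can be sketched as follows. The four criteria names come from the list above, but the equal weights, the 0-100 inputs, and the function name `score_image` are placeholders chosen for illustration; the actual rubric weights are not published in this document.

```python
# Placeholder weights: the real DI rubric weights are not given in this document.
WEIGHTS = {
    "visual_clarity": 0.25,
    "pedagogical_value": 0.25,
    "age_appropriateness": 0.25,
    "canonical_representation": 0.25,
}


def score_image(criteria, inappropriate=False, leaks_answer=False):
    """Combine per-criterion scores (0-100) into one 0-100 score.

    A hard-fail gate short-circuits the weighted sum: if either gate
    trips, the score is 0 no matter how well the image scores otherwise.
    """
    if inappropriate or leaks_answer:
        return 0.0
    return sum(WEIGHTS[name] * criteria[name] for name in WEIGHTS)
```

The point of the gate is that a high weighted score cannot compensate for an image that reveals the answer or is inappropriate for learners.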

Note: The specific evaluation methods used are implementation details and may be enhanced over time. Always use the routing parameters (subject, grade, type) rather than trying to manually configure evaluation methods.

Input Format

InceptBench supports three types of educational content:

generated_questions: MCQ and fill-in questions (traditional)
generated_content: Educational text, passages, explanations
generated_articles: Complete educational documents with markdown formatting (NEW in v1.4.0)

You can provide one or more types in the same request. The full Request Schema and Response Schema are in the glossary.

{
  "subject": "math",
  "grade": "6",
  "type": "mcq",
  "generated_questions": [
    {
      "id": "q1",
      "type": "mcq",
      "question": "إذا كان ثمن 2 قلم هو 14 ريالًا، فما ثمن 5 أقلام بنفس المعدل؟",
      "answer": "35 ريالًا",
      "answer_explanation": "الخطوة 1: تحليل المسألة — لدينا ثمن 2 قلم وهو 14 ريالًا. نحتاج إلى معرفة ثمن 5 أقلام بنفس المعدل. يجب التفكير في العلاقة بين عدد الأقلام والسعر وكيفية تحويل عدد الأقلام بمعدل ثابت.\nالخطوة 2: تطوير الاستراتيجية — يمكننا أولًا إيجاد ثمن قلم واحد بقسمة 14 ÷ 2 = 7 ريال، ثم ضربه في 5 لإيجاد ثمن 5 أقلام: 7 × 5 = 35 ريالًا.\nالخطوة 3: التطبيق والتحقق — نتحقق من منطقية الإجابة بمقارنة السعر بعدد الأقلام. السعر يتناسب طرديًا مع العدد، وبالتالي 35 ريالًا هي الإجابة الصحيحة والمنطقية.",
      "answer_options": {
        "A": "28 ريالًا",
        "B": "70 ريالًا",
        "C": "30 ريالًا",
        "D": "35 ريالًا"
      },
      "skill": {
        "title": "Grade 6 Mid-Year Comprehensive Assessment",
        "grade": "6",
        "subject": "mathematics",
        "difficulty": "medium",
        "description": "Apply proportional reasoning, rational number operations, algebraic thinking, geometric measurement, and statistical analysis to solve multi-step real-world problems",
        "language": "ar"
      },
      "image_url": null,
      "additional_details": "🔹 **Question generation logic:**\nThis question targets proportional reasoning for Grade 6 students, testing their ability to apply ratios and unit rates to real-world problems. It follows a classic proportionality structure — starting with a known ratio (2 items for 14 riyals) and scaling it up to 5 items. The stepwise reasoning develops algebraic thinking and promotes estimation checks to confirm logical correctness.\n\n🔹 **Personalized insight examples:**\n- Choosing 28 ريالًا shows a misunderstanding by doubling instead of proportionally scaling.\n- Choosing 7 ريالًا indicates the learner found the unit rate but didn't scale it up to 5.\n- Choosing 14 ريالًا confuses the given 2-item cost with the required 5-item cost.\n\n🔹 **Instructional design & DI integration:**\nThe question aligns with *Percent, Ratio, and Probability* learning targets. In DI format 15.7, it models how equivalent fractions and proportional relationships can predict outcomes across different scales. This builds foundational understanding for probability and proportional reasoning. By using a simple, relatable context (price of pens), it connects mathematical ratios to practical real-world applications, supporting concept transfer and cognitive engagement."
    }
  ],
  "verbose": false
}

{
  "subject": "math",
  "grade": "6",
  "type": "text-content",
  "generated_content": [
    {
      "id": "text1",
      "type": "text",
      "title": "Understanding Proportional Reasoning",
      "content": "Proportional reasoning is a fundamental mathematical skill that involves understanding relationships between quantities. When two quantities maintain a constant ratio, they are said to be proportional. For example, if 2 pens cost 14 riyals, we can find the cost of any number of pens by maintaining this same ratio.\n\nThe key to solving proportional problems is identifying the unit rate - the cost or quantity per one item. In our pen example, dividing 14 by 2 gives us 7 riyals per pen. Once we know the unit rate, we can multiply it by any quantity to find the total cost.\n\nThis concept appears throughout mathematics and real life: in cooking recipes, map scales, speed calculations, and currency conversions. Understanding proportional reasoning helps students develop algebraic thinking and prepares them for more advanced mathematical concepts.",
      "skill": {
        "title": "Proportional Reasoning Concepts",
        "grade": "6",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      },
      "image_url": null,
      "additional_details": "This explanatory text provides conceptual foundation for proportional reasoning problems, building understanding before procedural practice."
    }
  ],
  "verbose": false
}

{
  "subject": "biology",
  "grade": "7",
  "type": "article",
  "generated_articles": [
    {
      "type": "article",
      "content": "# Introduction to Photosynthesis\n\nPhotosynthesis is the process by which plants convert light energy into chemical energy. This fundamental biological process sustains nearly all life on Earth.\n\n## What Plants Need\n\nPlants require three key ingredients for photosynthesis:\n- **Sunlight** - Energy source\n- **Water** - Hydrogen source\n- **Carbon dioxide** - Carbon source\n\n![diagram-of-plant-absorbing-light.png](https://example.com/plant-diagram.png)\n\n### Question 1\nWhat do plants need for photosynthesis?\nA) Water and sunlight only\nB) Water, sunlight, and carbon dioxide\nC) Only sunlight\nD) Only water\n\n**Correct Answer:** B\n**Explanation:** Plants need all three components - water, sunlight, and carbon dioxide - to perform photosynthesis successfully.\n\n## The Chemical Process\n\nDuring photosynthesis, plants use chlorophyll to capture sunlight. This energy drives a chemical reaction that combines water (H₂O) and carbon dioxide (CO₂) to produce glucose (C₆H₁₂O₆) and oxygen (O₂).\n\nThe simplified equation is:\n6CO₂ + 6H₂O + light energy → C₆H₁₂O₆ + 6O₂\n\n### Question 2\nWhat gas do plants release during photosynthesis?\nA) Carbon dioxide\nB) Nitrogen\nC) Oxygen\nD) Hydrogen\n\n**Correct Answer:** C\n**Explanation:** Plants release oxygen as a byproduct of photosynthesis, which is essential for animal life.",
      "skill": {
        "title": "Introduction to Photosynthesis",
        "subject": "Biology",
        "grade": "7",
        "difficulty": "medium",
        "language": "en"
      },
      "title": "Introduction to Photosynthesis"
    }
  ],
  "verbose": false
}

{
  "subject": "math",
  "grade": "6-7",
  "generated_questions": [
    {
      "id": "q1",
      "type": "mcq",
      "question": "If the price of 2 pens is 14 riyals, what is the price of 5 pens at the same rate?",
      "answer": "35 riyals",
      "answer_explanation": "Step 1: Analyze - We know 2 pens cost 14 riyals...",
      "answer_options": {
        "A": "28 riyals",
        "B": "70 riyals",
        "C": "30 riyals",
        "D": "35 riyals"
      },
      "skill": {
        "title": "Grade 6 Proportional Reasoning",
        "grade": "6",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      }
    },
    {
      "id": "q2",
      "type": "fill-in",
      "question": "What is the value of x in the equation 3x + 7 = 22?",
      "answer": "5",
      "answer_explanation": "Step 1: Start with the equation 3x + 7 = 22.\nStep 2: Subtract 7 from both sides: 3x = 15.\nStep 3: Divide both sides by 3: x = 5.\nStep 4: Verify by substituting back: 3(5) + 7 = 15 + 7 = 22 ✓",
      "skill": {
        "title": "Solving Linear Equations",
        "grade": "7",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      },
      "image_url": null,
      "additional_details": "This question assesses understanding of algebraic manipulation and inverse operations. Students must demonstrate the ability to isolate variables through systematic steps."
    }
  ],
  "generated_content": [
    {
      "id": "text1",
      "type": "text",
      "title": "Understanding Proportional Reasoning",
      "content": "Proportional reasoning is a fundamental mathematical skill that involves understanding relationships between quantities. When two quantities maintain a constant ratio, they are said to be proportional. For example, if 2 pens cost 14 riyals, we can find the cost of any number of pens by maintaining this same ratio.\n\nThe key to solving proportional problems is identifying the unit rate - the cost or quantity per one item. In our pen example, dividing 14 by 2 gives us 7 riyals per pen. Once we know the unit rate, we can multiply it by any quantity to find the total cost.\n\nThis concept appears throughout mathematics and real life: in cooking recipes, map scales, speed calculations, and currency conversions. Understanding proportional reasoning helps students develop algebraic thinking and prepares them for more advanced mathematical concepts.",
      "skill": {
        "title": "Proportional Reasoning Concepts",
        "grade": "6",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      },
      "image_url": null,
      "additional_details": "This explanatory text provides conceptual foundation for proportional reasoning problems, building understanding before procedural practice."
    },
    {
      "id": "passage1",
      "type": "passage",
      "title": "The History of Algebra",
      "content": "The word 'algebra' comes from the Arabic word 'al-jabr,' which means 'reunion of broken parts.' This mathematical discipline was formalized by the Persian mathematician Muhammad ibn Musa al-Khwarizmi in the 9th century. His book, 'The Compendious Book on Calculation by Completion and Balancing,' introduced systematic methods for solving linear and quadratic equations.\n\nAl-Khwarizmi's work built upon earlier contributions from Babylonian, Greek, and Indian mathematicians. However, his systematic approach and notation made algebra accessible to a wider audience. The methods he developed for solving equations - moving terms from one side to another and combining like terms - are still taught in classrooms today.\n\nAlgebra revolutionized mathematics by introducing the use of symbols to represent unknown quantities. This abstraction allowed mathematicians to solve general problems rather than specific numerical cases. Today, algebra serves as the foundation for advanced mathematics, science, engineering, and technology.",
      "skill": {
        "title": "History of Mathematics",
        "grade": "7",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      },
      "image_url": null,
      "additional_details": "Historical context passage that connects mathematical concepts to cultural and historical development, supporting interdisciplinary learning."
    },
    {
      "id": "text2",
      "type": "explanation",
      "title": "Why We Isolate Variables",
      "content": "When solving equations like 3x + 7 = 22, our goal is to isolate the variable (x) on one side of the equation. But why is this important?\n\nIsolating the variable reveals its value - the number that makes the equation true. Think of an equation as a balanced scale. Whatever we do to one side, we must do to the other to maintain balance. This principle, called the 'golden rule of algebra,' ensures our solution is valid.\n\nThe process follows a logical sequence: First, we undo addition or subtraction (in this case, subtract 7 from both sides). Then, we undo multiplication or division (divide both sides by 3). This order follows the reverse of the order of operations, systematically 'unwrapping' the variable.\n\nUnderstanding this process builds problem-solving skills that extend beyond mathematics. It teaches logical thinking, systematic approaches to complex problems, and the importance of maintaining equivalence - principles applicable to many real-world situations.",
      "skill": {
        "title": "Solving Equations Conceptually",
        "grade": "7",
        "subject": "mathematics",
        "difficulty": "medium",
        "language": "en"
      },
      "image_url": null,
      "additional_details": "Conceptual explanation that builds deep understanding of why algebraic procedures work, not just how to perform them."
    }
  ],
  "verbose": false
}
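
Request bodies like the ones above can also be assembled in code. The helper below is a hedged sketch: `make_mcq` and `make_request_body` are hypothetical names, the field names are copied from the examples, and the check that the answer appears among the options is a client-side convenience that the API is not stated to require.

```python
import json


def make_mcq(item_id, question, answer, options, skill, explanation=""):
    """Return one generated_questions entry shaped like the MCQ examples."""
    if answer not in options.values():
        raise ValueError(f"answer {answer!r} is not one of the answer options")
    return {
        "id": item_id,
        "type": "mcq",
        "question": question,
        "answer": answer,
        "answer_explanation": explanation,
        "answer_options": options,
        "skill": skill,
        "image_url": None,
    }


def make_request_body(questions=(), content=(), articles=(), verbose=False, **routing):
    """Assemble a request body, including only the content lists that are non-empty."""
    body = dict(routing)  # e.g. subject="math", grade="6"
    for key, items in (
        ("generated_questions", questions),
        ("generated_content", content),
        ("generated_articles", articles),
    ):
        if items:
            body[key] = list(items)
    body["verbose"] = verbose
    return body


question = make_mcq(
    "q1",
    "If the price of 2 pens is 14 riyals, what is the price of 5 pens at the same rate?",
    "35 riyals",
    {"A": "28 riyals", "B": "70 riyals", "C": "30 riyals", "D": "35 riyals"},
    {"title": "Grade 6 Proportional Reasoning", "grade": "6", "subject": "mathematics",
     "difficulty": "medium", "language": "en"},
)
print(json.dumps(make_request_body(questions=[question], subject="math", grade="6"), indent=2))
```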

Output Format

Simplified (Default, `verbose: false`)

{
  "request_id": "06c031fd-6517-4874-8117-2dbeb5554291",
  "evaluations": {
    "q1": {
      "ti_question_qa": {
        "overall": 0.911
      },
      "answer_verification": {
        "is_correct": true
      },
      "reading_question_qc": {
        "overall_score": 0.8
      },
      "math_content_evaluator": {
        "overall_score": 1.0
      },
      "final_score": 0.904
    },
    "q2": {
      "ti_question_qa": {
        "overall": 0.933
      },
      "answer_verification": {
        "is_correct": true
      },
      "reading_question_qc": {
        "overall_score": 0.778
      },
      "math_content_evaluator": {
        "overall_score": 0.778
      },
      "final_score": 0.830
    },
    "text1": {
      "math_content_evaluator": {
        "overall_score": 0.778
      },
      "text_content_evaluator": {
        "overall": 0.957
      },
      "final_score": 0.867
    },
    "text2": {
      "math_content_evaluator": {
        "overall_score": 0.778
      },
      "text_content_evaluator": {
        "overall": 0.957
      },
      "final_score": 0.867
    },
    "article1": {
      "article_holistic_evaluator": {
        "overall": 0.835,
        "recommendation": "accept"
      },
      "text_content_evaluator": {
        "overall": 0.89
      },
      "embedded_questions": {
        "q1": {
          "ti_question_qa": {"overall": 0.87}
        },
        "q2": {
          "ti_question_qa": {"overall": 0.90}
        }
      },
      "images": {
        "img1": {
          "image_quality_di_evaluator": {"overall": 0.95}
        }
      },
      "final_score": 0.880
    }
  },
  "evaluation_time_seconds": 38.76
}

Note: Which evaluators appear for an item depends on its content type and the routing parameters. There is nothing to configure manually.
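
One way to consume the simplified response is to pull out each item's final_score and flag the items that fall below a cutoff. In this sketch the function name `low_scoring_items` and the 0.85 threshold are arbitrary illustrations, not values recommended by InceptBench.

```python
def low_scoring_items(response, threshold=0.85):
    """Return {item_id: final_score} for items scoring below threshold."""
    flagged = {}
    for item_id, result in response.get("evaluations", {}).items():
        score = result.get("final_score")
        # Items without a final_score are skipped rather than flagged.
        if score is not None and score < threshold:
            flagged[item_id] = score
    return flagged


# Trimmed copy of the sample response above.
sample = {
    "evaluations": {
        "q1": {"final_score": 0.904},
        "q2": {"final_score": 0.830},
        "text1": {"final_score": 0.867},
    }
}
print(low_scoring_items(sample))  # {'q2': 0.83}
```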

Full Mode (`verbose: true`)

Returns detailed scores, issues, strengths, and recommendations for each evaluator.

{
  "request_id": "06c031fd-6517-4874-8117-2dbeb5554291",
  "evaluations": {
    "q1": {
      "ti_question_qa": {
        "overall": 0.911,
        "scores": {
          "correctness": 1.0,
          "grade_alignment": 0.9,
          "difficulty_alignment": 0.9,
          "language_quality": 0.85,
          "pedagogical_value": 0.95,
          "explanation_quality": 0.9,
          "instruction_adherence": 0.9,
          "format_compliance": 1.0,
          "query_relevance": 1.0,
          "di_compliance": 0.9
        },
        "issues": [],
        "strengths": ["Clear scaffolded explanation", "Excellent proportional reasoning"],
        "recommendation": "accept",
        "suggested_improvements": [],
        "di_scores": {...},
        "section_evaluations": {...}
      },
      "answer_verification": {
        "is_correct": true,
        "correct_answer": "35 riyals",
        "confidence": 10,
        "reasoning": "The answer correctly applies proportional reasoning..."
      },
      "reading_question_qc": {
        "overall_score": 0.8,
        "distractor_checks": {...},
        "question_checks": {...},
        "passed": true
      },
      "math_content_evaluator": {
        "overall_score": 1.0,
        "overall_rating": "SUPERIOR",
        "curriculum_alignment": "PASS",
        "cognitive_demand": "PASS",
        "accuracy_and_rigor": "PASS",
        "reveals_misconceptions": "PASS",
        "question_type_appropriateness": "PASS",
        "engagement_and_relevance": "PASS",
        "instructional_support": "PASS",
        "clarity_and_accessibility": "PASS",
        "pass_count": 9,
        "fail_count": 0
      },
      "final_score": 0.904
    },
    "text1": {
      "math_content_evaluator": {
        "overall_score": 0.778,
        "overall_rating": "ACCEPTABLE",
        "pass_count": 7,
        "fail_count": 2
      },
      "text_content_evaluator": {
        "overall": 0.957,
        "correctness": 1.0,
        "grade_alignment": 0.95,
        "language_quality": 0.9,
        "pedagogical_value": 0.95,
        "explanation_quality": 1.0,
        "di_compliance": 0.9,
        "instruction_adherence": 0.95,
        "query_relevance": 1.0,
        "recommendation": "accept",
        "issues": [],
        "strengths": ["Clear conceptual explanation", "Age-appropriate language"],
        "suggested_improvements": ["Add more real-world examples"],
        "di_scores": {...}
      },
      "final_score": 0.867
    },
    "article1": {
      "article_holistic_evaluator": {
        "pedagogical_coherence": 0.85,
        "content_organization": 0.90,
        "scaffolding_quality": 0.80,
        "engagement": 0.75,
        "mixed_media_integration": 0.85,
        "learning_objectives_clarity": 0.80,
        "grade_appropriateness": 0.90,
        "completeness": 0.85,
        "cognitive_load_management": 0.80,
        "instructional_clarity": 0.85,
        "overall": 0.835,
        "recommendation": "accept",
        "issues": [
          "Some transitions between sections could be smoother",
          "Question 1 appears slightly before full concept explanation"
        ],
        "strengths": [
          "Excellent use of visual diagrams to support text",
          "Clear learning progression from basic to advanced",
          "Engaging real-world examples throughout"
        ],
        "suggested_improvements": [
          "Add transitional sentences between sections",
          "Consider moving Question 1 after more detailed concept introduction"
        ]
      },
      "text_content_evaluator": {
        "overall": 0.89,
        "correctness": 0.95,
        "grade_alignment": 0.90,
        "language_quality": 0.85,
        "pedagogical_value": 0.90,
        "explanation_quality": 0.88,
        "di_compliance": 0.87,
        "instruction_adherence": 0.92,
        "query_relevance": 0.95
      },
      "embedded_questions": {
        "q1": {
          "ti_question_qa": {
            "overall": 0.87,
            "recommendation": "accept"
          },
          "answer_verification": {
            "is_correct": true,
            "confidence": 9
          }
        },
        "q2": {
          "ti_question_qa": {
            "overall": 0.90,
            "recommendation": "accept"
          },
          "answer_verification": {
            "is_correct": true,
            "confidence": 10
          }
        }
      },
      "images": {
        "img1": {
          "image_quality_di_evaluator": {
            "overall": 0.95,
            "score": 95,
            "recommendation": "accept"
          }
        }
      },
      "final_score": 0.880
    }
  },
  "evaluation_time_seconds": 38.76
}

Full mode includes: Detailed dimension scores, issues, strengths, recommendations, DI compliance breakdowns, and section-level evaluations. For articles (NEW in v1.4.0): Includes holistic evaluation across 10 dimensions plus component-level evaluations for text, embedded questions, and images.
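
Because verbose results nest evaluators at different depths (articles carry embedded_questions and images), a recursive walk is a convenient way to gather every issues and suggested_improvements list. The sketch below assumes only the key names visible in the sample; `collect_feedback` is a hypothetical helper.

```python
def collect_feedback(node, path=()):
    """Yield (path, key, message) for every issue or suggested improvement."""
    if not isinstance(node, dict):
        return
    for key, value in node.items():
        if key in ("issues", "suggested_improvements") and isinstance(value, list):
            for message in value:
                yield "/".join(path), key, message
        else:
            # Descend into nested evaluators, embedded questions, and images.
            yield from collect_feedback(value, path + (key,))


sample = {
    "evaluations": {
        "text1": {
            "text_content_evaluator": {
                "issues": [],
                "suggested_improvements": ["Add more real-world examples"],
            }
        }
    }
}
for where, kind, message in collect_feedback(sample):
    print(f"{where} [{kind}] {message}")
```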

How It Works

Simple 3-Step Process:

Prepare your content → Structure your questions, text content, or articles in JSON format (include image_url for visual content)
Add routing parameters → Optionally specify subject, grade, and type for better evaluation routing
Send to API → POST to https://api.inceptapi.com/evaluate and receive comprehensive quality scores

The evaluator automatically selects and runs the appropriate evaluation methods based on your content and parameters. No manual configuration needed.

Article Evaluation (v1.4.0): Complete educational documents evaluated holistically as unified pedagogical experiences, with automatic component evaluation for text, questions, and images.

Automatic Image Detection (v1.3.0): If any content includes an image_url, image quality evaluation is automatically enabled using DI rubric-based assessment.

API Endpoint: https://api.inceptapi.com/evaluate

Current Version: InceptBench v1.4.0

Resources

API Documentation: Contact support for API key and detailed documentation

For questions or support, please contact the Incept team.
