Unifying AI Tutor Evaluation - An Evaluation Taxonomy for Pedagogical Ability Assessment of LLM-Powered AI Tutors
Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring
Essential-Web v1.0 - 24T tokens of organized web data
Ask-Before-Detection - Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions
AnnoLLM - Making Large Language Models to Be Better Crowdsourced Annotators
FinerWeb-10BT - Refining Web Data with LLM-Based Line-Level Filtering