Fineweb-Edu-Chinese-v2.2

License: Apache-2.0

Steward: OpenCSG

Task: LLM

Release Date: 2/17/2026

Format: parquet

Size: 624.68 MB

Description

Fineweb-Edu-Chinese v2.2 is an updated, Fineweb-derived dataset of refined Chinese educational web content. This release improves content quality and broadens coverage of education stages, making it suited to training education-focused LLMs and building educational AI tools. The dataset is available at www.opencsg.com.

Specifics

Licensing

Apache License 2.0 (Apache-2.0)

https://spdx.org/licenses/Apache-2.0.html

Considerations

Restrictions/Special Constraints

This dataset is intended solely for academic research and non-commercial educational purposes. Users must attribute the dataset as "Fineweb-Edu-Chinese v2.2 (provided via www.opencsg.com)" in any derivative works or publications, and comply with the associated open-source license terms.

Forbidden Usage

Commercial profit-making activities (e.g., integrating the dataset into paid educational products or services); generating harmful, illegal, discriminatory, or misleading educational content; infringing upon the legitimate rights and interests of any individual or organization; and any usage that violates applicable laws, regulations, or ethical guidelines.

Metadata