openbook.gr v1.0

License: CC-BY-NC-SA-4.0

Steward: EELLAK - GreekFOSS

Task: NLP

Release Date: 1/27/2026

Format: Markdown (.md)

Size: 251.63 MB


Description

This dataset is a comprehensive corpus of Greek digital books aggregated from OpenBook.gr. Since its launch in 2010, the OpenBook platform has served as a central hub for the Greek open-access movement. The corpus covers a wide variety of genres and formats and includes only content that can be legally and freely distributed. It is intended as a resource for Natural Language Processing (NLP), linguistic analysis, and the preservation of Greek digital heritage, keeping both historical public-domain texts and modern creative works accessible for computational study.
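
Because the corpus is distributed as Markdown (.md) files, a first pass for NLP work might simply walk the files and count word frequencies. The following is a minimal sketch, not an official loader: it assumes the dataset has been downloaded and unpacked into a local directory of UTF-8 `.md` files, and the function names and the simple letter-run tokenizer are illustrative choices, not part of the dataset.

```python
import re
import unicodedata
from collections import Counter
from pathlib import Path

# Runs of Unicode letters only (no digits or underscores); this covers Greek.
WORD_RE = re.compile(r"[^\W\d_]+")


def iter_markdown_files(root):
    """Yield every .md file under `root` (recursively), in sorted order."""
    yield from sorted(Path(root).rglob("*.md"))


def count_words(root):
    """Return a Counter of lowercased word frequencies across all .md files."""
    counts = Counter()
    for path in iter_markdown_files(root):
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        text = path.read_text(encoding="utf-8", errors="replace")
        # NFC keeps accented Greek letters as single code points, so words
        # are not split at combining accent marks.
        text = unicodedata.normalize("NFC", text)
        counts.update(word.lower() for word in WORD_RE.findall(text))
    return counts
```

For example, `count_words("openbook_corpus").most_common(20)` would list the twenty most frequent word forms, where `openbook_corpus` is a hypothetical name for the local directory. Markdown markup such as headings and link syntax is not stripped in this sketch, so a real pipeline would likely add a cleaning step first.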

Specifics

Licensing

Creative Commons Attribution Non Commercial Share Alike 4.0 International (CC-BY-NC-SA-4.0)

https://spdx.org/licenses/CC-BY-NC-SA-4.0.html

Considerations

Restrictions/Special Constraints

Must comply with the license

Forbidden Usage

Commercial use (the license permits non-commercial use only)

Metadata

This dataset is a comprehensive corpus of Greek digital books collected from OpenBook.gr. It includes a wide range of legally and freely distributable genres and formats, supporting NLP research, linguistic analysis, and the preservation of both historical and contemporary Greek digital literature.