Persian Dependency Treebank (PerDT)

A collection of approximately 30,000 Persian sentences annotated with syntactic and morphological information, intended for use in natural language processing and computational linguistics.

Documentation: Persian Dependency Treebank Annotation Manual and User Guide

Related Tools:

MST parser implementation in C#

Persian Dependency Treebank Normalization Script - A simple Python program that rewrites multi-word verbs such as «گفته می‌شود» as «گفته_می‌شود» in the Persian Dependency Treebank. This converts the treebank data to the standard CoNLL format, in which white space is not allowed inside a token.
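
The kind of normalization described above can be sketched in Python as follows. This is only an illustration, not the actual script: it assumes the treebank is stored as tab-separated CoNLL columns, that sentences are separated by blank lines, and that any space inside a column belongs to a multi-word token. The function names are hypothetical.

```python
def normalize_conll_line(line: str) -> str:
    """Replace spaces inside tab-separated fields with underscores.

    Assumption: columns are separated by tabs, so a space can only
    occur inside a field (for example in a multi-word verb form).
    """
    stripped = line.rstrip("\n")
    if not stripped.strip():
        # Blank line: assumed to be a sentence separator, kept empty.
        return ""
    fields = stripped.split("\t")
    # Trim stray edge spaces first so they do not turn into underscores.
    # The zero-width non-joiner (U+200C) is not a space and is left as-is.
    fields = [field.strip(" ").replace(" ", "_") for field in fields]
    return "\t".join(fields)


def normalize_conll(text: str) -> str:
    """Apply the per-line normalization to a whole CoNLL document."""
    return "\n".join(normalize_conll_line(line) for line in text.splitlines())
```

For example, a line whose form column holds «گفته می‌شود» would come out with «گفته_می‌شود» in that column, while the tab separators and the other columns stay unchanged.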