Persian Dependency Treebank (PerDT)

 

The Persian Dependency Treebank is a collection of approximately 30,000 Persian sentences annotated with syntactic and morphological information. It is intended for use in natural language processing and computational linguistics.

 

 

Documentation:

 

- Mohammad Sadegh Rasooli, Manouchehr Kouhestani, and Amirsaeid Moloodi. (2013). Development of a Persian Syntactic Dependency Treebank. In The 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT), Atlanta, USA.

 

 

Related Tools:

 

Dadegan Search - An online tool for exploring the Persian Dependency Treebank and the Valency Lexicon for Persian Verbs

 

MST parser implementation in C#

 

Persian Dependency Treebank Normalization Script - A simple Python program that joins multi-word verbs in the Persian Dependency Treebank into single tokens, for example changing «گفته می‌شود» to «گفته_می‌شود». (This converts the treebank data to the standard CoNLL format, in which white space is not allowed inside a token.)
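The kind of normalization described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the actual script: it assumes one token per line with tab-separated columns, and the function names normalize_conll_line and normalize_conll_text are hypothetical.

```python
def normalize_conll_line(line: str) -> str:
    """Replace spaces inside each tab-separated field with underscores.

    Assumes one token per line with tab-separated columns; this layout is an
    assumption for illustration, not taken from the actual script.
    """
    stripped = line.rstrip("\r\n")
    if not stripped.strip():
        # Blank lines separate sentences in CoNLL-style files; keep them empty.
        return ""
    fields = stripped.split("\t")
    # Only the ASCII space is replaced, so the zero-width non-joiner used
    # inside Persian words such as «می‌شود» is left untouched.
    return "\t".join(field.strip().replace(" ", "_") for field in fields)


def normalize_conll_text(text: str) -> str:
    """Apply normalize_conll_line to every line of a treebank file's text."""
    return "\n".join(normalize_conll_line(line) for line in text.split("\n"))
```

For instance, normalize_conll_line("1\tگفته می‌شود\tV") would return the same line with the verb written as «گفته_می‌شود», while the tab separators between columns stay in place.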

 

 

Project Members:

·         Project Head and Computational Linguistic Research

o   Mohammad Sadegh Rasooli: MSc., AI, Iran University of Science and Technology

·         Linguistic Research and Instruction

o   Manouchehr Kouhestani: PhD candidate, Linguistics, Tarbiat Modares University

o   Amirsaeid Moloodi: PhD candidate, Linguistics, University of Tehran

·         Linguistic Annotation

o   Farzaneh Bakhtiary: MA student, Linguistics, University of Tehran

o   Parinaz Dadras: MA student, Linguistics, University of Tehran

o   Maryam Faal-Hamedanchi: PhD, Linguistics, Peoples' Friendship University of Russia

o   Saeedeh Ghadrdoost-Nakhchi: MA student, Linguistics, University of Tehran

o   Mostafa Mahdavi: PhD candidate, Linguistics, Institute for Humanities and Cultural Studies, Tehran

o   Azadeh Mirzaei: PhD candidate, Linguistics, Allameh Tabatabaei University, Tehran

o   Sahar Oulapoor: MA, Linguistics, University of Tehran

o   Neda Poormorteza-Khameneh: MA, Persian Language and Literature, Islamic Azad University

o   Morteza Rezaei: MA student, Computational Linguistics, Sharif University of Technology

o   Sude Resalatpoo: MA, Linguistics, Islamic Azad University

o   Fatemeh Shafie: MA, Linguistics, University of Tehran

o   Salimeh Zamani: MA, Linguistics, Islamic Azad University

·         Programming Support

o   Seyed Mahdi Hoseini: MSc., AI, Iran University of Science and Technology

o   Alireza Noorian: MSc. student, AI, Iran University of Science and Technology

o   Yasser Souri: MSc. student, AI, Sharif University of Technology

·         Web Support

o   Mohsen Hossein-Alizadeh: Web developer, SCICT

 

(Contact us for more information about the treebank and for inquiries about receiving the data.)