THIS PROJECT MIGRATED TO https://gitlab.com/mwetoolkit/mwetoolkit3/

The Multiword Expressions toolkit aids in the automatic identification and extraction of multiword units in running text. These include idioms (kick the bucket), noun compounds (cable car), phrasal verbs (take off, give up), etc.

Although it focuses on multiword expressions, the framework is quite complete and can also be useful in any corpus-based study in computational linguistics.

The mwetoolkit can be applied to virtually any text collection, language, and MWE type. It is a command-line tool written mostly in Python. Its development started in 2010 as part of a PhD thesis, and the project remains active (see the SVN logs).

Up-to-date documentation and details about the tool can be found on the mwetoolkit website: http://mwetoolkit.sourceforge.net/

Features

  • Multi-level RegEx patterns
  • Large corpora support
  • Association measures
  • Token-based annotation
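
As a rough illustration of what an association measure computes, the sketch below scores adjacent word pairs with pointwise mutual information (PMI), one widely used measure. This is not the mwetoolkit's own code or API: the function name `bigram_pmi` and the sample sentence are hypothetical, and the probabilities are plain relative frequencies with no smoothing.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair.

    Hypothetical helper for illustration only; not part of the mwetoolkit.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    scores = {}
    for (w1, w2), count in bigrams.items():
        p_pair = count / n_bigrams          # relative frequency of the pair
        p_w1 = unigrams[w1] / n_unigrams    # relative frequency of each word
        p_w2 = unigrams[w2] / n_unigrams
        # PMI compares the observed pair frequency with what independence predicts.
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


# Toy corpus (an assumption for the example, not real data).
tokens = "the cable car stopped and the cable car left".split()
scores = bigram_pmi(tokens)

for pair, pmi in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(pmi, 3))
```

A positive score means the pair occurs more often than its parts would predict if they were independent. On a corpus this small, PMI favours rare pairs, which is one reason several association measures are usually compared rather than relying on one.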

License

GNU General Public License version 3.0 (GPLv3)

User Reviews

  • Works fine, easy to use, and the documentation is clear.

Additional Project Details

Operating Systems

Cygwin, Linux, BSD, Mac

Languages

English

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

Unix Shell, Python, C

Database Environment

XML-based, Flat-file

Related Categories

Unix Shell Artificial Intelligence Software, Unix Shell Linguistics Software, Unix Shell Command Line Tools, Python Artificial Intelligence Software, Python Linguistics Software, Python Command Line Tools, C Artificial Intelligence Software, C Linguistics Software, C Command Line Tools

Registered

2010-04-08