Corpora
The Applied Linguistics WWW Virtual Library - Data Archives index
WWW Sites of Interest
Machine Learning for Information Retrieval (html)
Information Extraction
Newswires
Comprehensive index of on-line news sources (US, Canadian, Scientific, Business, Comics)
US on-line news sources (index #1)
News stories, newspapers, other news sources (International + US)
WWW Media/News Resources
US on-line news sources (index #2)
Annotated list of resources on statistical natural language processing and corpus-based computational linguistics
The Applied Linguistics WWW Virtual Library home page
Leeds NLP resources page