About MICUSP

The Michigan Corpus of Upper-level Student Papers (MICUSP), compiled at the English Language Institute of the University of Michigan, Ann Arbor, is a new corpus of student academic writing samples. The corpus, the first of its kind in North America, will enable corpus researchers and English for Academic Purposes (EAP) teachers and testers to investigate the written discourse of highly proficient, advanced-level native and non-native speaker student writers at a large American research university.

The corpus was made freely available to the global research and teaching community through a simple online search and browse interface in December 2009. A more complex search interface and an XML-annotated corpus version for offline use are currently under development.

MICUSP consists of around 830 papers (roughly 2.6 million words) of different types (e.g. essays, reports, response papers) from a total of 16 disciplines within four academic divisions (Humanities and Arts, Social Sciences, Biological and Health Sciences, and Physical Sciences). All papers included in MICUSP were written by final-year undergraduate and graduate students and received an A grade.

Each of the papers in MICUSP has been marked up in XML and maintains the structural divisions (sections, headings, paragraphs) of the original paper. A file header added to each MICUSP file includes, among other things, information about the discipline and the student’s level, native-speaker status, and sex. This makes it possible to carry out customized searches in subsections of the corpus, e.g. only in Biology papers written by native-speaker final-year undergraduate students.
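As a rough illustration of how header metadata can drive such a customized search, the following Python sketch selects papers by discipline, level, and native-speaker status. The element names (`paper`, `header`, `discipline`, `level`, `nativeness`, `sex`), the field values, and the sample files are all hypothetical and are not taken from the actual MICUSP markup; the helper functions `read_header` and `select_papers` are likewise invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample files. The element names and values below are
# assumptions made for illustration, not the real MICUSP schema.
SAMPLES = {
    "paper_a.xml": """<paper>
  <header>
    <discipline>Biology</discipline>
    <level>final-year undergraduate</level>
    <nativeness>native speaker</nativeness>
    <sex>female</sex>
  </header>
  <body><section><heading>Introduction</heading><p>Sample text.</p></section></body>
</paper>""",
    "paper_b.xml": """<paper>
  <header>
    <discipline>Biology</discipline>
    <level>graduate</level>
    <nativeness>non-native speaker</nativeness>
    <sex>male</sex>
  </header>
  <body><section><heading>Methods</heading><p>Sample text.</p></section></body>
</paper>""",
    "paper_c.xml": """<paper>
  <header>
    <discipline>Linguistics</discipline>
    <level>final-year undergraduate</level>
    <nativeness>native speaker</nativeness>
    <sex>male</sex>
  </header>
  <body><section><heading>Overview</heading><p>Sample text.</p></section></body>
</paper>""",
}


def read_header(xml_text):
    """Return the header fields of one paper as a dict (empty if no header)."""
    root = ET.fromstring(xml_text)
    header = root.find("header")
    if header is None:
        return {}
    return {child.tag: (child.text or "").strip() for child in header}


def select_papers(papers, **criteria):
    """Return the sorted names of papers whose header matches every criterion."""
    selected = []
    for name, xml_text in papers.items():
        header = read_header(xml_text)
        if all(header.get(key) == value for key, value in criteria.items()):
            selected.append(name)
    return sorted(selected)


# Restrict the search to Biology papers by native-speaker
# final-year undergraduates, as in the example in the text.
subset = select_papers(
    SAMPLES,
    discipline="Biology",
    level="final-year undergraduate",
    nativeness="native speaker",
)
print(subset)
```

Any text search would then be run only over the files in `subset`. Matching on exact strings keeps the sketch short; a real header may encode these fields differently (for example as attributes or codes), in which case `read_header` would need to be adapted.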

The project was launched in late 2004 by Rita Simpson-Vlach and John Swales. From 2005 to 2007 the project was managed by Annelie Ädel. The current MICUSP project director is Ute Römer.
