This project aims to determine whether students' writing patterns reflect the evaluations they receive from attendings in a fourth-year emergency medicine clerkship. Our underlying conceptual hypothesis is that students reflect their attitude toward learning in their communication and that attendings evaluate medical students mostly on their attitude. (This assumes a minimal amount of medical knowledge; actual clinical knowledge was assessed on a separate test.)
We used Biterm Topic Modeling (Yan et al., 2013) to identify combinations of words that occur in the same context.
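
To make "combinations of words that occur in the same context" concrete, here is a minimal sketch of biterm extraction, assuming each comment has already been tokenized and that the whole comment counts as one context. The function name `extract_biterms` and the sample tokens are hypothetical and are not taken from this repository.

```python
from itertools import combinations


def extract_biterms(tokens):
    """Return every unordered pair of tokens from one short document."""
    # Sort each pair so that ("to", "learn") and ("learn", "to")
    # are counted as the same biterm.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


# Hypothetical tokenized comment, for illustration only.
comment = ["eager", "to", "learn"]
biterms = extract_biterms(comment)
print(biterms)
```

A comment with n tokens yields n(n-1)/2 biterms, so short comments still contribute several co-occurrence observations to the model.
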
The model calculates three matrices:
- *.pw_z : A matrix of topics x words where entry ij indicates the posterior probability that word j is associated with topic i.
- *.pz : A vector of topics where entry i indicates the prior probability of topic i.
- *.pz_d : A matrix of documents x topics where entry ij indicates the proportion to which topic j contributes to document i.
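
As a rough illustration of how these outputs might be read, the sketch below finds the highest-probability words for each topic from a topics x words matrix and the dominant topic for each document from a documents x topics matrix. It assumes the matrices have already been parsed into nested lists of floats; the toy values, the `vocab` list, and the helper names `top_words` and `dominant_topic` are hypothetical.

```python
def top_words(pw_z, vocab, n=3):
    """For each topic (row of pw_z), return the n most probable words."""
    result = []
    for row in pw_z:
        # Rank word indices by descending probability under this topic.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        result.append([vocab[j] for j in ranked[:n]])
    return result


def dominant_topic(pz_d):
    """For each document (row of pz_d), return the index of its largest topic."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in pz_d]


# Toy data for illustration only: 2 topics, 4 words, 2 documents.
vocab = ["eager", "learn", "late", "quiet"]
pw_z = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0
    [0.10, 0.10, 0.50, 0.30],  # topic 1
]
pz_d = [
    [0.9, 0.1],  # document 0
    [0.2, 0.8],  # document 1
]

print(top_words(pw_z, vocab, n=2))
print(dominant_topic(pz_d))
```

Reading the top words per topic is a common first step in interpreting a topic, and the dominant topic per document gives a simple way to group comments.
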
##Quickstart
python setup.py
##Analysis
Comments are stored in comments.csv.
##References
Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Xueqi Cheng. A Biterm Topic Model for Short Texts. WWW 2013.