Communication is a fundamental step in the process of political representation, and an influential stream of research hypothesizes that male and female politicians talk to their constituents in very different ways. To build the broad dataset needed to test this hypothesis, we harness the massive trove of communication by American politicians on Twitter. We adopt a supervised learning approach that begins with the hand coding of over 10,000 tweets and then uses these coded tweets to train machine learning algorithms that categorize the full corpus of over three million tweets sent by the lower-house state legislators who were serving in the summer of 2017. Our results provide insights into politicians’ behavior and into the consequences of women’s underrepresentation for what voters learn about legislative activity.