Linguistic synesthesia, a productive form of figurative language, has received little attention in Natural Language Processing (NLP). Although linguistic synesthesia resembles metaphor in that it involves conceptual mappings and is useful for NLP tasks such as sentiment analysis and stance detection, well-studied metaphor detection methods cannot be applied directly to the detection of linguistic synesthesia. This study incorporates comprehensive linguistic features (i.e., character and radical information, word segmentation information, and part-of-speech tags) into a neural model to automatically detect synesthetic usages in a sentence. In particular, we employ a span-based boundary detection model to extract sensory words. In addition, we propose a joint model that detects the original and synesthetic modalities of the sensory words together. In our experiments, the model achieves state-of-the-art results on the dataset for linguistic synesthesia detection. The results indicate that culturally enriched linguistic features and joint learning are both effective for linguistic synesthesia detection. Furthermore, because the proposed model leverages non-language-specific linguistic features, it could be applied to the detection of linguistic synesthesia in other languages.