
Designing a machine translation system for Canadian weather warnings: A case study

Published online by Cambridge University Press: 30 January 2013

FABRIZIO GOTTI
Affiliation:
RALI-DIRO – Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec, Canada H3C 3J7
PHILIPPE LANGLAIS
Affiliation:
RALI-DIRO – Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec, Canada H3C 3J7
GUY LAPALME
Affiliation:
RALI-DIRO – Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec, Canada H3C 3J7

Abstract

In this paper we describe the many steps involved in building a production-quality Machine Translation system for translating weather warnings between French and English. Although this task may seem straightforward in principle, the details, especially corpus preparation and final text presentation, involve many difficulties that are often glossed over in the literature. In addition to results on the classic Statistical Machine Translation evaluation metrics, we report four manual evaluations performed to assess and improve translation quality. We also show that integrating out-of-domain information sources into a Statistical Machine Translation system helps produce high-quality translated text.

Type
Articles
Copyright
Copyright © Cambridge University Press 2013 

