Research & Development

My background spans both computational linguistics research and NLP-driven product development, with a strong focus on semantic search.

Academic

During my academic studies in the Computational Linguistics group of the Department of Computer Science at the University of Toronto, I focused on representations of partial orders, and in particular on the structural analysis of type hierarchies that linguists design for unification-based grammar parsing and generation systems. This work resulted in the following publications:

  • Rouzbeh Farahmand: Analysis of Near-Meet-Semilattices for Typed Unification-based Grammar Design. M.Sc. Thesis, University of Toronto, January 2010. (see below or download a copy)

  • Rouzbeh Farahmand, Gerald Penn: Flexible Structural Analysis of Near-Meet-Semilattices for Typed Unification-based Grammar Design. COLING 2012: 24th International Conference on Computational Linguistics, 8-15 December 2012, IIT Bombay, Mumbai, India. (see ACL anthology for a copy)

It was also during my M.Sc. studies that I was fortunate enough to be exposed to some interesting concepts and problems in logic and set theory, specifically finding the cardinalities of special sets that are related to hard problems such as Dedekind's problem.

Industry

In industry, I have directed and contributed to numerous R&D efforts in applied NLP across domains such as finance, construction tech, and social listening. Key areas include:

  • Query Language Design: Designed custom query languages and interpreters using tools like ANTLR, including pipelines for translating user-defined logic into performant Elasticsearch queries.

  • Semantic Search & Information Retrieval: Designed and deployed production-grade hybrid search architectures that combine symbolic domain ontologies, vector embeddings, and transformer models to support semantic understanding. Also wrote white papers on emerging conversational search paradigms.

  • Natural Language Processing (NLP): Developed and operationalized NLP systems for tasks such as named entity recognition, sentiment and emotion analysis, auto-correction, and multilingual processing.

  • Sentiment & Emotion Analysis: Built and refined NLP pipelines to detect nuanced emotional signals in text, especially in the context of social media and customer feedback analysis.

My ongoing intellectual interests include:

  • New Paradigms of Search: Particularly Retrieval-Augmented Generation (RAG), which I’ve actively prototyped and deployed in both customer-facing and enterprise-facing applications. I am especially interested in the interplay between vector retrieval, semantic routing, and LLM-based generation in building contextualized and explainable systems.

  • Distributional Semantics: Investigating how meaning is encoded in vector space models and how these models can be leveraged for topic modeling and semantic similarity, and applied to information retrieval.

  • Early Language Acquisition: Exploring how children acquire syntactic and semantic structures with limited supervision and data.