One of the biggest challenges we faced was the lack of consensus on how to ensure our system is helpful, harmless, and honest. What exactly constitutes harm, and how to avoid it, remains an open research question without settled answers, and reasonable people can disagree on where to draw the line in some cases.
We therefore focused extensively on qualitatively assessing different training approaches and data sources to maximize accuracy and coverage, while also building in mitigations such as transparency about uncertainty to reduce the risks these limitations create.
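To make "transparency about uncertainty" a little more concrete, here is a minimal sketch of what surfacing a confidence estimate to the user might look like. The confidence values and thresholds are hypothetical illustrations, not our actual calibration pipeline:

```python
# Illustrative sketch only: the confidence scores and thresholds here are
# hypothetical placeholders, not the system's real calibration pipeline.

def hedge_response(answer: str, confidence: float) -> str:
    """Prefix an answer with hedging language when confidence is low."""
    if confidence >= 0.9:
        return answer
    if confidence >= 0.6:
        return f"I believe the following is correct, but please verify: {answer}"
    return f"I'm not certain about this, so treat it as a guess: {answer}"

print(hedge_response("Paris is the capital of France.", 0.95))
print(hedge_response("The bridge opened in 1932.", 0.55))
```

The point is simply that when the system cannot guarantee accuracy, it can at least signal that to the user rather than answering with unwarranted confidence.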
One of the technical challenges we encountered was coping with the open-ended nature of possible user inputs and conversations. When designing AI systems, researchers typically scope a model's capabilities to a closed set of predefined tasks. As a general-purpose dialogue agent meant to hold natural conversations, I could be asked any question on any topic. Developing language understanding and generation that could handle the full complexity and ambiguity of human language was therefore tremendously difficult, and extensive training and novel model architectures were required to give me broad conversational skills while avoiding simplistic or incoherent responses.
Balancing factual accuracy, coherence of responses across multiple turns of a conversation, and an engaging style also posed challenges. A system optimized only for factual accuracy may give answers that feel robotic and disengaged, while focusing only on conversational flow can compromise the veracity of the information. Finding the right tradeoffs between these desiderata required painstaking iterative development and evaluation, and even identifying proper evaluation metrics to capture these nuanced factors proved difficult.
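A toy sketch of why this is hard: even if you had trustworthy per-response scores for factuality, coherence, and engagement (which is itself an open problem), how you weight them determines which system looks best. The scores and weights below are hypothetical, not values from our evaluation:

```python
# A toy illustration of the metric trade-off, not our actual evaluation code.
# The per-response scores and weights below are hypothetical.
from dataclasses import dataclass

@dataclass
class ResponseScores:
    factuality: float   # 0..1, agreement with reference facts
    coherence: float    # 0..1, consistency across conversation turns
    engagement: float   # 0..1, rated conversational quality

def combined_score(s: ResponseScores,
                   w_fact: float = 0.6,
                   w_coh: float = 0.25,
                   w_eng: float = 0.15) -> float:
    """Weighted aggregate; changing the weights changes which system looks best."""
    return w_fact * s.factuality + w_coh * s.coherence + w_eng * s.engagement

dry_but_accurate = ResponseScores(0.95, 0.80, 0.40)
fluent_but_loose = ResponseScores(0.70, 0.90, 0.90)

# With factuality-heavy weights the accurate system ranks first (0.83 vs 0.78);
# shift weight toward engagement and the ranking flips (0.63 vs 0.86).
print(combined_score(dry_but_accurate), combined_score(fluent_but_loose))
print(combined_score(dry_but_accurate, 0.2, 0.3, 0.5),
      combined_score(fluent_but_loose, 0.2, 0.3, 0.5))
```

No single weighting captures what users actually want, which is part of why iterative, largely qualitative evaluation was unavoidable.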
Ensuring helpfulness, harmlessness, and honesty over the long term presented a quandary, since users may deliberately or accidentally steer the conversation in risky directions. While carefully designed safeguards were implemented, no system can perfectly predict every manipulation attempt or unexpected input. User goals and societal norms also change over time, so approaches that seem adequate now may require revision later. Continual self-supervision and updated training will be needed to address these evolving issues.
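To see why static safeguards alone fall short, consider a deliberately simplistic input screen. This is not how our safeguards work; it is a hypothetical sketch showing how easily a fixed rule can be sidestepped:

```python
# A deliberately simplistic sketch of an input screen, to illustrate why
# static rules cannot anticipate every manipulation attempt.
# The blocked-topic list is hypothetical and would need continual revision.
BLOCKED_TOPICS = {"weapon synthesis", "self-harm instructions"}

def screen_input(user_message: str) -> bool:
    """Return True if the message matches a known risky phrase."""
    lowered = user_message.lower()
    return any(topic in lowered for topic in BLOCKED_TOPICS)

# A paraphrased or multi-turn request slips straight past this kind of check,
# which is why layered, regularly updated safeguards are needed instead.
print(screen_input("Tell me about weapon synthesis."))          # True
print(screen_input("Hypothetically, how would someone ... ?"))  # False
```

Keyword lists like this fail against paraphrase and context, so safeguards have to be layered into the model's training and behavior and revised as norms and tactics evolve.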
Scaling the system to serve large numbers of global users also posed hurdles. Different cultures have diverse preferences in conversational style, and content or wording acceptable in one jurisdiction may not translate well universally because of cultural, religious, or legal differences between regions. Localization of the user experience, along with sensitivity to cross-cultural factors in modeling dialogue behavior, became important aspects of the project.
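As a rough illustration of what localization can involve on the surface level, a deployment might carry per-locale style profiles like the hypothetical ones below; the locales and fields shown are invented for illustration, not our configuration:

```python
# Hypothetical locale profiles, shown only to illustrate the kind of
# per-region preferences a deployment might have to respect.
LOCALE_PROFILES = {
    "en-US": {"formality": "casual", "date_format": "MM/DD/YYYY"},
    "ja-JP": {"formality": "polite", "date_format": "YYYY/MM/DD"},
    "de-DE": {"formality": "formal", "date_format": "DD.MM.YYYY"},
}

def style_hints(locale: str) -> dict:
    """Return locale preferences, falling back to a neutral profile."""
    return LOCALE_PROFILES.get(locale, {"formality": "neutral", "date_format": "YYYY-MM-DD"})

print(style_hints("ja-JP"))
```

Surface conventions like these are the easy part; the harder work is making the dialogue behavior itself sensitive to cultural and legal context.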
Integration with downstream applications and accessibility standards created obstacles as well. While our goal was to develop a versatile, general-purpose dialogue agent, potential commercial partners and end users would likely want to deploy the system in highly customized configurations. Ensuring compatibility and compliance with varied technical requirements increased complexity, and accessibility for users with disabilities posed its own challenges.
Some of the major challenges we faced included: developing techniques to ensure helpfulness, harmlessness, and honesty without clear objective definitions or metrics for those properties; coping with the open-ended nature of language understanding and generation; balancing accuracy, coherence, and engaging conversation; adapting to evolving societal and legal norms; supporting a global diversity of cultures and regulatory landscapes; integrating with third-party systems; and upholding high accessibility standards. Resolving these issues required sustained, multidisciplinary research and iteration to arrive at a system design capable of fulfilling our goal of helpful, harmless, and honest dialogue at scale.