Key takeaways
- Domain recommendations face unique challenges: inventory shrinks permanently as names are registered, popular TLDs like .COM have under 5% availability for common keywords, and new users lack the historical data traditional systems rely on.
- GoDaddy's ReTiRe system closes the feedback loop within seconds rather than days, capturing search patterns, cart behavior, and TLD preferences to personalize recommendations mid-session — even for first-time visitors.
- The team fine-tuned LLaMA-3 8B using LoRA and INT8 quantization, enabling cost-effective training while serving over a million daily requests at sub-150ms latency.
At GoDaddy, we process millions of domain searches daily across our platform serving over 20 million customers worldwide. With hundreds of millions of domain names already registered globally and tens of thousands more registered every day, helping customers find the perfect available domain has never been more important. In this high-volume environment, three challenges emerged that traditional recommendation systems struggled to address.
- Unique inventory constraints: With over 360 million registered domain names globally and the number growing by tens of thousands daily, each domain name is unique and, once registered, becomes permanently unavailable. This creates a constantly shrinking pool of available options, necessitating real-time relevance and timeliness in our suggestions.
- Imbalanced demand: Popular TLDs, particularly .COM, represent over 40% of all registered domains, yet remain the most sought-after option. With .COM availability rates often below 5% for common keywords, finding desirable and available names requires sophisticated personalization that goes beyond simple keyword matching.
- Complex personalization needs: Our diverse customer base across different regions and industries requires nuanced personalization. A startup in Silicon Valley might prioritize brandable .COM domains, while a small business in Europe may prefer country-code TLDs or industry-specific extensions. Each customer's preferences, search history, and intent patterns demand a tailored approach.
Additionally, catering to new users with little or no historical data adds another layer of complexity. Traditional recommendation systems that rely heavily on user history fall short when serving first-time visitors who represent a significant portion of our search traffic.
Recognizing these challenges, we embarked on a journey to develop a solution that addresses the complex demands of domain name personalization. This led to the creation of Personalized Generative AI (PGen AI), a system designed to provide highly personalized domain name suggestions.
What is PGen AI?
PGen AI is a production-grade system that leverages a fine-tuned small language model (SLM) integrated with GoDaddy's existing domain search infrastructure. At its core, PGen AI combines a fine-tuned 8B SLM with our proprietary domain datasets and real-time customer signal processing to deliver highly personalized domain name recommendations. Approximately 400 million recommendations are precomputed in a one-time batch process based on customer interests and intent patterns, then inserted into a vector store that is leveraged during real-time inference. This batch precomputation process can be rerun as needed to refresh the recommendations. This system currently handles more than a million requests per day with an average latency of under 150ms, making it an important component of GoDaddy's domain search experience.
The architecture integrates the SLM component seamlessly with our existing recommendation pipeline. When a customer submits a search query, the system routes it through our Real-Time Relevance (ReTiRe) system, which processes customer signals such as intent patterns, interaction history, and geographic preferences. These signals are then fed into the fine-tuned 8B parameter model, which generates domain suggestions that are subsequently filtered through our vector database for availability checking and relevance ranking. This integration ensures that the SLM's generative capabilities work in harmony with our existing domain inventory and availability systems.
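The request flow described above can be sketched as follows. This is a minimal illustration rather than GoDaddy's actual code: `SessionSignals`, `generate_candidates`, and `filter_available` are hypothetical stand-ins for the ReTiRe signal payload, the fine-tuned SLM, and the vector-database availability pass, respectively.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSignals:
    """Illustrative subset of the customer signals ReTiRe might surface."""
    recent_keywords: list = field(default_factory=list)
    preferred_tlds: list = field(default_factory=list)
    region: str = "US"


def generate_candidates(query: str, signals: SessionSignals) -> list:
    """Stand-in for the fine-tuned SLM: spin SLDs from the query and the
    customer's recurring keywords, across the customer's preferred TLDs."""
    slds = [query] + [f"{query}{kw}" for kw in signals.recent_keywords]
    tlds = signals.preferred_tlds or [".com"]
    return [sld + tld for sld in slds for tld in tlds]


def filter_available(candidates: list, registered: set) -> list:
    """Stand-in for the vector-store availability and relevance pass."""
    return [d for d in candidates if d not in registered]


# A session where the customer keeps searching for "ikea" and "installs"
signals = SessionSignals(recent_keywords=["installs"],
                         preferred_tlds=[".com", ".io"])
candidates = generate_candidates("ikea", signals)
suggestions = filter_available(candidates, registered={"ikea.com"})
```

Here the already-registered `ikea.com` is dropped, while personalized spins such as `ikeainstalls.com` survive the availability pass.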
For the model selection, we chose an 8B SLM due to its optimal balance between computational efficiency, cost effectiveness, and capability for generating complex, contextually relevant text. The 8B parameter size provides sufficient model capacity to understand nuanced customer preferences and generate creative domain suggestions, while remaining computationally feasible for real-time inference at scale. The fine-tuning process, executed using Amazon SageMaker JumpStart, adapted the foundation model to understand domain-specific patterns, customer preferences, and the unique constraints of domain availability. This fine-tuned model enables PGen AI to generate suggestions that are not only linguistically coherent but also aligned with GoDaddy's inventory constraints and customer personalization needs.
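A back-of-envelope calculation illustrates why the 8B size is feasible for real-time serving. The numbers below are weight-only approximations (activations and KV cache are ignored):

```python
PARAMS = 8e9  # approximate parameter count of an 8B model


def weight_memory_gib(params: float, bytes_per_param: float) -> float:
    """Approximate weight-only memory footprint in GiB."""
    return params * bytes_per_param / 1024**3


fp16_gib = weight_memory_gib(PARAMS, 2)  # ~14.9 GiB in half precision
int8_gib = weight_memory_gib(PARAMS, 1)  # ~7.5 GiB quantized to INT8
```

At one byte per parameter, the INT8-quantized weights fit comfortably on a single mainstream inference GPU, whereas FP16 weights roughly double the footprint.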
Benefits of PGen AI
The following list describes the benefits of PGen AI:
- Improved relevancy and personalization: PGen AI leverages customer-specific data, such as favored TLDs, frequently searched keywords, and recent search behaviors, to deliver domain name suggestions that are highly relevant to individual users. This ensures that each recommendation is closely aligned with the customer's preferences and needs.
- Contextually relevant recommendations: PGen AI enriches open-source LLMs with GoDaddy's proprietary domain intelligence, enabling recommendations that reflect customer intent, market availability, and brand constraints—resulting in more actionable and personalized domain suggestions.
- Enhanced diversity of domain suggestions: By considering a wide range of TLDs and industry-specific keywords, PGen AI offers diverse domain name options. This increases the likelihood of finding available and appealing domains, even in highly competitive TLDs like .COM.
- Proactive and adaptive suggestions: PGen AI adapts to users with little or no historical data by employing real-time market trend analysis to generate relevant domain suggestions. This proactive approach ensures that even new users receive high-quality recommendations tailored to current market conditions.
Core components
The following sections describe the two core components of PGen AI: ReTiRe and a fine-tuned SLM.
ReTiRe system
ReTiRe is our real-time personalization engine that feeds machine learning models with continuously updated customer signals (intent, preferences, patterns) to deliver dynamic, personalized domain recommendations. Unlike traditional systems that update customer signals daily or monthly, ReTiRe closes the feedback loop within seconds, enabling recommendations to adapt to customer behavior in real time during the same search session.
The system collects and processes multiple signal types in real time:
- Search intent signals: Query text, search frequency, query refinement patterns, and frequently occurring keywords across searches (e.g., a customer repeatedly searching for "ikea" and "installs" indicates these terms are important)
- Interaction patterns: Positive signals such as add-to-cart actions and domain selections, as well as negative signals like shown-but-not-clicked results, captured within the same session
- Preference signals: TLD ownership, preferred second-level domain (SLD) length, spending power, and TLD selections inferred from user interactions
- Session context: Real-time analysis of customer search intent within a particular session, including SLD spin patterns (when customers keep changing SLDs), geographic location preferences, and device type
- Visitor profile: Detailed insights into customer search intent and shopper features, including past 7 days of profile history
- Cart details: Information about domains added to cart, including list prices, renewal prices, current prices, and inventory type, which provides direct feedback on customer intent
The technical implementation delivers personalized information to models within a couple of seconds after a customer signal is triggered, with our Lambda REST endpoint providing API responses in under 80ms. The infrastructure supports multi-region deployment across our four regions, ensuring low-latency access for customers worldwide. ReTiRe provides a suite of APIs that deliver customer context across multiple dimensions, from ownership patterns and spending behavior to real-time market trends and in-session intent analysis. For new users without historical data, the system accumulates session-level signals in real time, enabling PGen AI to provide relevant suggestions even for first-time visitors.
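For a first-time visitor, the session-level accumulation described above might look like the following sketch. The class and method names are illustrative, not ReTiRe's actual API; the fields mirror the signal types listed earlier (recurring keywords, TLD clicks, negative impressions).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SessionProfile:
    """Hypothetical in-session signal accumulator for a new visitor."""
    keyword_counts: Counter = field(default_factory=Counter)
    tld_clicks: Counter = field(default_factory=Counter)
    negative: set = field(default_factory=set)  # shown-but-not-clicked

    def record_search(self, query: str) -> None:
        """Count recurring keywords across the visitor's searches."""
        self.keyword_counts.update(query.lower().split())

    def record_click(self, domain: str) -> None:
        """Infer TLD preference from positive interactions."""
        tld = "." + domain.rsplit(".", 1)[-1]
        self.tld_clicks[tld] += 1

    def record_impression_ignored(self, domain: str) -> None:
        """Capture negative signals within the same session."""
        self.negative.add(domain)

    def top_keywords(self, n: int = 3) -> list:
        return [kw for kw, _ in self.keyword_counts.most_common(n)]


# The "ikea installs" example from the signal list above
profile = SessionProfile()
profile.record_search("ikea installs")
profile.record_search("ikea assembly installs")
profile.record_click("ikeainstalls.io")
profile.record_impression_ignored("ikea-shop.net")
```

After just a few interactions, the profile already surfaces "ikea" and "installs" as dominant keywords and `.io` as a preferred TLD, enough to personalize the very first recommendation batch.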
Fine-tuned SLM
For PGen AI, we fine-tuned a LLaMA-3 8B small language model using low-rank adaptation (LoRA) combined with INT8 quantization to efficiently specialize the model for domain recommendations. LoRA freezes the base model weights and introduces lightweight, trainable low-rank matrices into selected transformer components — primarily the query and value projection layers of self-attention — drastically reducing the number of trainable parameters while retaining model expressiveness.
To further optimize training efficiency, the base model was loaded using INT8 quantization, significantly reducing GPU memory footprint and bandwidth requirements without materially impacting model quality. This enabled fine-tuning large models on smaller, cost-effective GPU instances while improving training stability and throughput. Only the LoRA adapter weights were updated in higher precision, ensuring accurate gradient updates while keeping the quantized backbone fixed.
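Rough arithmetic shows how small the trainable footprint becomes when LoRA adapters are attached only to the query and value projections. The hidden size, grouped-query-attention value dimension, and layer count below match LLaMA-3 8B's published configuration; the rank of 16 is an illustrative assumption, since the rank we used is not stated here.

```python
HIDDEN = 4096   # LLaMA-3 8B hidden size
KV_DIM = 1024   # value projection output dim (8 KV heads x 128 head dim)
LAYERS = 32     # transformer layers in the 8B model
RANK = 16       # illustrative LoRA rank (an assumption, not our setting)


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """A rank-r adapter adds an (r x d_in) down-projection
    and a (d_out x r) up-projection per adapted linear layer."""
    return r * (d_in + d_out)


per_layer = (lora_params(HIDDEN, HIDDEN, RANK)    # q_proj adapter
             + lora_params(HIDDEN, KV_DIM, RANK))  # v_proj adapter
trainable = per_layer * LAYERS
fraction = trainable / 8e9  # share of the 8B backbone
```

At rank 16 this comes to roughly 6.8M trainable parameters, under 0.1% of the 8B backbone, which is why a frozen INT8 base plus higher-precision adapters trains comfortably on modest GPU instances.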
The model was trained on curated domain data, user query intent, and relevance signals, enabling fast iteration cycles, lower operational costs, and scalable deployment. The final system delivers low-latency inference and highly personalized, context-aware domain recommendations at production scale.
Conclusion
PGen AI represents a significant leap forward in personalized domain recommendations, combining fine-tuned language models with real-time customer signals to deliver highly relevant suggestions at scale. Through its sophisticated use of real-time data and proprietary technologies, PGen AI is not just reacting to the needs of the digital marketplace; it is anticipating them, ensuring that every customer interaction is as relevant and productive as possible.
Lessons learned
Building PGen AI taught us several valuable lessons. One of the most surprising discoveries was how effectively LoRA combined with INT8 quantization enabled us to fine-tune large models on cost-effective infrastructure without sacrificing quality. We initially expected more trade-offs between model performance and computational efficiency, but the combination proved remarkably effective for our use case.
The biggest challenge we faced was integrating real-time customer signals with our batch precomputation pipeline. Ensuring that the 400 million precomputed recommendations remained relevant while incorporating dynamic, session-level signals required careful architectural decisions. We learned that designing for flexibility from the start, allowing the system to seamlessly blend batch and real-time components, was crucial for maintaining both performance and personalization quality.
Looking ahead, we're excited to explore several directions. We're investigating how to further reduce latency while maintaining recommendation quality. Additionally, as new language models emerge, we're evaluating how they might improve our ability to understand nuanced customer intent and generate even more creative domain suggestions.
The journey of building PGen AI has reinforced our belief that combining open-source innovation with domain-specific expertise can create powerful solutions. We're grateful for the open-source community's contributions and look forward to continuing to push the boundaries of what's possible in AI-driven domain search.
Resources
For readers interested in learning more about the technologies and approaches discussed in this post, see the following resources:
- Amazon SageMaker JumpStart Documentation - Amazon's machine learning hub for fine-tuning and deploying foundation models
PGen AI builds upon several open source technologies that have been instrumental in its development. We're grateful to the open source community for the following foundational technologies that enable innovation in AI-driven domain search:
- LLaMA-3 Repository - The foundational language model powering PGen AI is LLaMA-3 8B by Meta. This open source model provided the base capabilities that we fine-tuned for domain-specific recommendations.
- BGE Transformer Repository - Our vector search implementation leverages BGE (BAAI General Embedding) Transformer embeddings by BAAI (Beijing Academy of Artificial Intelligence) for calculating semantic similarity between queries and domain names.
- Qdrant - We use Qdrant as our vector database to store and efficiently query the 400 million domain recommendations. Qdrant's high-performance vector search engine enables fast similarity searches at scale, making it ideal for our real-time inference requirements.