
Databricks Certified Generative AI Engineer Associate

The Databricks Certified Generative AI Engineer Associate is a specialized credential launched in 2024 to validate a professional's ability to design, build, and deploy large language model (LLM) applications on the Databricks Data Intelligence Platform. It specifically emphasizes building production-ready Retrieval-Augmented Generation (RAG) systems and LLM chains. 

 


---------- Question 1
An organization is developing a Generative AI application that summarizes customer feedback. To comply with privacy regulations and avoid legal risks, certain personally identifiable information (PII) must be excluded from the summaries. The application uses source documents that are known to contain PII. Which of the following is the most effective guardrail technique to meet performance objectives and protect against the leakage of sensitive information?
  1. Implement a simple word replacement filter for common PII terms before feeding documents to the LLM.
  2. Use a sophisticated entity recognition model to identify PII, followed by masking or anonymization of detected PII in the source documents before they are processed by the LLM.
  3. Instruct the LLM within the prompt to ignore any PII it might encounter in the source documents.
  4. Avoid processing any documents that are known to contain PII altogether.
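The detect-then-mask approach of option 2 can be sketched in a few lines. A production system would use a trained entity recognition model (e.g. an NER pipeline) for detection; the regexes below are a simplified stand-in purely for illustration, and the pattern names are made up:

```python
import re

# Illustrative stand-ins for an entity recognition model's PII detectors.
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask_pii(text: str) -> str:
    """Replace each detected PII span with a typed placeholder
    before the text is passed to the LLM."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

masked = mask_pii("Contact jane.doe@example.com, SSN 123-45-6789.")
```

The key property is that masking happens in the source documents, upstream of the LLM, so sensitive values can never appear in a summary regardless of how the model behaves.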

---------- Question 2
You are preparing a large collection of technical documentation, including user manuals, API references, and troubleshooting guides, for a RAG application that needs to answer highly specific questions about product configuration and error resolution by developers. The documents vary greatly in structure, with some having dense paragraphs of code, others with bulleted lists of parameters, and some with lengthy procedural descriptions. What chunking strategy would be most effective to maximize the accuracy and relevance of retrieved information for this type of RAG application, considering both document structure and model constraints?
  1. Fixed-size character chunking with an overlap of 50 characters, to ensure that potentially important keywords are not split across chunks and are always present within a retrieval context.
  2. Recursive character text splitting, using common separators like newlines and spaces, to break down documents into smaller, semantically coherent pieces that respect the natural structure of paragraphs and sections.
  3. Content-aware chunking, potentially using markdown or HTML parsers to identify semantic boundaries like headings, subheadings, code blocks, and list items, creating chunks that represent logical units of information.
  4. Paragraph-based chunking, assuming each paragraph contains a complete thought or instruction, and setting a large overlap to ensure context continuity between adjacent paragraphs.
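Content-aware chunking (option 3) keys off structural markers rather than raw character counts. A minimal sketch for markdown-style documentation, splitting at heading boundaries so each chunk is a logical section; a fuller implementation would also treat code blocks and list items as atomic units:

```python
import re

def chunk_markdown(doc: str) -> list[str]:
    """Split a markdown document at heading boundaries so each
    chunk is a logical unit: a heading plus its body."""
    chunks: list[str] = []
    current: list[str] = []
    for line in doc.splitlines():
        # Start a new chunk whenever a heading begins and we
        # already have accumulated content.
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

chunks = chunk_markdown(
    "# Setup\nInstall the package.\n\n## Configuration\nEdit the config file."
)
```

Because each chunk maps to a section the author delimited, retrieved context tends to be self-contained, which matters for highly specific configuration and error-resolution questions.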

---------- Question 3
You have deployed a RAG application that answers technical questions using internal documentation. The current performance is adequate, but you want to optimize costs by ensuring the LLM is not over-utilized or making unnecessary complex calls. While quantitative evaluation metrics like ROUGE or BLEU are useful for assessing the quality of summaries, what are the most appropriate Databricks features to specifically monitor to control LLM costs and assess deployed RAG application performance in a live environment?
  1. MLflow experiment tracking for hyperparameter tuning and logging evaluation judges.
  2. Inference logging with detailed prompts, responses, and token counts, coupled with inference tables for tracking LLM endpoint usage and latency.
  3. Delta Live Tables for batch processing of documentation updates.
  4. Unity Catalog's audit logs for data access patterns.
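Option 2's value is that logged prompts, responses, and token counts can be aggregated to track spend. Databricks inference tables capture endpoint requests and responses; the sketch below shows the kind of aggregation they enable, using illustrative field names (real inference tables have their own schema) over an in-memory list of log records:

```python
def total_cost(logs: list[dict], price_per_1k_tokens: float) -> float:
    """Estimate LLM spend from inference logs by summing token counts.
    Field names are illustrative, not the actual inference-table schema."""
    tokens = sum(r["prompt_tokens"] + r["completion_tokens"] for r in logs)
    return tokens / 1000 * price_per_1k_tokens

logs = [
    {"prompt_tokens": 300, "completion_tokens": 200},
    {"prompt_tokens": 500, "completion_tokens": 0},
]
cost = total_cost(logs, price_per_1k_tokens=0.5)
```

The same logs support latency percentiles and per-user usage breakdowns, which is exactly the monitoring the question asks for.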

---------- Question 4
A company is developing an internal knowledge base AI that uses RAG to answer employee questions about HR policies. The AI needs to not only retrieve information from documents but also perform simple actions like calculating vacation days based on an employee's start date and policy rules. For a complex query about vacation accrual for a specific employee, the AI needs to first extract relevant policy details and the employee's start date from the context, then use a custom Python tool to perform the calculation, and finally present the result in a user-friendly sentence. How should you define the sequence of tools for this multi-stage reasoning process to ensure the correct execution and output?
  1. Define a single tool that attempts to perform both information retrieval and the calculation. This simplifies the pipeline by consolidating all logic into one component.
  2. Define a sequence of tools where the first tool retrieves the necessary policy documents and employee start date, followed by a calculation tool that takes these as inputs, and potentially a final tool to format the output. This allows for modularity and specific tool responsibilities.
  3. First, provide the entire HR policy document to the LLM and ask it to answer the question, assuming it can infer the start date and perform calculations. No explicit tools are defined.
  4. The retrieval tool should be designed to directly query a database for the employee's vacation days. The LLM's role is solely to formulate the query and present the retrieved data, without any intermediate calculation steps.
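The modular tool sequence in option 2 can be sketched as three single-purpose callables composed in order. All data here is fabricated for illustration (the retrieval step is a stub, and the accrual rule is invented), and the "today" date is fixed so the example is deterministic:

```python
from datetime import date

def retrieve(employee_id: str) -> dict:
    """Knowledge-gathering tool (stub): would pull the employee's
    start date and the relevant policy rule from the RAG context."""
    return {"start_date": date(2020, 6, 1), "days_per_year": 20}

def calculate(context: dict, today: date = date(2024, 6, 1)) -> int:
    """Calculation tool: applies the (invented) accrual rule."""
    years_of_service = (today - context["start_date"]).days // 365
    return years_of_service * context["days_per_year"]

def present(days: int) -> str:
    """Formatting tool: turns the number into a user-friendly sentence."""
    return f"You have accrued {days} vacation days."

answer = present(calculate(retrieve("emp-42")))
```

Keeping retrieval, arithmetic, and presentation in separate tools means each stage can be tested and swapped independently, and the LLM never has to do arithmetic it is unreliable at.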

---------- Question 5
A financial services company wants to build a Generative AI application to automatically draft initial responses to customer inquiries related to account services. The primary requirement is that the drafted responses must be clear, concise, adhere to strict company branding guidelines, and always include a link to the relevant self-service portal article. Which design approach best addresses these requirements for the initial prompt engineering phase?
  1. Develop a broad, open-ended prompt asking the LLM to 'assist customers with account inquiries'.
  2. Craft a prompt that explicitly defines the desired output format (e.g., bullet points for key information), specifies the need for a specific tone and length, and instructs the LLM to always append a placeholder for the self-service link, to be filled programmatically.
  3. Focus on prompt injection techniques to ensure the LLM prioritizes internal knowledge bases over external information sources.
  4. Utilize a prompt that requests a comprehensive summary of all possible account services without specifying any output constraints or branding requirements.
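Option 2 combines a constrained prompt with a programmatic post-processing step for the link. A minimal sketch, with the template wording, company name, and placeholder token all invented for illustration:

```python
# Hypothetical prompt template: explicit format, tone, length, and a
# literal placeholder the application fills in after generation.
PROMPT_TEMPLATE = """You are a customer support assistant for AcmeBank.
Draft a reply to the inquiry below.
- Use at most 3 short bullet points for the key information.
- Keep a professional, friendly tone; stay under 120 words.
- End with exactly this line: "More details: {{PORTAL_LINK}}"

Inquiry: {inquiry}"""

def build_prompt(inquiry: str) -> str:
    return PROMPT_TEMPLATE.format(inquiry=inquiry)

def insert_link(draft: str, url: str) -> str:
    """Fill the placeholder programmatically so the link is always
    correct, rather than trusting the LLM to reproduce a URL."""
    return draft.replace("{PORTAL_LINK}", url)

prompt = build_prompt("How do I reset my PIN?")
final = insert_link("Sure. More details: {PORTAL_LINK}",
                    "https://example.com/help/pin-reset")
```

Having the application, not the model, substitute the URL eliminates the risk of the LLM hallucinating or mangling the link.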

---------- Question 6
A startup is deploying a customer support chatbot built with an LLM to answer common user queries. After deployment, they observe that while the chatbot is generally responsive, answer quality varies: responses are sometimes too generic or miss crucial details. Which key metrics should they monitor to effectively assess the RAG model's performance and identify areas for improvement, particularly retrieval and generation accuracy?
  1. Monitor only API latency and the number of user sessions, as these directly reflect system usage and responsiveness.
  2. Track metrics like 'Answer Relevance' and 'Factual Accuracy' using a combination of automated evaluation judges (e.g., another LLM) paired with ground truth data, and 'Retrieval Precision' and 'Recall' for the RAG component.
  3. Focus on the number of generated tokens and the model's confidence scores, as higher values indicate better performance.
  4. Monitor the cost per query and the diversity of topics covered by the chatbot, assuming a broad coverage implies good performance.
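Retrieval precision and recall from option 2 are simple set computations over retrieved document IDs versus a ground-truth relevant set. A minimal sketch:

```python
def precision_recall(retrieved: list[str],
                     relevant: list[str]) -> tuple[float, float]:
    """Precision: fraction of retrieved docs that are relevant.
    Recall: fraction of relevant docs that were retrieved."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

p, r = precision_recall(["doc1", "doc2", "doc3"], ["doc1", "doc4"])
```

Low recall points to retrieval misses (generic answers because the right context never arrived); low answer relevance with high retrieval scores points to a generation problem, which is why the question pairs both families of metrics.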

---------- Question 7
A company is using a RAG application to generate marketing copy based on product specifications. They have experimented with several LLM models from a model hub, each with different strengths and weaknesses, and have collected quantitative evaluation metrics from an MLflow experiment for tasks like relevance, coherence, and creativity. Given the goal of deploying the most performant model for this specific task, which of the following actions directly supports making an informed deployment decision based on these metrics?
  1. Select the LLM that has the largest number of parameters, as this generally indicates better performance.
  2. Review the MLflow experiment results and select the LLM that consistently scores highest across the defined quantitative evaluation metrics for relevance, coherence, and creativity.
  3. Deploy the LLM that was used in the majority of recent research papers, assuming it is the current state-of-the-art.
  4. Choose the LLM that has the fastest inference speed, as this is the most important performance indicator.
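In practice, option 2 means querying the MLflow experiment (e.g. via `mlflow.search_runs`) and comparing models on the logged metrics. The selection step itself, sketched here over an already-extracted dict of per-model scores with invented names and values:

```python
def best_model(results: dict[str, dict[str, float]]) -> str:
    """Pick the model with the highest mean score across the
    defined evaluation metrics. A real comparison might weight
    metrics differently; the unweighted mean is for illustration."""
    def mean_score(model: str) -> float:
        scores = results[model]
        return sum(scores.values()) / len(scores)
    return max(results, key=mean_score)

results = {
    "model_a": {"relevance": 0.80, "coherence": 0.70, "creativity": 0.60},
    "model_b": {"relevance": 0.90, "coherence": 0.80, "creativity": 0.70},
}
winner = best_model(results)
```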

---------- Question 8
A financial services company wants to build a Generative AI application that analyzes customer feedback. The application needs to identify the sentiment (positive, negative, neutral) and categorize the feedback into predefined themes such as 'product features', 'customer support', or 'billing issues'. The desired output format for each feedback item is a JSON object containing two keys: 'sentiment' and 'category'. Which of the following prompt design strategies is most effective for eliciting this specific, structured JSON output from an LLM?
  1. Provide a few example prompt-response pairs where the response is a JSON object with the specified keys.
  2. Instruct the LLM to analyze the feedback and then separately instruct it to format the output as a JSON object with 'sentiment' and 'category' keys.
  3. Use a simple, open-ended prompt asking the LLM to 'analyze the customer feedback and provide insights'.
  4. Provide a detailed narrative description of what each sentiment and category means, and then ask the LLM to infer them.
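The few-shot approach of option 1 works by showing the model literal prompt-response pairs in the target JSON shape. A minimal sketch of assembling such a prompt, with the example feedback texts and labels invented for illustration:

```python
import json

# Hypothetical few-shot examples demonstrating the exact output schema.
EXAMPLES = [
    ("The new dashboard is fantastic!",
     {"sentiment": "positive", "category": "product features"}),
    ("I was double charged this month.",
     {"sentiment": "negative", "category": "billing issues"}),
]

def build_prompt(feedback: str) -> str:
    lines = [
        "Classify the feedback. Respond only with a JSON object "
        "containing the keys 'sentiment' and 'category'.",
        "",
    ]
    for text, label in EXAMPLES:
        lines.append(f"Feedback: {text}")
        lines.append(f"Response: {json.dumps(label)}")
    lines.append(f"Feedback: {feedback}")
    lines.append("Response:")
    return "\n".join(lines)

prompt = build_prompt("Support never replied to my ticket.")
```

Demonstrating the format is generally more reliable than describing it, because the model continues the pattern it has just seen.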

---------- Question 9
You are preparing a large corpus of legal documents for a RAG application aiming to answer questions about contract law. These documents are complex, with nested clauses, definitions, and appendices. To optimize retrieval accuracy and prevent LLM context window issues, which chunking strategy would be most appropriate, considering both document structure and potential model constraints?
  1. Fixed-size character-based chunking, disregarding document structure.
  2. Sentence-based chunking while ensuring that semantic units like clauses and definitions are not arbitrarily split across chunks.
  3. Paragraph-based chunking, which might sometimes split related sentences from different clauses.
  4. Document-based chunking, returning the entire document as a single chunk, regardless of its size.
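Option 2 can be sketched as packing whole sentences into chunks up to a size budget, so no sentence (and hence no clause or definition expressed in one) is split mid-stream. The sentence splitter below is a naive regex stand-in; legal text would need a more careful segmenter that understands clause numbering and abbreviations:

```python
import re

def sentence_chunks(text: str, max_chars: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks of at most
    max_chars, never splitting a sentence across chunks."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks

chunks = sentence_chunks("A a. B b. C c.", max_chars=10)
```

Keeping semantic units intact means a retrieved chunk carries a complete clause or definition, while the size budget keeps chunks safely inside the LLM's context window.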

---------- Question 10
Your team is tasked with developing a Generative AI solution to analyze financial news articles and extract key information such as company names, reported earnings, and market sentiment. The business requires that the AI pipeline first identify relevant articles, then extract specific data points, and finally categorize the sentiment for each identified company. Which of the following sequences of chain components and tools best represents a multi-stage reasoning process to achieve these business goals?
  1. A knowledge gathering tool to search for financial news, followed by an action tool to extract company names, and then another action tool to categorize sentiment.
  2. An LLM that takes the news article as input, a tool to identify relevant companies, a tool to extract earnings, and a tool to determine market sentiment, all orchestrated in a sequential chain.
  3. A tool to filter articles based on keywords, an LLM to extract entities and sentiment from the filtered articles, and a final tool to aggregate the extracted information.
  4. A knowledge gathering tool to collect financial news, an LLM to process the gathered news and output structured data, and a final tool to refine the LLM's output based on predefined rules.
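A sequential chain of the kind option 2 describes threads each stage's output into the next. A bare-bones sketch with stub stages (the company name, earnings figure, and sentiment are fabricated; real stages would call an LLM or extraction tool):

```python
from typing import Any, Callable

def run_chain(article: str, stages: list[Callable[[Any], Any]]) -> Any:
    """Orchestrate stages sequentially: each stage consumes the
    previous stage's output."""
    result: Any = article
    for stage in stages:
        result = stage(result)
    return result

# Stub stages standing in for LLM-backed tools.
def identify_companies(text: str) -> dict:
    return {"text": text, "companies": ["AcmeCorp"]}

def extract_earnings(state: dict) -> dict:
    return {**state, "earnings": {"AcmeCorp": "1.2B"}}

def categorize_sentiment(state: dict) -> dict:
    return {**state, "sentiment": {"AcmeCorp": "positive"}}

output = run_chain(
    "AcmeCorp reported strong quarterly earnings.",
    [identify_companies, extract_earnings, categorize_sentiment],
)
```

Because each stage only depends on the accumulated state, stages can be reordered, replaced, or tested in isolation, which is the point of modeling the pipeline as an orchestrated chain.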


