Saturday, July 27, 2024

OCI Generative AI Certification

*Update: unfortunately I had little time to study, less than a week, and I was not able to work through the hands-on part of the course properly. I left the exam scheduling to the last day and there were no more slots, so I sent an email and waited to see whether it would be possible to take the exam on another day (the final date, July 31, had no slots left), and they allowed me to take it a few days later. I thought I had mastered the theoretical content well enough; unfortunately I was wrong, and I will post the result at the end of this article...


1. Course Overview

2. Fundamentals of LLMs (Large Language Models)

An LM (language model) is a probabilistic model of text. Topics covered: LLM architectures, prompting and training, decoding.

Model architectures:

Encoder: models that convert a sequence of words into an embedding (vector representation). Main uses: embedding tokens, sentences, and documents.

Decoder: models that take a sequence of words and produce the next word. Main uses: text generation, chat models (including question answering, etc.).

Encoder-Decoder: encodes a sequence of words and uses that encoding, together with the words generated so far, to produce the next word.

Prompting: the simplest way to affect the distribution over the vocabulary is to change the prompt.

Prompt: the text provided to an LLM as input, sometimes containing instructions and/or examples.

Prompt Engineering: the process of iteratively refining a prompt with the goal of eliciting a particular style of response.

Prompt engineering is challenging, often unintuitive, and not guaranteed to work. At the same time, it can be effective: several proven prompt-design strategies exist.

In-Context Learning:
Conditioning (prompting) an LLM with instructions or demonstrations of the task it is meant to complete.

k-shot Prompting: explicitly providing k examples of the intended task in the prompt.

Few-shot prompting is widely believed to improve results over zero-shot prompting.

Advanced Prompting Strategies: Chain-of-Thought: prompt the LLM to emit intermediate reasoning steps (a short prompt sketch follows these strategies).

Least-to-Most: prompt the LLM to decompose the problem and solve the subproblems in order, starting with the easiest.

Step-Back: prompt the LLM to identify high-level concepts relevant to the specific task.
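A minimal sketch of what few-shot plus chain-of-thought prompting can look like in practice; the prompt text and worked examples below are illustrative, not from the course:

examples = [
    ("The cafe sold 12 coffees at $3 each. Revenue?", "12 * 3 = 36. Answer: $36."),
    ("A box holds 4 rows of 5 apples. How many apples?", "4 * 5 = 20. Answer: 20."),
]
question = "A train travels 60 km/h for 2.5 hours. How far does it go?"
prompt = "Answer the question. Think step by step.\n\n"   # chain-of-thought instruction
for q, a in examples:                                      # k-shot: include k worked examples
    prompt += f"Q: {q}\nA: {a}\n\n"
prompt += f"Q: {question}\nA:"                             # the model continues from here
print(prompt)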

Problems with Prompts

Prompt Injection (Jailbreaking): deliberately providing an LLM with input that tries to make it ignore instructions, cause harm, or behave contrary to deployment expectations.

Prompt injection is a concern whenever an external entity is able to contribute to the prompt.

Training: prompting alone may be inadequate when training data exists or domain adaptation is required.

Domain Adaptation: adapting the model (typically via training) to improve its performance outside of the domain/subject area it was trained on.

Decoding: decoding is the process of generating text with an LLM.

  • Decoding happens iteratively, one word at a time.
  • At each decoding step, we use the distribution over the vocabulary and select one word to emit.
  • The word is appended to the input and the decoding process continues.

Greedy Decoding: pick the word with the highest probability at each step.

Non-Deterministic Decoding: randomly select among high-probability candidates at each step.

When decoding, temperature is a (hyper)parameter that modulates the distribution over the vocabulary (see the sketch after the bullets below).

* When the temperature is decreased, the distribution becomes more peaked around the most likely word.

* When the temperature is increased, the distribution is flattened over all words.

* With sampling enabled, increasing the temperature makes the model deviate more from greedy decoding.

* The relative ordering of the words is unaffected by temperature.
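A minimal numpy sketch of temperature-scaled softmax, illustrating the bullets above; the logits are made up for illustration:

import numpy as np

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before the softmax: low T sharpens, high T flattens.
    z = np.array(logits, dtype=float) / temperature
    z = z - z.max()                                  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [4.0, 2.0, 1.0]                             # made-up scores for three candidate words
for t in (0.5, 1.0, 2.0):
    print(t, softmax_with_temperature(logits, t).round(3))
# The most likely word stays the most likely at every temperature (relative order preserved),
# but its share of the probability mass shrinks as the temperature rises.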

Hallucination: generated text that is non-factual and/or unsupported.

  • Some methods claim to reduce hallucination (for example, retrieval augmentation).
  • There is no known methodology to reliably prevent LLMs from hallucinating.

Groundedness: generated text is grounded in a document if the document supports the text.

  • The research community has embraced attribution/grounding.
  • Attributed QA: the system must produce a document that grounds its answer.
  • The TRUE model measures groundedness via NLI (Natural Language Inference).
  • Train an LLM to produce sentences with citations.

LLM Applications

RAG (Retrieval Augmented Generation): used mainly in QA, where the model has access to (retrieved) supporting documents for a query (a minimal sketch of the pattern follows this list).
* Claimed to reduce hallucination.
* Multi-document QA via sophisticated decoding, e.g., RAG-tok.
* The idea has gained a lot of traction.
-> Used in dialogue, QA, fact checking, slot filling, entity linking.
-> Non-parametric; in theory, the same model can answer questions about any corpus.
-> Can be trained end to end.
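A minimal, self-contained sketch of the RAG pattern, using word overlap as a stand-in for a real embedding-based retriever; the documents and names below are illustrative:

documents = [
    "OCI Generative AI offers dedicated AI clusters for fine-tuning and inference.",
    "LangChain is a Python library for building applications with LLMs.",
    "Cosine similarity compares the direction of two vectors.",
]

def retrieve(query, docs, k=1):
    # Stand-in retriever: score documents by word overlap with the query.
    q_words = set(query.lower().replace("?", "").split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())), reverse=True)
    return ranked[:k]

query = "What is LangChain?"
context = "\n".join(retrieve(query, documents))
prompt = f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"
print(prompt)   # this augmented prompt is what would be sent to the LLM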

Code Models: instead of training on written language, train on code and comments.
* Copilot, Codex, Code Llama.
* Complete partially written functions, synthesize programs from docstrings, debugging.
* Largely successful: >85% of people who use Copilot feel more productive.
* Great fit between the training data (code + comments) and the test-time tasks (writing code + comments). In addition, code is structured -> easier to learn.

They differ from general LLMs, which are trained on a wide variety of internet text and used for many purposes (beyond generating internet text); code models have (arguably) a narrower scope.

Multimodal: models trained on multiple modalities, e.g., language and images.
* The models can be autoregressive, e.g., DALL-E, or diffusion-based, e.g., Stable Diffusion.
* Diffusion models can produce a complex output all at once, rather than token by token.
-> Hard to apply to text because text is categorical.
-> Some attempts have been made; not very popular yet.
* These models can perform image-to-text and text-to-image tasks (or both), video generation, and audio generation.
* Recent retrieval-augmentation extensions.

Language Agents: an emerging research area in which LLM-based agents:
* Create plans and "reason".
* Take actions in response to plans and the environment.
* Are capable of using tools.
Some notable work in this area (a minimal sketch of the loop follows):
* ReAct: iterative framework where the LLM emits thoughts, then acts and observes the result.
* Toolformer: pretraining technique in which strings are replaced with calls to tools that yield results.
* Bootstrapped reasoning: prompt the LLM to emit rationalizations of intermediate steps; use them as fine-tuning data.
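A minimal sketch of a ReAct-style think/act/observe loop; the "LLM" here is a stub and the calculator is a toy tool, both hypothetical stand-ins for a real model and real tools:

def fake_llm(transcript):
    # Stub model: first it asks for the calculator tool, then it answers.
    if "Observation:" not in transcript:
        return "Thought: I should compute this.\nAction: calculator[6*7]"
    return "Thought: I have the result.\nFinal Answer: 42"

def calculator(expression):
    return str(eval(expression, {"__builtins__": {}}))    # toy tool; never use eval on untrusted input

transcript = "Question: What is 6*7?"
for _ in range(3):                                         # cap the number of think/act cycles
    step = fake_llm(transcript)
    transcript += "\n" + step
    if "Final Answer:" in step:
        break
    tool_input = step.split("calculator[")[1].rstrip("]")  # parse the tool call emitted by the model
    transcript += "\nObservation: " + calculator(tool_input)  # feed the result back to the model
print(transcript)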











Skill Check: Fundamentals of Large Language Models 01
Answer Test Questions

1. What is the role of temperature in the decoding process of a Large Language Model (LLM)?
  • To determine the number of words to generate in a single decoding step
  • To decide to which part of speech the next word should belong
  • (x) To adjust the sharpness of probability distribution over vocabulary when selecting the next word 
  • To increase the accuracy of the most likely word in the vocabulary
Explanation: When decoding with an LLM, the model assigns probabilities to each word in the vocabulary for the next word in the sequence. Temperature controls the sharpness or smoothness of this probability distribution. A low temperature value results in a sharper distribution, which means that the model is more confident in its predictions and tends to select the most likely word with higher probability. Conversely, a higher temperature value smooths out the distribution, making it more likely for lower probability words to be chosen, leading to more diverse and varied output.

2. What does in-context learning in Large Language Models involve?
  • Pretraining the model on a specific domain
  • Adding more layers to the model
  • Training the model using reinforcement learning
  • (x) Conditioning the model with task-specific instructions or demonstrations
Explanation: In-context learning in Large Language Models (LLMs) involves conditioning a pretrained model with task-specific instructions or demonstrations supplied directly in the prompt, without updating the model's weights. This allows the model to adapt its behavior to the requirements of a specific task or application at inference time.

3. Which statement accurately reflects the differences between these approaches in terms of the number of parameters modified and the type of data used?
  • Soft prompting and continuous pretraining are both methods that require no modification to the original parameters of the model.
  • Fine-tuning and continuous pretraining both modify all parameters and use labeled, task-specific data.
  • Parameter Efficient Fine Tuning and Soft prompting modify all parameters of the model using unlabeled data.
  • (x) Fine-tuning modifies all parameters using labeled, task-specific data, whereas Parameter Efficient Fine-Tuning updates a few, new parameters also with labeled, task-specific data.
Explanation: Fine-tuning involves adjusting all parameters of the pretrained model using labeled, task-specific data. This means that the entire model architecture is modified based on the new task or domain-specific data. Parameter Efficient Fine-Tuning updates only a subset of parameters within the pre-trained model, typically focusing on specific layers or components that are relevant to the new task. Despite this, both Fine-tuning and Parameter Efficient Fine-Tuning utilize labeled, task-specific data for training.

4. What does the term "hallucination" refer to in the context of Large Language Models (LLMs)?
  • A technique used to enhance the model's performance on specific tasks
  • (x) The phenomenon where the model generates factually incorrect information or unrelated content as if it were true
  • The process by which the model visualizes and describes images in detail
  • The model's ability to generate imaginative and creative content
Explanation: Hallucination occurs when the model generates text that seems plausible or coherent but is not grounded in factual reality or relevant to the task at hand. Hallucination can be problematic, especially in applications where generating accurate and reliable information is crucial, such as question answering, summarization, or content generation for decision-making.

5. What is prompt engineering in the context of Large Language Models (LLMs)?
  • (x) Iteratively refining the ask to elicit a desired response
  • Adjusting the hyperparameters of the model
  • Training the model on a large data set
  • Adding more layers to the neural network
Explanation: Prompt engineering in the context of Large Language Models (LLMs) refers to the practice of designing and refining prompts or input instructions to elicit desired responses from the model. It involves crafting specific textual cues or queries that guide the model towards generating outputs that align with the user's intentions or requirements.


Skill Check: OCI Generative AI Service Deep Dive

1.What is the purpose of embeddings in natural language processing?
  • (x)To create numerical representations of text that capture the meaning and relationships between words or phrases
  • To translate text into a different language
  • To compress text data into smaller files for storage
  • To increase the complexity and size of text data
Explanation: Embeddings map words or text onto a continuous vector space where similar words are located close to each other. This allows NLP models to capture semantic relationships between words, such as synonyms or related concepts. For example, in a well-trained embedding space, the vectors for "king" and "queen" would be closer to each other than to unrelated words like "car" or "tree." Embeddings also provide a dense, low-dimensional representation of words compared to traditional one-hot encodings. This makes them more efficient and effective as input features for machine learning models, reducing the dimensionality of the input space, and improving computational efficiency.


2.What is the purpose of frequency penalties in language model outputs?
  • To randomly penalize some tokens to increase the diversity of the text
  • (x) To penalize tokens that have already appeared, based on the number of times they have been used
  • To reward the tokens that have never appeared in the text
  • To ensure that tokens that appear frequently are used more often
Explanation: Frequency penalties in language model outputs aim to discourage the repetition of tokens that have already appeared in the generated text. When generating text, language models may tend to produce repetitive phrases or words, which can lead to less diverse and less interesting outputs. By applying frequency penalties, tokens that have been used multiple times are penalized, reducing the likelihood of their repetition in subsequent generations.
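A minimal sketch of how a frequency penalty can be applied to next-token scores; the formula and values are illustrative, not the exact OCI implementation:

from collections import Counter

def apply_frequency_penalty(logits, generated_tokens, penalty=0.5):
    # Subtract penalty * (number of times the token has already appeared) from its score.
    counts = Counter(generated_tokens)
    return {tok: score - penalty * counts.get(tok, 0) for tok, score in logits.items()}

logits = {"the": 2.0, "cat": 1.5, "sat": 1.0}              # made-up next-token scores
print(apply_frequency_penalty(logits, ["the", "cat", "the"]))
# "the" is penalized twice as heavily as "cat" because it has already appeared twice.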


3.What is the main advantage of using few-shot model prompting to customize a Large Language Model (LLM)?
  • (x) It provides examples in the prompt to guide the LLM to better performance with no training cost.
  • It eliminates the need for any training or computational resources.
  • It allows the LLM to access a larger data set.
  • It significantly reduces the latency for each model request.
Explanation: The main advantage of using few-shot model prompting to customize a Large Language Model (LLM) is its ability to adapt the model quickly and effectively to new tasks or domains with only a small amount of training data. Instead of retraining the entire model from scratch, which can be time-consuming and resource-intensive, few-shot prompting leverages the model's pre-existing knowledge.


4.What happens if a period (.) is used as a stop sequence in text generation?
  • The model generates additional sentences to complete the paragraph.
  • The model stops generating text after it reaches the end of the current paragraph.
  • (x) The model stops generating text after it reaches the end of the first sentence, even if the token limit is much higher.
  • The model ignores periods and continues generating text until it reaches the token limit.
Explanation: Stop sequences, in the context of text generation, are special tokens or symbols used to signal the end of the generated text. These sequences serve as markers for the model to halt its generation process. Common stop sequences include punctuation marks such as periods (.), question marks (?), and exclamation marks (!), because they typically denote the end of sentences in natural language.
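A minimal sketch of the effect of a stop sequence, shown here as client-side truncation; in practice the service halts generation once the stop sequence is produced:

def truncate_at_stop(text, stop="."):
    # Keep everything up to and including the first occurrence of the stop sequence.
    idx = text.find(stop)
    return text if idx == -1 else text[: idx + len(stop)]

print(truncate_at_stop("First sentence. Second sentence. Third sentence."))
# -> "First sentence."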


5.Which is a distinctive feature of GPUs in Dedicated AI Clusters used for generative AI tasks?
  • (x) The GPUs allocated for a customer’s generative AI tasks are isolated from other GPUs.
  • Each customer's GPUs are connected via a public Internet network for ease of access.
  • GPUs are shared with other customers to maximize resource utilization.
  • GPUs are used exclusively for storing large data sets, not for computation.
Explanation: The GPUs allocated for a customer’s generative AI tasks are isolated from other GPUs to maintain the security and privacy of the customer data and workloads.

Skill Check: Building Blocks for an LLM Application

1.What do embeddings in Large Language Models (LLMs) represent?
  • The grammatical structure of sentences in the data
  • (x) The semantic content of data in high-dimensional vectors
  • The color and size of the font in textual data
  • The frequency of each word or pixel in the data
Explanation: Embeddings map words or text onto a continuous vector space where similar words are located close to each other. This allows NLP models to capture semantic relationships between words, such as synonyms or related concepts. For example, in a well-trained embedding space, the vectors for "king" and "queen" would be closer to each other than to unrelated words such as "car" or "tree."

2.What differentiates Semantic search from traditional keyword search?
  • (x) It involves understanding the intent and context of the search.
  • It is based on the date and author of the content.
  • It relies solely on matching exact keywords in the content.
  • It depends on the number of times keywords appear in the content.
Explanation: Semantic search differs from traditional keyword search in that it involves understanding the intent and context of the search query, rather than relying solely on matching exact keywords in the content.

3.Which is a key characteristic of Large Language Models (LLMs) without Retrieval Augmented Generation (RAG)?
  • They cannot generate responses without fine-tuning.
  • They always use an external database for generating responses.
  • (x) They rely on internal knowledge learned during pretraining on a large text corpus.
  • They use vector databases exclusively to produce answers.
Explanation: Large Language Models (LLMs) without Retrieval Augmented Generation (RAG) primarily rely on internal knowledge learned during pretraining on a large text corpus. These models are trained on vast amounts of text data, which enables them to learn complex patterns, structures, and relationships within language.

4.What does the Ranker do in a text generation system?
  • It interacts with the user to understand the query better.
  • It generates the final text based on the user's query.
  • (x) It evaluates and prioritizes the information retrieved by the Retriever.
  • It sources information from databases to use in text generation.
Explanation: The Ranker in a text generation system evaluates and prioritizes the information retrieved by the Retriever. After the Retriever sources relevant information from a large corpus or database, the Ranker assesses the retrieved information to determine its relevance, quality, and suitability for the specific task or context. The Ranker may use various criteria and algorithms to evaluate the retrieved information, such as relevance to the user's query, credibility of the source, recency of the information, and other contextual factors.

5.What is the function of the Generator in a text generation system?
  • To collect user queries and convert them into database search terms
  • To store the generated responses for future use
  • (x) To generate human-like text using the information retrieved and ranked, along with the user's original query
  • To rank the information based on its relevance to the user's query
Explanation: The Generator in a text generation system is responsible for producing human-like text based on the information retrieved and ranked by the system, along with the user's original query or input. After the relevant information has been sourced from external sources by the Retriever and evaluated by the Ranker, the Generator processes this information along with the user's query to generate coherent and contextually appropriate text responses.

Skill Check: Build an LLM Application using OCI Generative AI Service

1.What is LCEL in the context of LangChain Chains?
  • An older Python library for building Large Language Models
  • A programming language used to write documentation for LangChain
  • (x) A declarative way to compose chains together using LangChain Expression Language
  • A legacy method for creating chains in LangChain
Explanation: LCEL, or LangChain Expression Language, is a declarative way to compose chains together within the LangChain framework. It provides a structured and expressive syntax for defining the composition of chains, specifying the sequence of components and their interactions in a clear and concise manner.
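A minimal LCEL-style sketch, assuming a recent langchain-core install; a RunnableLambda stands in for a real chat model so the example runs without credentials:

from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnableLambda
from langchain_core.output_parsers import StrOutputParser

prompt = PromptTemplate.from_template("Summarize in one line: {text}")
fake_model = RunnableLambda(lambda p: "A one-line summary of: " + p.to_string())  # stand-in for an LLM
chain = prompt | fake_model | StrOutputParser()   # LCEL: components composed with the pipe operator

print(chain.invoke({"text": "LCEL composes prompts, models, and parsers declaratively."}))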

2.How are prompt templates typically designed for language models?
  • As complex algorithms that require manual compilation
  • (x) As predefined recipes that guide the generation of language model prompts
  • To be used without any modification or customization
  • To work only with numerical data instead of textual content
Explanation: Prompt templates for language models are typically designed as predefined recipes that guide the generation of prompts. By using predefined templates, developers can ensure consistency and coherence in the prompts generated for the language model. These templates can include placeholders or variables representing different components of the prompt, such as user queries, context, or response options.
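A short sketch of a predefined template with placeholders, assuming langchain-core is installed; the placeholders use Python's str.format-style braces:

from langchain_core.prompts import PromptTemplate

# {tone} and {product} are filled in at request time.
template = PromptTemplate.from_template("Write a {tone} product description for {product}.")
print(template.format(tone="friendly", product="an OCI Generative AI chatbot"))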

3.How are chains traditionally created in LangChain?
  • (x) Using Python classes, such as LLM Chain and others
  • Exclusively through third-party software integrations
  • By using machine learning algorithms
  • Declaratively, with no coding required
Explanation: Traditionally, chains are created in LangChain using Python classes, such as LLM Chain. LangChain provides a programming interface that allows developers to define and configure processing pipelines, or chains, using Python code. These chains are typically implemented as classes, where each class represents a specific component or module within the processing pipeline.

4.What is the purpose of memory in the LangChain framework?
  • To act as a static database for storing permanent records
  • (x) To store various types of data and provide algorithms for summarizing past interactions
  • To retrieve user input and provide real-time output only
  • To perform complex calculations unrelated to user interaction
Explanation: In the LangChain framework, memory serves as a dynamic repository for retaining and managing information throughout the system's operation. It allows the framework to maintain state and context, enabling chains to access, reference, and use past interactions and information in their decision-making processes.
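A minimal sketch of LangChain memory, assuming the classic langchain package is installed; ConversationBufferMemory stores prior turns so they can be fed back into the chain later:

from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
# Record one user/AI exchange in the buffer.
memory.save_context({"input": "Hi, I am studying for the OCI exam."},
                    {"output": "Great, ask me anything about Generative AI."})
# Later in the run, past interactions are loaded back in as context for the chain.
print(memory.load_memory_variables({}))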

5.What is the function of "Prompts" in the chatbot system?
  • (x) They are used to initiate and guide the chatbot's responses.
  • They handle the chatbot's memory and recall abilities.
  • They store the chatbot's linguistic knowledge.
  • They are responsible for the underlying mechanics of the chatbot.
Explanation: Prompts serve as cues or triggers for the chatbot to understand the user's intent and generate appropriate responses. Prompts can take various forms, including questions, commands, or statements, and are designed to elicit specific types of responses from the chatbot.


Practice Exam: OCI Generative AI Professional Certification

1.In which scenario is soft prompting appropriate compared to other training styles?
  • When the model requires continued pretraining on unlabeled data
  • (x) When there is a need to add learnable parameters to a Large Language Model (LLM) without task-specific training
  • When there is a significant amount of labeled, task-specific data available
  • When the model needs to be adapted to perform well in a domain on which it was not originally trained
Explanation: Soft prompting refers to a technique where additional parameters are introduced into a model's input layer in the form of embeddings, which are tuned during training. This can be particularly useful when one wants to adapt a large pretrained model to a new task without modifying the entire model's weights, which is resource-intensive. When there is a significant amount of labeled, task-specific data available, traditional fine-tuning or transfer learning might be more suitable.
When the model needs to be adapted to perform well in a different domain it was not originally trained on, domain adaptation techniques that may involve fine-tuning or prompt-based approaches are commonly used, but this doesn't specifically denote soft prompting.
When the model requires continued pretraining on unlabeled data, unsupervised or self-supervised learning techniques are typically employed, not soft prompting.

2.Accuracy in vector databases contributes to the effectiveness of Large Language Models (LLMs) by preserving a specific type of relationship. What is the nature of these relationships, and why are they crucial for language models?
  • Temporal relationships; necessary for predicting future linguistic trends
  • Linear relationships; they simplify the modeling process
  • Hierarchical relationships; important for structuring database queries
  • (x) Semantic relationships; crucial for understanding context and generating precise language
Explanation: Semantic relationships in vector spaces (often created through methods like word embedding) are foundational to the way LLMs understand and process language. They capture the meaning and context of words or phrases, enabling the model to produce relevant and contextually appropriate responses. If these relationships are accurately preserved in a vector database, the model can retrieve contextually relevant information reliably, which supports precise and coherent language generation.

3.What is the purpose of Retrieval Augmented Generation (RAG) in text generation?
  • To generate text based only on the model's internal knowledge without external data
  • To retrieve text from an external source and present it without any modifications
  • (x) To generate text using extra information obtained from an external data source
  • To store text in an external database without using it for generation
Explanation: The purpose of RAG in text generation is to generate text using extra information retrieved from an external source. RAG combines a retrieval step with a generation step: the model first retrieves relevant documents or passages from a large corpus like Wikipedia or a domain-specific dataset and then uses the content of these documents to inform the generation of text.

4.What does accuracy measure in the context of fine-tuning results for a generative model?
  • (x) How many predictions the model made correctly out of all the predictions in an evaluation
  • The depth of the neural network layers used in the model
  • The number of predictions a model makes, regardless of whether they are correct or incorrect
  • The proportion of incorrect predictions made by the model during an evaluation

5.What do prompt templates use for templating in language model applications?
  • Python's list comprehension syntax
  • (x) Python's str.format syntax
  • Python's class and object structures
  • Python's lambda functions
6.Which LangChain component is responsible for generating the linguistic output in a chatbot system?
  • LangChain Application
  • (x) LLMs
  • Vector Stores
  • Document Loaders

7.Which is a characteristic of T-Few fine-tuning for Large Language Models (LLMs)?
  • It increases the training time as compared to Vanilla fine-tuning.
  • It does not update any weights but restructures the model architecture.
  • It updates all the weights of the model uniformly.
  • (x) It selectively updates only a fraction of the model's weights.

8.In the context of generating text with a Large Language Model (LLM), what does the process of greedy decoding entail?
  • (x) Choosing the word with the highest probability at each step of decoding
  • Using a weighted random selection based on a modulated distribution
  • Picking a word based on its position in a sentence structure
  • Selecting a random word from the entire vocabulary at each step

9.When does a chain typically interact with memory in a run within the LangChain framework?
  • Only after the output has been generated
  • (x) After user input but before chain execution, and again after core logic but before output
  • Continuously throughout the entire chain execution process
  • Before user input and after chain execution

10.How does the temperature setting in a decoding algorithm influence the probability distribution over the vocabulary?
  • Decreasing the temperature broadens the distribution, making less likely words more probable.
  • Temperature has no effect on probability distribution; it only changes the speed of decoding.
  • Increasing the temperature removes the impact of the most likely word.
  • (x) Increasing the temperature flattens the distribution, allowing for more varied word choices.

11.Why is it challenging to apply diffusion models to text generation?
  • (x) Because text representation is categorical unlike images
  • Because text generation does not require complex models
  • Because diffusion models can only produce images
  • Because text is not categorical


12.When is fine-tuning an appropriate method for customizing a Large Language Model (LLM)?
  • When the LLM requires access to the latest data for generating outputs
  • When you want to optimize the model without any instructions
  • When the LLM already understands the topics necessary for text generation
  • (x) When the LLM does not perform well on a task and the data for prompt engineering is too large


13.What is LangChain?
  • A Java library for text summarization
  • A JavaScript library for natural language processing
  • A Ruby library for text generation
  • (x) A Python library for building applications with Large Language Models


14.What does the RAG Sequence model do in the context of generating a response?
  • (x) For each input query, it retrieves a set of relevant documents and considers them together to generate a cohesive response.
  • It retrieves relevant documents only for the initial part of the query and ignores the rest.
  • It modifies the input query before retrieving relevant documents to ensure a diverse response.
  • It retrieves a single relevant document for the entire input query and generates a response based on that alone.

15.How are documents usually evaluated in the simplest form of keyword-based search?
  • According to the length of the documents
  • Based on the number of images and videos contained in the documents
  • (x) Based on the presence and frequency of the user-provided keywords
  • By the complexity of language used in the documents
16.Given the following code block:

history = StreamlitChatMessageHistory(key="chat_messages")
memory = ConversationBufferMemory(chat_memory=history)

Which statement is NOT true about StreamlitChatMessageHistory?
  • StreamlitChatMessageHistory will store messages in Streamlit session state at the specified key.
  • A given StreamlitChatMessageHistory will not be shared across user sessions.
  • A given StreamlitChatMessageHistory will NOT be persisted.
  • (x) StreamlitChatMessageHistory can be used in any type of LLM application.

17.How does the structure of vector databases differ from traditional relational databases?
  • It uses simple row-based data storage.
  • It is not optimized for high-dimensional spaces.
  • (x) It is based on distances and similarities in a vector space.
  • A vector database stores data in a linear or tabular format.


18.Which statement is true about Fine-tuning and Parameter-Efficient Fine-Tuning (PEFT)?
  • (x) Fine-tuning requires training the entire model on new data, often leading to substantial computational costs, whereas PEFT involves updating only a small subset of parameters, minimizing computational requirements and data needs.
  • PEFT requires replacing the entire model architecture with a new one designed specifically for the new task, making it significantly more data-intensive than Fine-tuning.
  • Fine-tuning and PEFT do not involve model modification; they differ only in the type of data used for training, with Fine-tuning requiring labeled data and PEFT using unlabeled data.
  • Both Fine-tuning and PEFT require the model to be trained from scratch on new data, making them equally data and computationally intensive.

19.What is the purpose of Retrievers in LangChain?
  • To break down complex tasks into smaller steps
  • (x) To retrieve relevant information from knowledge bases
  • To train Large Language Models
  • To combine multiple components into a single pipeline
Explanation: In systems such as LangChain, retrievers are generally used to retrieve relevant information from a corpus of data, such as a knowledge base or the Internet. This information is then used by the system, often in combination with a language model, to generate responses that are informed by the retrieved data. The purpose of retrievers is to enable the system to access a wide range of information and to provide context that can be integrated into the language model's outputs.

20.In the simplified workflow for managing and querying vector data, what is the role of indexing?
  • To categorize vectors based on their originating data type (text, images, audio)
  • To convert vectors into a nonindexed format for easier retrieval
  • (x) To map vectors to a data structure for faster searching, enabling efficient retrieval
  • To compress vector data for minimized storage usage
Explanation: Indexing structures organize data in a way that optimizes retrieval operations, such as nearest neighbor search, which is commonly used for finding the most similar vectors. When dealing with high-dimensional data, like vectors representing complex entities or embeddings, linear search can be slow because it requires comparing the query vector to every vector in the data set.
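A minimal numpy sketch of mapping query vectors to their nearest stored vectors; a real vector database replaces this brute-force scan with an index structure (e.g., HNSW or IVF) for sub-linear search, and the vectors below are made up:

import numpy as np

vectors = np.array([[0.9, 0.1], [0.1, 0.9], [0.7, 0.3]])   # stored embeddings
ids = ["doc_a", "doc_b", "doc_c"]

def nearest(query, k=2):
    # Brute-force nearest-neighbor search by Euclidean distance.
    distances = np.linalg.norm(vectors - np.array(query), axis=1)
    return [ids[i] for i in np.argsort(distances)[:k]]

print(nearest([0.8, 0.2]))   # -> ['doc_a', 'doc_c']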

21.Which statement is true about string prompt templates and their capability regarding variables?
  • They can only support a single variable at a time.
  • They are unable to use any variables.
  • They require a minimum of two variables to function properly.
  • (x) They support any number of variables, including the possibility of having none.


22.What does the Loss metric indicate about a model's predictions?
  • (x) Loss is a measure that indicates how wrong the model's predictions are.
  • Loss measures the total number of predictions made by a model.
  • Loss indicates how good a prediction is, and it should increase as the model improves.
  • Loss describes the accuracy of the right predictions rather than the incorrect ones.
Explanation: Loss is a quantification of the error between what the model predicts and what the actual value or label is. The higher the loss, the greater the error, and conversely, the lower the loss, the more accurate the model’s predictions are with respect to the provided data. During training, the objective is to minimize this loss to improve the model's performance. The loss function guides the optimization algorithm on how to adjust the model's weights to make more accurate predictions.

23.How does a presence penalty function in language model generation?
  • It penalizes all tokens equally, regardless of how often they have appeared.
  • It penalizes only tokens that have never appeared in the text before.
  • It applies a penalty only if the token has appeared more than twice.
  • (x) It penalizes a token each time it appears after the first occurrence.
Explanation: The presence penalty is a value subtracted from the log probability of tokens that have already appeared in the text. When the language model is calculating the next token to generate, any token that has appeared previously gets its probability decreased by this penalty value. The more often a token has appeared, the higher the cumulative penalty, thus reducing the likelihood that it will be chosen again.

24.What does a cosine distance of 0 indicate about the relationship between two embeddings?
  • (x) They are similar in direction 
  • They are unrelated
  • They are completely dissimilar
  • They have the same magnitude
Explanation: A cosine distance of 0 between two embeddings indicates that they are perfectly similar in terms of orientation; in other words, they are pointing in the same direction in the vector space. Cosine distance is usually calculated as 1 minus the cosine similarity.
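A tiny numpy sketch of cosine distance as 1 minus cosine similarity; the vectors are illustrative:

import numpy as np

def cosine_distance(a, b):
    a, b = np.array(a, dtype=float), np.array(b, dtype=float)
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine_distance([1, 2, 3], [2, 4, 6]))   # ~0.0: same direction, different magnitude
print(cosine_distance([1, 0], [0, 1]))         # 1.0: orthogonal directions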

25.How can the concept of "Groundedness" differ from "Answer Relevance" in the context of Retrieval Augmented Generation (RAG)?
  • (x) Groundedness pertains to factual correctness, whereas Answer Relevance concerns query relevance.
  • Groundedness refers to contextual alignment, whereas Answer Relevance deals with syntactic accuracy.
  • Groundedness focuses on data integrity, whereas Answer Relevance emphasizes lexical diversity.
  • Groundedness measures relevance to the user query, whereas Answer Relevance evaluates data integrity.
Explanation: Groundedness usually refers to how well the answer is supported by evidence or data from reputable sources. It's about the factual correctness and the reliability of the underlying data that the answer is based on. An answer is grounded if it is backed up by accurate and trustworthy information.

Answer Relevance, on the other hand, is about how well the answer fits the user's query. It's possible for an answer to be highly relevant to the question asked but not necessarily grounded in factual data. Answer Relevance is concerned with the syntactic and semantic alignment of the response to the query.

Prepare for the exam

 Review exam topics

Exam objectives and weights:
  • Fundamentals of Large Language Models (LLMs): 20%
  • Using OCI Generative AI Service: 45%
  • Building an LLM Application with OCI Generative AI Service: 35%

 

Fundamentals of Large Language Models (LLMs)

  • Explain the fundamentals of LLMs
  • Understand LLM architectures
  • Design and use prompts for LLMs
  • Understand LLM fine-tuning
  • Understand the fundamentals of code models, multi-modal, and language agents

Using OCI Generative AI Service

  • Explain the fundamentals of OCI Generative AI service
  • Use pretrained foundational models for Generation, Summarization, and Embedding
  • Create dedicated AI clusters for fine-tuning and inference
  • Fine-tune base model with custom dataset
  • Create and use model endpoints for inference
  • Explore OCI Generative AI security architecture

Building an LLM Application with OCI Generative AI Service

  • Understand Retrieval Augmented Generation (RAG) concepts
  • Explain vector database concepts
  • Explain semantic search concepts
  • Build LangChain models, prompts, memory, and chains
  • Build an LLM application with RAG and LangChain
  • Trace and evaluate an LLM application
  • Deploy an LLM application





















Final result:


Some other free Oracle certifications (which I found after signing up):
1. Introduction to Oracle Cloud Essentials
https://mylearn.oracle.com/ou/learning-path/introduction-to-oracle-cloud-essentials/115954

2. Become An OCI Foundations Associate (2024)
https://mylearn.oracle.com/ou/learning-path/become-an-oci-foundations-associate-2024/139374

3. Become An OCI AI Foundations Associate (2024):
https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2024/140164
