Components
In the context of an artificial intelligence system or a machine learning model, there are several key components and concepts that work together to create a functional and efficient workflow.
Response
The component responsible for sending the list of messages, which has been created through transformations and data collection, to the neural network.
Fantasy: This is a parameter that determines the degree of creativity of the model. Higher values like 8 make the model's output more unpredictable and innovative, while lower values like 2 make it more focused and predictable.
Response Length (MAX): This parameter establishes the maximum number of tokens (units of information) that the model's response can contain. It essentially determines the maximum length of the response generated by the model.
System Message: This parameter serves to customize the behavior and characteristics of the model. This is where we define the personality, objective, and general guidelines for the model. For example, we might define the model as a helpful virtual assistant, or as a bot that emulates the sarcastic behavior of a character named Marco, or as a corporate chatbot designed to answer exclusively questions about sector X. These parameters are fundamental to refine the model's capabilities, making it a versatile and adaptable assistant, capable of effectively responding to a wide range of requirements.
Request
This component plays a crucial role in the communication process with the chat. Its task is to extrapolate and filter the conversation from the chat, ensuring to appropriately manage the volume of data so as not to exceed the token limit that the neural network can handle.
This component operates in two stages. The first stage involves retrieving the entire conversation from the chat. This includes not only the user's messages, but also the model's previous responses, as both types of messages provide context for the subsequent response. The second stage involves filtering the messages. Due to technical limitations, a neural network can only handle a specific number of tokens at a time. This component, therefore, must ensure that the total number of tokens in the messages does not exceed this limit. To do this, it can adopt different strategies. For example, it might choose to truncate or delete some messages, or it might choose to rephrase or compress the messages to reduce their number of tokens. In short, this component acts as an effective management interface between the user, the chat, and the neural network, ensuring a smooth and optimized communication flow.
Serialize
This component has the crucial task of transforming a list of messages into a single serialized text. The serialization process converts multiple messages, each with its own individuality and content, into a continuous and sequential data stream.
To understand better, let's imagine we have a series of separate messages: Message 1, Message 2, Message 3, and so on. Each message has its own identity and function within the conversation. However, to facilitate the processing by the neural network, these messages need to be converted into a more manageable format. This is where the component we are discussing comes into play. This component takes each message and combines it with the others into a single text. This doesn't simply mean putting the messages together one after the other. Serialization implies creating a data stream that maintains the sequence and context of the original messages. The result is a single text that contains all the relevant information from the original messages, presented in sequence. This serialized text is then ready to be sent to the neural network for further processing.
Last
This component has the specific task of extracting the text of the last message from the list of messages in a conversation. This function is particularly important to maintain the current and relevant context of the conversation.
When a conversation is ongoing, a series of messages are exchanged. Each message contributes to the thread of the conversation, adding new information, responding to previous questions, or introducing new topics. However, the last message in a conversation often has particular importance, as it represents the most recent request or the most current point in the discussion. This is where this component comes in. Its task is to identify and extract the text of this last message. This is not as simple a task as it may seem. It requires the ability to navigate through the list of messages, correctly identify the last message, and then accurately extract its text. Once the text has been extracted, it can be used for various purposes. For example, it can be sent to the neural network to generate an appropriate response, or it can be used to update the context of the conversation.
List
This particular component of the system performs an essential function in managing and organizing a set of messages or 'N messages'. Its central task is to take these messages and arrange them in a structured list, based on the connections between the messages. This sorting process can be vital in maintaining the consistency and logical flow of the conversation.
If we assume we have a set of messages labeled as A, B, C, and so on. Each message might represent a specific interaction or discussion point within a conversation. However, the order in which these messages are received or processed may not necessarily be sequential or logical. This is where this component shows its value. It analyzes each message and identifies the connections between them. These connections could be based on a variety of factors, such as the content of the message, the time the message arrived, or the response to a previous message. Once these connections are identified, the component proceeds to reorder the messages in a structured list. This ordering ensures that the messages are presented in a way that respects the logical and temporal flow of the conversation.
The PDF component is a specific element within an artificial intelligence system or software that is tasked with generating a document in PDF format.
The PDF is managed by the VECTOR DB.
To create it, you will need to go to the Dynamic context section and create a new Vector DB. Once your specific PDF has been created, it will be listed among the entries to be selected.
CSV
This component plays a critical role in interfacing a .CSV file with the model. .CSV (Comma-Separated Values) files are a type of data file that store information in tabular form, with each field separated by a comma. These files are widely used for data management due to their simplicity and portability.
Interfacing a .CSV file within the model is a process that requires accuracy and precision. First, the component must open and read the .CSV file. This operation can include decoding the file, handling any errors or compatibility issues, and correctly interpreting the data contained within it. Once the data has been read, the component must transform this data into a format that the model can understand and use. This could mean converting the data into specific data types, mapping the data into certain structures, or applying other transformations depending on the model's needs. After the data has been adequately prepared, the component then forwards them to the model. This can involve feeding the data to the model in certain ways, such as inserting the data into specific parts of the model, or adapting the data to the model's specific functions.
Text
This component has the crucial role of manipulating and managing previous results during run-time, merging various pieces of text. This function is particularly important for creating coherent and relevant responses based on the context of a conversation or previous data.
Run-time refers to the period of time when a program or system is running. During run-time, various processes occur, including receiving input data, processing data, and generating output data. Now, talking about the specific component, its task is to manage and manipulate results generated earlier during run-time. These results could be output data from previous processing operations or responses previously generated by the model. The component takes these results and merges them with various pieces of text. This merging process can serve various purposes. For example, it could be used to create a more detailed or complex response, combining various parts of information. Or it could be used to maintain consistency and context in a conversation, linking the current response with previous information. The merging process requires careful management and manipulation of data. The component must be able to identify which pieces of text need to be merged, and how they should be merged to maintain consistency and relevance. Moreover, the component must be able to perform these operations efficiently during run-time, to ensure a smooth and timely communication flow.
Logic
The LOGIC component that deals with adding conditional logics is a fundamental element of an artificial intelligence system or software. This component is responsible for introducing rules and conditions that guide the system's operation, enabling it to make decisions based on specific criteria or conditions.
Conditional logics are essentially "if, then" instructions that determine how a system should react based on certain situations or inputs. For example, in a virtual assistant, a conditional logic might be: "If the user asks for the time, then respond with the current time." These instructions guide the system to perform certain actions based on specific inputs or conditions. The LOGIC component constantly works to evaluate these conditions and determine which action to take. This requires advanced processing skills, as the system must be able to correctly interpret the input, evaluate the applicable conditions, and then execute the appropriate action. Also, the LOGIC component can handle more complex conditional logics. These can include nested instructions, where several conditions need to be met, or "if, then, else" instructions, where different actions can be taken depending on different conditions.
Create User Message
The "CREATE USER MESSAGE" component is a crucial element in the design of a virtual assistant or an artificial intelligence system. This component is tasked with gathering, structuring, and preparing the user's input to be processed by the system.
The create user message component has the task of creating a message object defining its role in the conversation. In this case, a user message is created, which means that when the network analyzes the conversation to respond, this message will be considered as written by the user.
Create Assistant Message
The "CREATE ASSISTANT MESSAGE" component has the task of generating responses or messages from the assistant.
The assistant message component is an essential part in the context of the conversation. It has the function of creating a message object, which has the task of defining the role that the specific message has within the conversation. In this case, an "assistant" type message is generated. This means that, once the neural network model proceeds to the analysis of the conversation to provide the most appropriate response, such a message will be interpreted as if it was written by the virtual assistant himself. This distinction is fundamental for the correct functioning of the AI model because it allows clearly distinguishing the user's contributions from those of the assistant. Moreover, it helps the AI understand the context and flow of the conversation, allowing it to generate consistent and relevant responses. Furthermore, the model takes into account not only the content of each message but also the order in which the messages were written. This allows the AI to maintain a logical thread throughout the conversation, responding appropriately based on what has been previously said by both the user and the assistant. Therefore, the creation of an "assistant" type message is a key aspect to ensure a smooth and natural interaction between the user and the virtual assistant, contributing to creating a more realistic and engaging conversation experience for the user.
Create Function Message
The "CREATE FUNCTION MESSAGE" component is a key element of the system that manages the creation of functional or output messages within a virtual assistant or an artificial intelligence system.
This component is responsible for processing the results of a specific action or function and transforming them into a structured message that can be presented to the user. The create message component has the task of creating a message object defining its role in the conversation. In this case, a function type message is created, meaning that when the network analyzes the conversation to respond, this message will be considered as a result of execution and the network will know that it must use this message as an absolute truth and must use it to respond.
CONTENT - content of the message.
NAME - name of the function to recall which function was called to get the response.
Dynamic Context
In the "General" section of the dynamic contexts, you can configure various key information concerning the model.
Settings
More specific sections for the model.
Model Icon: Here you can upload an icon that visually represents the model. The icon can be an image or logo that uniquely identifies the model.
Title: This field allows you to enter the model's title. The title should be short and descriptive, providing users with a clear overview of the model.
Description: Here you can provide a more detailed description of the model. The description can include information about the features, functionalities, and uses of the model, to provide users with a deep understanding of what to expect.
BIO (longer): This field allows you to enter a biography or a more extensive description of the model.
Models
More specific sections for the model.
In the "Model" section of the dynamic contexts, you can customize the model's behavior by entering a specific prompt. The prompt is the sentence or question presented to the model to guide its processing and generate a response. You can write a prompt suitable for the specific context in which you intend to use the model, to obtain more accurate and relevant answers. The "Share" section allows you to share the model with other people or users. In this way, they can access the model and use it to get answers to their questions or to experience different interactions. Finally, in the "Delete" section, you have the possibility to delete the model. By correctly configuring these sections, you can customize the model, share it with other users, and manage its removal, to suit your needs and ensure an optimal experience in using dynamic contexts.
Plug-in
An extension that is used to enhance the functionalities of CINZIA.
The component called "layer" is a data structure in JSON format. This structure defines the parameters that a certain function can accept. But its role doesn't stop there: the layer also has the task of extracting this information from the conversation and formatting it so that it can be correctly interpreted and processed by the system. Within the layer, each plugin is identified by a unique ID. This ID serves to uniquely describe the plugin and distinguish it from all others. Subsequently, there is a field called "type", which specifies the data type required for the parameter. This can vary: "string" indicates that the parameter must be a text; "number" means it must be a number; "array", instead, specifies that the parameter must be a list. The "description" field provides further details on the parameter. In particular, it describes what the parameter should contain and how it should be formatted. These indications are crucial to ensure that the parameter is correct and can be used effectively by the system. Finally, the "required" field presents the list of parameters that are mandatory. These must necessarily be provided to ensure the correct functioning of the system.
Vector DB
The Vector DB is a component that allows uploading PDF files into the system to use them later in a model or other operations. Once the PDF files have been uploaded into the Vector DB, the system can read and analyze the content of such files to extract information or perform specific operations.
Chunk length: This function allows defining the size of each text block (chunk). This can help to keep the document organized and coherent, ensuring that each text block is of the same size.
Redundancy: This function allows defining the amount of overlap between the text blocks. This is useful to ensure that no important information is lost between one block and another, and that there is a certain amount of repetition to reinforce key points.
Filter: This function is particularly useful when working with a large number of results. It allows filtering the results based on a specific score. For example, you might ask the system to show you only the top 10 results with a score above 0.7.
TOP K: This function allows limiting the number of results returned by the system. If you set K to 10, the system will only show you the top 10 results. This can help manage a large number of results, showing you only the most relevant or useful ones.
Remember, all of these parameters are configurable based on your specific needs to help you better manage your PDF files.
Survival guide to prompts
This guide shares strategies and tactics to get better results from Cinzia. The methods described here can sometimes be combined for a greater effect. We encourage experimentation to find the methods that work best for you.
Six strategies for achieving better results:
Write clear instructions.
Cinzia can't read your mind. If the outputs are too long, ask for shorter responses. If the outputs are too simple, ask for expert-level writing. If you don't like the format, show the format you would like to see. The less Cinzia has to guess what you want, the more likely you are to get it.
Tactics:
- Include details in your query to get more relevant answers
- Ask the model to adopt a personality
- Use delimiters to clearly indicate distinct parts of the input
- Specify the necessary steps to complete a task
- Provide examples
- Specify the desired length of the output
Provide reference text
Cinzia can confidently invent false answers, especially when it comes to esoteric topics or for quotes and URLs. Just like how a cheat sheet can help a student do better on a test, providing a reference text to Cinzia can help answer with less inventions.
Tactics:
- Instruct the model to respond using a reference text
- Instruct the model to respond with quotes from a reference text
Break down complex tasks into simpler sub-tasks
Just like it's good practice in software engineering to break down a complex system into a set of modular components, the same goes for tasks submitted to Cinzia. Complex tasks tend to have higher error rates than simpler tasks. Additionally, complex tasks can often be redefined as a workflow of simpler tasks where the outputs of the previous tasks are used to build the inputs of the subsequent tasks.
Tactics:
- Use intent classification to identify the most relevant instructions for a user query
- For dialogue applications that require very long conversations, summarize or filter the previous dialogue
- Summarize long documents piece by piece and build a complete summary recursively
Give Cinzia time to "think"
If you were asked to multiply 17 by 28, you might not know it instantly, but you can still calculate it given time. Similarly, Cinzia makes more reasoning errors when trying to answer quickly, instead of taking the time to process an answer. Asking for a chain of reasoning before an answer can help Cinzia reason more reliably towards correct answers.
Tactics:
- Instruct the model to work out its solution before rushing to a conclusion
- Use inner monologue or a sequence of queries to hide the model's reasoning process
- Ask the model if it missed something in the previous steps
Use external tools
Compensate for Cinzia's weaknesses by feeding it with the outputs of other tools. For example, a text retrieval system can inform Cinzia about relevant documents. A code execution engine can help Cinzia with math and executing code.
Tactics:
- Use embedding-based search to implement efficient knowledge retrieval
- Use code execution to perform more accurate calculations or call external APIs.
Test changes systematically
Improving performance is easier if you can measure it. In some cases, a change to a prompt will get better performance on some isolated examples but lead to worse overall performance on a more representative set of examples. Therefore, to be sure that a change is positive for performance, it may be necessary to define a comprehensive test suite (also known as an "eval").
Tactics:
Tactics:
Each strategy listed above can be instantiated with specific tactics. These tactics are meant to provide ideas on what to try. They are not exhaustive, and you are free to try creative ideas that are not represented here.
Strategy: Write Clear Instructions
Tactic: Include details in your request to get more relevant responses.
To get a highly relevant response, make sure that requests provide any important details or context. Otherwise, you are leaving it up to the model to guess what you mean.
Worst
Best
How to add numbers in Excel?
How to sum a row of dollar amounts in Excel? I want to do it automatically for the entire sheet of rows, with all the totals ending on the right in a column called "Total".
Who is the president?
Who was the Italian president in 2021 and how often are elections held?
Write code to calculate the Fibonacci sequence.
Write a TypeScript function to efficiently calculate the Fibonacci sequence. Comment the code freely to explain what each part does and why it was written that way.
Summarize the meeting notes.
Summarize the meeting in a single paragraph. Then write a markdown list of the speakers and their key points. Finally, list any suggested next actions or tasks from the speakers, if present.
Tactic: Ask the model to adopt a persona
The system message can be used to specify the persona adopted by the model in its responses.
System
When I ask for help in writing something, you will respond with a document that contains at least one witty quip or comment in every paragraph.
User
Write a thank-you note to my steel bolt supplier for delivering the material on time and with short notice. This made it possible to fulfill an important order.
Tactic: Use delimiters to clearly indicate distinct parts of the input
Delimiters such as triple quotes, XML tags, section headings, etc. can help to demarcate sections of text to be treated differently.
User
Synthesize the text enclosed by triple quotes into a haiku.
"""Insert text here"""
System
You will be provided with a pair of articles (enclosed by XML tags) on the same topic. First, summarize the arguments of each article. Then, indicate which of the two articles presents a better argument and explain why.
User
<article> insert the first article here </article>
<article> insert the second article here </article>
System
You will be given a thesis abstract and a suggested title for it. The thesis title should give the reader a good idea of the thesis topic but should also be catchy. If the title does not meet these criteria, suggest 5 alternatives.
User
Abstract: insert the abstract here
Title: insert the title here
For simple tasks like these, using delimiters might not make a difference in the output quality. However, the more complex the task, the more important it is to disambiguate the task details. Don't make Cinzia work to figure out exactly what you're asking her.
Tactic: Specify the necessary steps to complete a task
Some tasks are better specified as a sequence of steps. Writing the steps explicitly can make it easier for the model to follow them.
System
Use the following step-by-step instructions to respond to user inputs.
Step 1 - The user will provide you with text enclosed in triple quotes. Synthesize this text into a single sentence with a prefix that says "Summary: ".
Step 2 - Translate the summary from Step 1 into Spanish, with a prefix that says "Translation: ".
User
"""insert text here"""
Tactic: Provide examples
Providing general instructions that apply to all examples is generally more efficient than demonstrating all permutations of a task through examples, but in some cases, providing examples can be easier. For instance, if you intend the model to copy a particular response style to user requests that is difficult to describe explicitly. This is known as "few-shot" prompting.
System
Respond in a consistent style.
User
Teach me patience.
Assistant
modest
User
Teach me about the ocean.
Tactic: Specify the desired output length
You can ask the model to produce output of a given target length. The targeted output length can be specified in terms of word count, sentences, paragraphs, bullet points, etc. However, note that instructing the model to generate a specific number of words does not work with high precision. The model can generate output with a specific number of paragraphs or bullet points more reliably.
User
Summarize the text enclosed by triple quotes in about 50 words.
"""Insert text here"""
User
Summarize the text enclosed by triple quotes in 2 paragraphs.
"""Insert text here"""
User
Summarize the text enclosed by triple quotes in 3 bullet points.
"""Insert text here"""
Strategy: Provide reference text
Tactic: Instruct the model to answer using a reference text
If we can provide the model with reliable information relevant to the current query, we can instruct the model to use the provided information to compose its response.
System
Use the articles provided within triple quotes to answer the questions. If the answer cannot be found in the articles, write "I was unable to find an answer."
User
<insert articles, each enclosed by triple quotes>
Question: <insert the question here>
Considering that Cinzia has limited context windows, to apply this tactic we need a way to dynamically search for information relevant to the question asked. Embeddings can be used to implement efficient knowledge retrieval. Refer to the tactic "Using embedding-based search to implement efficient knowledge retrieval" for further details on how to implement this solution.
Tactic: Instruct the model to answer with quotes from a reference text
If the input has been augmented with relevant knowledge, it's easy to request that the model adds quotes to its answers by referencing passages from the provided documents. Note that quotes in the output can then be programmatically verified by comparing strings in the provided documents.
System
You will be given a document enclosed by triple quotes and a question. Your task is to answer the question using only the provided document and to quote the passage(s) from the document used to answer the question. If the document does not contain the information needed to answer this question, simply write: "Insufficient information". If an answer to the question is provided, it must be annotated with a citation. Use the following format to cite relevant passages ({"quote":...}).
User
"""<insert document here>"""
Question: <insert the question here>
Strategy: Break down complex tasks into simpler subtasks
Tactic: Use intent classification to identify the most relevant instructions for a user request
For tasks where many independent sets of instructions are needed to handle different cases, it can be helpful to first classify the type of request and use that classification to determine which instructions are needed. This can be achieved by defining fixed categories and encoding relevant instructions to handle tasks within a given category. This process can also be applied recursively to decompose a task into a sequence of stages. The advantage of this approach is that each request will contain only those instructions needed to perform the next stage of a task, which can lead to lower errors compared to using a single request to perform the entire task. This can also result in lower costs since larger prompts cost more (see pricing information).
For example, suppose that for a customer support application, requests might usefully be classified as follows:
System
You will be provided with customer support requests. Classify each request into a primary and secondary category. Provide your output in json format with
the keys: primary and secondary.
Primary Categories: Billing, Technical Support, Account Management, or General Inquiry.
Billing Secondary Categories:
-Cancellation or upgrade
-Add a payment method
-Explain a charge
-Dispute a charge
Technical Support Secondary Categories:
-Troubleshooting
-Device compatibility
-Software updates
Account Management Secondary Categories:
-Reset password
-Update personal information
-Close account
-Account security
General Inquiry Secondary Categories:
-Product information
-Pricing
-Feedback
-Speak with a human
User
I need to get my internet connection working again.
Based on the classification of the customer request, a more specific set of instructions can be provided to Cinzia to handle subsequent steps. For example, suppose the customer needs help with "troubleshooting".
System
You will be provided with customer support requests that require troubleshooting in a technical support context. Help the user:
-Ask them to check that all cables to/from the router are connected. Note that it is common for cables to become loose over time.
-If all cables are connected and the problem persists, ask them which router model they are using.
-You will now advise them on how to reboot their device:
-- If the model number is MTD-327J, advise them to press the red button and hold it down for 5 seconds, then wait 5 minutes before testing the connection.
-- If the model number is MTD-327S, advise them to unplug it and plug it back in, then wait 5 minutes before testing the connection.
-If the customer's problem persists after rebooting the device and waiting for 5 minutes, connect them to IT support by producing {"IT support requested"}.
-If the user starts asking unrelated questions to this topic, confirm if they want to end the current chat on troubleshooting and classify their request according to the following scheme:
<insert primary/secondary classification scheme mentioned above here>
User
I need to get my internet working again.
Note that the model has been instructed to emit special strings to indicate when the conversation state changes. This allows us to turn our system into a state machine where the state determines which instructions are injected. By keeping track of the state, relevant instructions in that state, and optionally, allowed state transitions from that state, we can create constraints in the user experience that would be difficult to achieve with a less structured approach.
Tactic: For dialogue applications that require very long conversations, summarize or filter the previous dialogue
Since Cinzia has a fixed context length, the dialogue between a user and an assistant where the entire conversation is included in the context window cannot continue indefinitely.
There are various solutions to this problem, one of which is summarizing previous turns in the conversation. Once a predetermined input length is reached, this could trigger a query that summarizes part of the conversation, and the summary of the previous conversation could be included as part of the system message. Alternatively, the previous conversation could be summarized asynchronously in the background throughout the entire conversation.
An alternative solution is to dynamically select the most relevant previous parts of the conversation for the current query. See the tactic "Using embedding-based search to implement efficient knowledge retrieval".
Tactic: Summarize long documents piece by piece and build a complete summary recursively
Since Cinzia has a fixed context length, it cannot be used to summarize text longer than the context length minus the length of the generated summary in a single query.
To summarize a very long document like a book, we can use a sequence of queries to summarize each section of the document. The section summaries can be concatenated and summarized, producing summaries of summaries. This process can proceed recursively until the entire document is summarized. If it is necessary to use information about previous sections to understand subsequent sections, then an additional useful trick is to include an ongoing summary of the text preceding a given point in the book while summarizing the content at that point.
Strategy: Give Cinzia time to "think"
Tactic: Instruct the model to process its own solution before hastily arriving at a conclusion
Sometimes we get better results when we explicitly instruct the model to reason from first principles before arriving at a conclusion. For example, suppose we want a model to evaluate a student's solution to a math problem. The most obvious way to approach this is to simply ask the model whether the student's solution is correct or not.
System
Determine whether the student's solution is correct or not.
User
Problem: I'm trying to set up a solar power plant and I need help calculating the costs.
-The land costs €100 per square meter
-I can purchase solar panels for €250 per square meter
-I have negotiated a maintenance contract that will cost me a fixed sum of €100,000 per year and an additional €10 per square meter
What is the total cost for the first year of operation depending on the number of square meters?
Student's Solution: Let x be the size of the plant in square meters.
1.Cost of land: 100x
2.Cost of solar panels: 250x
3.Cost of maintenance: 100,000 + 10x
Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000
Assistant
The student's solution is correct.
Ma la soluzione dello studente in realtà non è corretta! Possiamo far sì che il modello se ne accorga con successo, chiedendo al modello di generare prima la propria soluzione.
System
First, work out your own solution to the problem. Then compare your solution to the student's and evaluate whether the student's solution is correct or not. Do not decide whether the student's solution is correct until you have solved the problem yourself.
User
Problem: I'm trying to set up a solar power plant and I need help calculating the costs.
-The land costs €100 per square meter
-I can purchase solar panels for €250 per square meter
-I have negotiated a maintenance contract that will cost me a fixed sum of €100,000 per year and an additional €10 per square meter
What is the total cost for the first year of operation depending on the number of square meters?
Student's Solution: Let x be the size of the plant in square meters.
1.Cost of land: 100x
2.Cost of solar panels: 250x
3.Cost of maintenance: 100,000 + 10x
Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000
Assistant
Let x be the size of the plant in square meters.
1.Cost of land: 100x
2.Cost of solar panels: 250x
3.Cost of maintenance: 100,000 + 10x
Total cost: 100x + 250x + 100,000 + 10x = 360x + 100,000
The student's solution is incorrect. They made an error in calculating the maintenance cost, using 100x instead of 10x. The correct total cost for the first year of operation is 360x + 100,000.
Tactic: Use inner monologue or a sequence of queries to hide the model's reasoning process
The previous tactic shows that sometimes it's important for the model to reason in detail about a problem before answering a specific question. For some applications, the reasoning process a model uses to arrive at a final answer might be inappropriate to share with the user. For example, in tutoring applications, we might want to encourage students to find their own answers, but the model's reasoning process about the student's solution could reveal the answer to the student themselves.
Inner monologue is a tactic that can be used to mitigate this. The idea of inner monologue is to instruct the model to insert parts of the output that should be hidden from the user in a structured format that makes analyzing it easier. Then, before presenting the output to the user, the output is analyzed and only a portion of the output is made visible.
Sistema
Follow these steps to answer the user's questions.
Step 1 - First, work out your own solution to the problem. Do not rely on the student's solution, as it may be incorrect. Enclose all your work for this step between triple quotes (""").
Step 2 - Compare your solution to the student's and evaluate whether the student's solution is correct or not. Enclose all your work for this step between triple quotes (""").
Step 3 - If the student made a mistake, determine what hint you could give the student without revealing the answer. Enclose all your work for this step between triple quotes (""").
Step 4 - If the student made a mistake, provide the hint from the previous step to the student (outside of the triple quotes). Instead of writing "Step 4 - ..." write "Hint:".
User
Problem: <insert the problem>
Student's Solution: <insert the student's solution>
Alternatively, this can be achieved with a sequence of queries where all but the last have their output hidden from the end user.
First, we can ask the model to solve the problem on its own. Since this first query doesn't require the student's solution, it can be left out. This provides the added benefit that there is no chance the model's solution will be influenced by the student's attempt at a solution.
User
<insert the problem description>
Next, we can have the model use all available information to evaluate the correctness of the student's solution.
System
Compare your solution to the student's and evaluate whether the student's solution is correct or not.
User
Problem statement: """<insert the problem statement>"""
Your solution: """<insert the solution generated by the model>"""
Student's solution: """<insert the student's solution>"""
Finally, we can have the model use its own analysis to construct a response in the form of an available tutor.
System
You are a math tutor. If the student has made a mistake, offer them a hint in a way that doesn't reveal the answer. If the student has not made any mistakes, offer them an encouraging comment.
User
Problem statement: """<insert the problem statement>"""
Your solution: """<insert the solution generated by the model>"""
Student's solution: """<insert the student's solution>"""
Analysis: """<insert the analysis generated by the model from the previous step>"""
Tactic: Ask the model if it has overlooked anything in previous steps
Suppose we are using a model to list excerpts from a source that are relevant to a specific question. After listing each excerpt, the model must determine whether to start writing another excerpt or to stop. If the source document is large, it's common for a model to stop too early and fail to list all relevant excerpts. In this case, better performance can be achieved by prompting the model with follow-up questions to find any excerpts it may have overlooked in previous steps.
System
You will be provided with a document enclosed by triple quotes. Your task is to select excerpts that pertain to the following question: "What significant paradigm shifts have occurred in the history of artificial intelligence?"
Ensure the excerpts contain all the relevant context necessary to interpret them – in other words, do not extract small fragments lacking important context. Provide the output in JSON format as follows:
[{"excerpt": "..."},
...
{"excerpt": "..."}]
User
"""<insert the document here>"""
Assistant
[{"excerpt": "the model writes an excerpt here"},
...
{"excerpt": "the model writes another excerpt here"}]
User
Are there any more relevant excerpts? Avoid repeating excerpts. Also, make sure the excerpts contain all the relevant context necessary to interpret them, meaning do not extract small fragments lacking important context.
Strategy: Use external tools
Tactic: Use embedding-based search to implement efficient knowledge retrieval
A model can leverage external sources of information if provided as part of its input. This can help the model generate more informed and up-to-date responses. For example, if a user asks a question about a specific movie, it might be helpful to add high-quality information about the movie (e.g., actors, director, etc.) to the model's input. Embeddings can be used to implement efficient knowledge retrieval, so that relevant information can be added to the model's input dynamically during runtime.
A text embedding is a vector that can measure the correlation between text strings. Similar or relevant strings will be closer together than unrelated strings. This fact, along with the existence of fast vector search algorithms, means that embeddings can be used to implement efficient knowledge retrieval. Specifically, a text corpus can be broken into fragments, and each fragment can be embedded and stored. Then, a given query can be embedded, and a vector search can be performed to find the text fragments embedded in the corpus that are most related to the query (i.e., closest in embedding space).
Implementation examples can be found in the OpenAI Cookbook. See the tactic "Instruct the model to use retrieved knowledge to answer queries" for an example of how to use knowledge retrieval to minimize the likelihood that a model will invent incorrect facts.
Tactic: Use code execution to perform more accurate calculations or call external APIs
Cinzia cannot be relied upon to perform arithmetic or lengthy calculations accurately on its own. In cases where this is necessary, a model can be instructed to write and execute code instead of performing its own calculations. Specifically, a model can be instructed to insert the code that needs to be executed in a designated format such as triple curly braces. After producing an output, the code can be extracted and executed. Finally, if necessary, the output from the code execution engine (e.g., Python interpreter) can be provided as input to the model for the subsequent query.
Sistema
You can write and execute Python code by enclosing it in triple curly braces, e.g.
```code goes here```. Use this to perform calculations.
User
Find all real roots of the following polynomial: 3*x**5 - 5*x**4 - 3*x**3 - 7*x - 10.
Another good use case for code execution is calling external APIs. If a model is instructed on the correct usage of an API, it can write code that makes use of it. A model can be instructed on how to use an API by providing it with documentation and/or code examples that show how to use the API.
Sisitema
You can write and execute Python code by enclosing it in triple curly braces. Additionally, note that you have access to the following module to help users send messages to their friends:
```python
import message
message.write(to="John", message="Hey, do you want to meet up after work?")```
WARNING: Running code produced by a model is not inherently safe, and precautions should be taken in any application that intends to do so. In particular, a sandboxed code execution environment is needed to limit the damage that untrusted code might cause.
Strategy: Test changes systematically
Sometimes it can be difficult to understand whether a change – for example, a new instruction or a new design – makes your system better or worse. By looking at a few examples, you might get a sense of which one is better, but with small sample sizes, it can be difficult to distinguish between a real improvement or random luck. Perhaps the change improves performance on some inputs but harms performance on others.
Evaluation procedures (or "evals") are helpful for optimizing system designs. Good evals are:
- Representative of real-world usage (or at least diverse)
- Contain many test cases for greater statistical power (see the table below for guidelines)
- Easy to automate or repeat
DIFFERENCE TO BE DETECTED
SAMPLE SIZE NEEDED FOR 95% CONFIDENCE LEVEL
30%
~10
10%
~100
3%
~1,000
1%
~10,000
Evaluating outputs can be done by computers, people, or a combination of both. Computers can automate evals with objective criteria (e.g., questions with single correct answers) and some subjective or nuanced criteria, where the model's outputs are evaluated by other model queries. OpenAI Evals is an open-source software framework that provides tools for creating automated evals.
Model-based evals can be useful when there is a range of possible outputs that would be considered equally high quality (e.g., for questions with long answers). The boundary between what can realistically be evaluated with a model-based eval and what requires human evaluation is blurred and constantly changing as models become more capable. We encourage experimentation to understand how well model-based evals can work in your use case.
Tactic: Evaluate model outputs against reference answers
Try this tactic with Cinzia Dev Mode