Patent classifications
G06F16/33295
AI-DRIVEN MULTI-AGENT SYSTEM FOR COMPREHENSIVE NETWORK, SECURITY AND ENTERPRISE IT OPERATIONS
Disclosed herein are system, apparatus, article of manufacture, method and/or computer program product aspects, and/or combinations and sub-combinations thereof, for generating a response to a user question. An example embodiment operates by querying a large language model (LLM) using a prompt associated with the user question and one or more available tools in a tool service. The embodiment then retrieves an endpoint from the tool service, where the endpoint is a digital location associated with a tool selected by the LLM. The embodiment then requests a database associated with the user question based on the endpoint meeting an endpoint requirement. The embodiment then executes a query at the database, thereby obtaining data including the response to the user question. The embodiment then identifies an output format associated with the response. The embodiment then generates the response to the user question by formatting the data using the output format.
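The flow in this abstract (tool selection, endpoint lookup, endpoint check, database query, output formatting) can be sketched roughly as follows. This is a minimal illustration and not the claimed implementation: the tool registry, the keyword-based stand-in for the LLM, the HTTPS-only endpoint requirement, and the in-memory SQLite table are all assumptions made for the example.

```python
import json
import sqlite3
from urllib.parse import urlparse

# Hypothetical tool service: one tool with an endpoint, a query, and an output format.
TOOL_SERVICE = {
    "device_inventory": {
        "endpoint": "https://tools.example.internal/inventory",
        "query": "SELECT name, status FROM devices ORDER BY name",
        "output_format": "json",
    }
}

def stub_llm_select_tool(question, tool_names):
    """Stand-in for the LLM: always picks the first available tool."""
    return tool_names[0]

def endpoint_meets_requirement(endpoint):
    """Assumed requirement for this sketch: the endpoint must use HTTPS."""
    return urlparse(endpoint).scheme == "https"

def make_demo_db():
    """Build a small in-memory database standing in for the real data source."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE devices (name TEXT, status TEXT)")
    db.executemany("INSERT INTO devices VALUES (?, ?)",
                   [("sw-2", "down"), ("fw-1", "up")])
    return db

def answer_question(question, db):
    """Select a tool, validate its endpoint, run its query, and format the rows."""
    tool = TOOL_SERVICE[stub_llm_select_tool(question, list(TOOL_SERVICE))]
    if not endpoint_meets_requirement(tool["endpoint"]):
        raise ValueError("endpoint does not meet the requirement")
    rows = db.execute(tool["query"]).fetchall()
    if tool["output_format"] == "json":
        return json.dumps([{"name": n, "status": s} for n, s in rows])
    return "\n".join(f"{n}: {s}" for n, s in rows)

print(answer_question("Which devices are down?", make_demo_db()))
```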
LANGUAGE MODEL TOOL CALLING AND EXECUTION PLATFORM
A system for processing client requests in an AI ecosystem is provided. The system may receive a client request from a client application, where the client request is based upon a user request. The system may provide a model request, based upon the client request, to a first model (e.g., an LLM), receive, from the first model, a structured response based upon the model request, and cause execution of tool functions based upon the structured response.
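As a rough illustration of executing tool functions from a structured model response, the sketch below parses a JSON payload and dispatches each call to a registered function. The JSON shape (`tool_calls`, `name`, `arguments`) and the two toy tools are assumptions for the example; the abstract does not specify a format.

```python
import json

# Hypothetical tools; names and behavior are illustrative only.
def get_weather(city):
    return f"Weather for {city}: sunny"

def add(a, b):
    return a + b

TOOLS = {"get_weather": get_weather, "add": add}

def execute_structured_response(structured_response):
    """Parse a JSON structured response and run each requested tool call in order."""
    payload = json.loads(structured_response)
    results = []
    for call in payload.get("tool_calls", []):
        name = call["name"]
        if name not in TOOLS:
            raise KeyError(f"unknown tool: {name}")
        results.append(TOOLS[name](**call.get("arguments", {})))
    return results

# Stand-in for what a model might return for a client request.
model_output = json.dumps({
    "tool_calls": [
        {"name": "add", "arguments": {"a": 2, "b": 3}},
        {"name": "get_weather", "arguments": {"city": "Oslo"}},
    ]
})
print(execute_structured_response(model_output))
```

Rejecting unknown tool names before dispatch keeps the model from invoking anything outside the registry.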
System for Automatically Evaluating the Output of Machine-Learned Models
Provided is a system that automatically evaluates the output of machine-learned models. A computing system receives, from a user computing device, an input query. The computing system processes the input query with a generative model to generate a model output based on the input query. The computing system identifies one or more representative subsequences that correspond to a representation based on the textual response. The computing system generates a plurality of tuple pairs based on the one or more representative subsequences that correspond to a representation and the one or more media elements. For each of the relevant tuple pairs, the computing system processes the respective tuple pair with an entailment-scoring machine-learned model to generate an entailment score for the respective tuple pair. The computing system provides an entailment output for the model output based on the respective entailment scores generated for the one or more relevant tuple pairs.
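The pairing-and-scoring step can be pictured with a small sketch. Everything here is assumed for illustration: media elements are represented by text captions, and a word-overlap function stands in for the entailment-scoring machine-learned model, which in practice would be a trained model.

```python
from itertools import product

def toy_entailment_score(premise, hypothesis):
    """Stand-in scorer: fraction of hypothesis words that also appear in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = hypothesis.lower().split()
    if not hypothesis_words:
        return 0.0
    return sum(w in premise_words for w in hypothesis_words) / len(hypothesis_words)

def evaluate_output(subsequences, media_captions, threshold=0.5):
    """Score every (subsequence, caption) pair and summarize the scores."""
    pairs = list(product(subsequences, media_captions))
    scores = {
        (sub, cap): toy_entailment_score(premise=cap, hypothesis=sub)
        for sub, cap in pairs
    }
    mean_score = sum(scores.values()) / len(scores) if scores else 0.0
    return {"scores": scores, "mean_score": mean_score,
            "entailed": mean_score >= threshold}

result = evaluate_output(
    subsequences=["a red car", "parked outside"],
    media_captions=["a red car parked on the street"],
)
print(result["mean_score"], result["entailed"])
```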
ANSWER ASSISTANCE COMPUTING SYSTEM
Technology is disclosed for programmatically generating answers for a user that are responsive to aspects of a conversation. In one implementation, a conversation record is processed to determine a message embedding of a most recent message received. The message embedding is used to determine a semantically similar question embedding of a conversational snippet from a knowledge base. An answer-generation input instruction for a language model is generated based on the most recent message, the conversational snippet, and an answer-format instruction. The language model is directed to produce an answer output, which is presented via a user interface. An answer-augmentation instruction for the language model is generated based on the answer output, similar messages sent by the user based on string similarity with the answer output, and an augmented-answer format instruction. The language model is directed to produce an augmented-answer output, which is presented via the user interface.
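The retrieval step, matching a message embedding to the nearest question embedding and assembling an answer-generation instruction, might look like the sketch below. The three-dimensional vectors, the knowledge-base entries, and the prompt wording are invented for illustration, and cosine similarity is one assumed choice of similarity measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical knowledge base with precomputed toy question embeddings.
KNOWLEDGE_BASE = [
    {"question": "How do I reset my password?",
     "answer": "Use the reset link on the sign-in page.",
     "embedding": [0.9, 0.1, 0.0]},
    {"question": "Where is my invoice?",
     "answer": "Invoices are listed under Billing.",
     "embedding": [0.0, 0.2, 0.9]},
]

def most_similar_snippet(message_embedding):
    """Return the snippet whose question embedding is closest to the message."""
    return max(KNOWLEDGE_BASE, key=lambda s: cosine(message_embedding, s["embedding"]))

def build_answer_prompt(message, snippet, format_instruction):
    """Combine the message, the retrieved snippet, and a format instruction."""
    return (f"Customer message: {message}\n"
            f"Related question: {snippet['question']}\n"
            f"Related answer: {snippet['answer']}\n"
            f"Format: {format_instruction}")

snippet = most_similar_snippet([0.8, 0.2, 0.1])
print(build_answer_prompt("I forgot my password", snippet, "Reply in two sentences."))
```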
Prompt caching in generative response engines
Disclosed are systems, apparatuses, processes, and computer-readable media for caching prompts for a generative response engine. The present technology includes receiving, by a cloud computing service, a request including a first prompt including a natural language task to perform, wherein the request includes an access key for accessing the cloud computing service; identifying a generative response engine for generating a response to the natural language task based on contents of the natural language task; transmitting the first prompt and a hash to the generative response engine; and receiving the response to the natural language task from the generative response engine, the response including a number of input tokens.
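One way to picture sending a prompt together with a hash is the sketch below, where the hash acts as a cache key on the engine side. The abstract does not say what the hash covers or how the engine uses it, so the SHA-256 digest of the full prompt, the `ToyEngine` class, and the whitespace token count are all assumptions.

```python
import hashlib

def prompt_hash(prompt):
    """SHA-256 hex digest of the prompt text, used here as a cache key."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

class ToyEngine:
    """Stand-in for a generative response engine with a prompt cache."""

    def __init__(self):
        self._token_counts = {}

    def generate(self, prompt, key):
        cache_hit = key in self._token_counts
        if not cache_hit:
            # Crude whitespace tokenization, only to produce an input token count.
            self._token_counts[key] = len(prompt.split())
        return {
            "text": f"(response to a {len(prompt)}-character prompt)",
            "input_tokens": self._token_counts[key],
            "cache_hit": cache_hit,
        }

engine = ToyEngine()
prompt = "Summarize the quarterly report"
first = engine.generate(prompt, prompt_hash(prompt))
second = engine.generate(prompt, prompt_hash(prompt))
print(first["cache_hit"], second["cache_hit"], second["input_tokens"])
```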
Model Based API Mocking
The present technology, roughly described, provides for mocking an application program interface (API) using a large language model (LLM). The present system generates a prompt with API signature information and API desired behavior information. The prompt can include instructions, library functions, examples, executed programs, and a current function invocation, as well as other content. The prompt can be generated and submitted to an LLM to mock an API and generate a response. The response can be audited and the LLM can be fine-tuned to provide improved performance in subsequent calls.
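A prompt of the kind described could be assembled along these lines. The function name, the prompt wording, and the `get_user` API are hypothetical; the sketch only shows how signature, desired behavior, examples, and the current invocation might be combined into one string before submission to a model.

```python
def build_mock_prompt(signature, behavior, examples, invocation):
    """Assemble a prompt asking a model to act as the described API."""
    lines = [
        "You are mocking an API. Reply only with the value the API would return.",
        f"Signature: {signature}",
        f"Desired behavior: {behavior}",
        "Examples:",
    ]
    for call, result in examples:
        lines.append(f"  {call} -> {result}")
    lines.append(f"Current invocation: {invocation}")
    return "\n".join(lines)

prompt = build_mock_prompt(
    signature="get_user(user_id: int) -> dict",
    behavior="Return a user record with 'id' and 'name' fields.",
    examples=[("get_user(1)", '{"id": 1, "name": "Ada"}')],
    invocation="get_user(7)",
)
print(prompt)
```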
SYSTEMS AND METHODS FOR IMPROVED DATA PROCESSING OF COMMUNICATIONS ACROSS COMPUTER NETWORKS USING TRIFURCATED PROMPTS
Systems and methods for improved data processing of communications across computer networks using trifurcated prompts during communication exchanges are described. For example, the system may receive a first inbound communication, wherein the first inbound communication comprises a first text string. The system may determine a first context for the first inbound communication based on the first text string. The system may process the first context in a perturbation model to determine a first perturbed context, wherein the perturbation model determines the first perturbed context by determining a first alternative token for a first token in the first context. The system may determine a first prompt for a first large language model based on the first perturbed context.
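The perturbation step, swapping one token in the context for an alternative before building the prompt, can be sketched as below. The lookup table of alternatives and the prompt template are hypothetical; a real perturbation model would presumably choose alternatives in a learned way rather than from a fixed dictionary.

```python
# Hypothetical table of alternative tokens; stands in for a perturbation model.
ALTERNATIVES = {"refund": "reimbursement", "broken": "defective"}

def perturb_context(context, alternatives=ALTERNATIVES):
    """Replace the first token that has a known alternative; leave the rest unchanged."""
    tokens = context.split()
    for i, token in enumerate(tokens):
        if token.lower() in alternatives:
            tokens[i] = alternatives[token.lower()]
            break
    return " ".join(tokens)

def build_prompt(perturbed_context):
    """Wrap the perturbed context in a simple instruction for a language model."""
    return f"Context: {perturbed_context}\nRespond to the customer."

perturbed = perturb_context("customer wants refund for broken item")
print(build_prompt(perturbed))
```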
AUDIO AND VIDEO TOKENIZATION FOR MULTIMODAL LARGE LANGUAGE MODELS
Systems and methods are described for power-efficient, continuous tokenization and long-context storage of audio and video data for use with multimodal large language models (LLMs). The systems include specialized subsystems configured to receive input signals, generate discrete tokens representing the input, and buffer the tokens for durations ranging from seconds to hours. Upon receiving a trigger to initiate communication with a multimodal LLM, at least a subset of the buffered tokens is transmitted to an inference dispatcher, which determines the distribution of the tokens to one or more inference engines for processing. The architecture supports tokenization and buffering for multiple modalities, including audio, video, image, and text, and enables context-rich, privacy-preserving, and low-latency AI interactions on client devices. By utilizing efficient token-based data encoding and performing the tokenization on low-power hardware, power consumption and bandwidth usage are significantly reduced, thereby allowing seamless, always-on multimodal AI experiences on battery-powered platforms.
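The buffering behavior, keeping recent tokens for a bounded duration and handing over a subset when a trigger arrives, can be illustrated with a time-based buffer. The class name, the retention policy, and the integer tokens are assumptions for this sketch; tokenization itself and the inference dispatcher are out of scope.

```python
from collections import deque

class TokenBuffer:
    """Keeps (timestamp, token) pairs no older than a retention window."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._items = deque()

    def push(self, token, timestamp):
        """Append a token and drop entries older than the retention window."""
        self._items.append((timestamp, token))
        while self._items and timestamp - self._items[0][0] > self.retention_seconds:
            self._items.popleft()

    def snapshot(self, now, last_seconds):
        """Return tokens from the most recent `last_seconds`, e.g. on a trigger."""
        return [tok for ts, tok in self._items if now - ts <= last_seconds]

    def __len__(self):
        return len(self._items)

buffer = TokenBuffer(retention_seconds=10)
for t in range(15):          # one toy token per second; token value equals its time
    buffer.push(token=t, timestamp=t)

print(len(buffer), buffer.snapshot(now=14, last_seconds=3))
```

A deque makes eviction from the old end cheap, which matters when tokens arrive continuously.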
CONVERSATION METHODS, ELECTRONIC DEVICES, STORAGE MEDIA, AND PRODUCTS
The disclosure relates to a conversation method, an electronic device, a storage medium, and a product in the field of computer technology. The conversation method includes: displaying a conversation between a user and a first agent; generating setting information for a second agent to be created based on the conversation; creating the second agent according to the setting information, wherein the second agent is configured to participate in the conversation between the user and the first agent based on the setting information for the second agent; and displaying the conversation among the user, the first agent and the second agent.
INFORMATION PROCESSING APPARATUS, STORAGE MEDIUM, AND INFORMATION PROCESSING METHOD
An information processing apparatus includes a receiving unit configured to receive a prompt from a user, the received prompt being natural language, an acquisition unit configured to acquire, from a cloud service, device information regarding devices corresponding to an organization to which the user belongs, and a display unit configured to display an answer generated by a language model based on the received prompt and the acquired device information on a display section.