SYSTEMS AND METHODS FOR IMPROVED TRAINING OF MACHINE LEARNING MODELS
20230334835 · 2023-10-19
Inventors
CPC classification
G06V10/771
PHYSICS
International classification
Abstract
Systems and methods applicable, for instance, to training machine learning models (MLMs). Training of a multihead classifier MLM can utilize a two term loss function. A first term of the loss function can be used to reward each of the heads for the extent to which it properly predicts labels of the labeled training data instances. A second term of the loss function can reward each of the heads for the extent to which it disagrees with each of the other heads in terms of predicting labels. As such, the heads of the MLM can both predict proper labels for the labeled training data instances and be distinct from one another on the unlabeled instances.
Claims
1. A computer-implemented method, comprising: providing, by a computing system, to a machine learning model, one or more labeled training data instances; receiving, by the computing system, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the labeled training data instances; determining, by the computing system, a first loss function term, wherein the first loss function term rewards each of multiple elements of the machine learning model for the extent to which it properly predicts labels of the labeled training data instances; providing, by the computing system, to the machine learning model, one or more unlabeled training data instances; receiving, by the computing system, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the unlabeled training data instances; and determining, by the computing system, a second loss function term, wherein the second loss function term rewards each of the multiple elements of the machine learning model for the extent to which it disagrees with each of other elements of the machine learning model in predicting labels for a subset of the unlabeled training data instances.
2. The computer-implemented method of claim 1, further comprising: training, by the computing system, the machine learning model using the first loss function term and the second loss function term.
3. The computer-implemented method of claim 1, wherein the multiple elements of the machine learning model are one or more of heads of the machine learning model, or ensemble elements of the machine learning model.
4. The computer-implemented method of claim 1, further comprising: selecting, by the computing system, the subset as a quantity of the unlabeled training data instances for which there is maximal disagreement among elements of the machine learning model.
5. The computer-implemented method of claim 1, wherein said predicted labels comprise class labels, reward labels, or recommendation labels.
6. The computer-implemented method of claim 1, wherein the multiple elements of the machine learning model are trained to predict proper labels for the labeled training data instances and to be distinct on the unlabeled training data instances.
7. The computer-implemented method of claim 1, wherein the machine learning model is one of a transformer encoder-based classifier, a convolutional neural network-based classifier, or an autoencoder-based machine learning model.
8. The computer-implemented method of claim 1, wherein the machine learning model is a critic machine learning model, and wherein the critic machine learning model generates reward output for a second machine learning model.
9. The computer-implemented method of claim 8, wherein the second machine learning model is one of a transformer decoder-based generative machine learning model, a long short-term memory-based generative machine learning model, or a convolutional neural network-based generative machine learning model.
10. The computer-implemented method of claim 1, wherein said training data instances comprise one or more of sentences, images, or images superimposed with text.
11. A system, comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform: providing, to a machine learning model, one or more labeled training data instances; receiving, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the labeled training data instances; determining a first loss function term, wherein the first loss function term rewards each of multiple elements of the machine learning model for the extent to which it properly predicts labels of the labeled training data instances; providing, to the machine learning model, one or more unlabeled training data instances; receiving, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the unlabeled training data instances; and determining a second loss function term, wherein the second loss function term rewards each of the multiple elements of the machine learning model for the extent to which it disagrees with each of other elements of the machine learning model in predicting labels for a subset of the unlabeled training data instances.
12. The system of claim 11, wherein the instructions, when executed by the at least one processor, further cause the system to perform: training the machine learning model using the first loss function term and the second loss function term.
13. The system of claim 11, wherein the instructions, when executed by the at least one processor, further cause the system to perform: selecting the subset as a quantity of the unlabeled training data instances for which there is maximal disagreement among elements of the machine learning model.
14. The system of claim 11, wherein the machine learning model is one of a transformer encoder-based classifier, a convolutional neural network-based classifier, or an autoencoder-based machine learning model.
15. The system of claim 11, wherein the machine learning model is a critic machine learning model, and wherein the critic machine learning model generates reward output for a second machine learning model.
16. A non-transitory computer-readable storage medium including instructions that, when executed by at least one processor of a computing system, cause the computing system to perform a method comprising: providing, to a machine learning model, one or more labeled training data instances; receiving, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the labeled training data instances; determining a first loss function term, wherein the first loss function term rewards each of multiple elements of the machine learning model for the extent to which it properly predicts labels of the labeled training data instances; providing, to the machine learning model, one or more unlabeled training data instances; receiving, from the machine learning model, generated output, wherein the generated output comprises predicted labels for the unlabeled training data instances; and determining a second loss function term, wherein the second loss function term rewards each of the multiple elements of the machine learning model for the extent to which it disagrees with each of other elements of the machine learning model in predicting labels for a subset of the unlabeled training data instances.
17. The non-transitory computer-readable storage medium of claim 16, wherein the instructions, when executed by the at least one processor of the computing system, further cause the computing system to perform: training the machine learning model using the first loss function term and the second loss function term.
18. The non-transitory computer-readable storage medium of claim 16, wherein the instructions, when executed by the at least one processor of the computing system, further cause the computing system to perform: selecting the subset as a quantity of the unlabeled training data instances for which there is maximal disagreement among elements of the machine learning model.
19. The non-transitory computer-readable storage medium of claim 16, wherein the machine learning model is one of a transformer encoder-based classifier, a convolutional neural network-based classifier, or an autoencoder-based machine learning model.
20. The non-transitory computer-readable storage medium of claim 16, wherein the machine learning model is a critic machine learning model, and wherein the critic machine learning model generates reward output for a second machine learning model.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION
[0019] Turning to
[0020] The generative/RL module 105 can, as an example, include a transformer decoder-based generative MLM. As another example, the generative/RL module 105 can include a long short-term memory (LSTM)-based generative MLM. As yet another example, the generative/RL module 105 can include a CNN-based generative MLM. An MLM of the generative/RL module 105 can be an RL-based MLM that takes various actions in pursuit of maximizing reward. As just an illustration, such an RL-based MLM can take the action of selecting news stories in view of a reward indicating whether given selected stories have caused viewing users to be happy.
[0021] The multihead classifier/critic module 103 can act in a critic role in training the generative/RL module 105. As an example, where the generative/RL module 105 includes an MLM that generates sentences, the multihead classifier/critic module 103 can act in a critic role by outputting labels that specify whether or not sentences generated by the generative/RL module 105 are coherent and convincing. As another example, where the generative/RL module 105 includes an MLM that selects news stories, the multihead classifier/critic module 103 can act in a critic role by outputting labels that specify whether or not users who have viewed the selected stories are happy.
[0022] The user access module 107 can allow access to output of the generative/RL module 105 (e.g., generated sentences, generated images, or selected news stories). Further, the training module 109 can train one or more MLMs of the multihead classifier/critic module 103 to generate accurate output labels (e.g., labels indicating classifications) for given inputs, even where such inputs are out-of-distribution (OOD).
[0023] The training of the one or more MLMs of the multihead classifier/critic module 103 by the training module 109 can cause the MLMs to learn multiple functions/algorithms A.sub.0, . . . , A.sub.n that: a) correctly predict labels for labeled instances L of a set of training data; but b) are distinct on unlabeled instances U of the set of training data, insofar as they use different features to generate labels.
[0024] As additional examples, in the case of learned functions/algorithms A.sub.i and A.sub.j regarding reward functions, such distinctness on the unlabeled instances U can regard the function/algorithm A.sub.i and the function/algorithm A.sub.j being distinct on a given member of U: a) if A.sub.i and A.sub.j use different features to generate different reward values for that member of U; and/or b) if A.sub.i and A.sub.j use different features so as to have different optimal actions for that member of U. As still further examples, in the case of learned functions/algorithms A.sub.i and A.sub.j outputting recommender system lists, such distinctness on the unlabeled instances U can regard the function/algorithm A.sub.i and the function/algorithm A.sub.j being distinct on a given member of U: a) if A.sub.i and A.sub.j use different features so as to give different rankings for that member of U; and/or b) if A.sub.i and A.sub.j use different features so as to exhibit differing extents of non-overlap in the rankings that they give for that member of U. In some embodiments, distinctness on the unlabeled instances U can regard a learned function/algorithm A.sub.i and a learned function/algorithm A.sub.j being distinct on a given member of U if the inner products and/or L1 distances of the probability distributions of the functions/algorithms for that member of U differ due to A.sub.i and A.sub.j using different features.
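As a minimal illustrative sketch of the distribution-based measures just noted (the function name is chosen for exposition, and it is assumed that each function/algorithm outputs a per-class probability distribution for a given member of U):

```python
import numpy as np

def distinctness(p_i, p_j):
    """Measure how distinct two learned functions'/algorithms' predicted
    probability distributions are on a single unlabeled instance.

    Returns the inner product (low when the two distributions emphasize
    different classes) and the L1 distance (high when the distributions
    differ)."""
    p_i = np.asarray(p_i, dtype=float)
    p_j = np.asarray(p_j, dtype=float)
    inner = float(np.dot(p_i, p_j))
    l1 = float(np.abs(p_i - p_j).sum())
    return inner, l1

# Two functions that agree: high inner product, zero L1 distance.
inner_same, l1_same = distinctness([0.9, 0.1], [0.9, 0.1])
# Two functions that disagree: low inner product, large L1 distance.
inner_diff, l1_diff = distinctness([0.9, 0.1], [0.1, 0.9])
```

A low inner product and a high L1 distance both indicate that A.sub.i and A.sub.j emphasize different classes for that member of U.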
[0025] Relevantly, distinctness can regard a measure of distinctness between any two learned functions/algorithms A.sub.i and A.sub.j with respect to any single member of U. According to the functionality set forth herein, this framing of distinctness can be utilized to train MLMs in a more effective way via the beneficial approach of bootstrapping ambiguous data (e.g., one or more datapoints determined to be the most ambiguous). With this approach it is recognized that when A.sub.i and A.sub.j differ as to their predictions for a given member of U, such an ambiguous datapoint can be expected to exhibit features that cause the disagreement. As such, the MLM training approaches discussed herein include amplifying (e.g., by stochastic gradient descent (SGD)) the difference on such an ambiguous datapoint, thereby causing A.sub.i and A.sub.j to use different features. Further still, according to various embodiments this amplification (e.g., by gradient descent) can be performed with respect to a subset of one or more U datapoints for which A.sub.i and A.sub.j disagree the most. As such, these embodiments beneficially allow merely a small number of ambiguous datapoints to be used. It is observed that according to the improved MLM training approaches discussed herein, there is no call to make assumptions regarding feature independence. Additionally, by way of the MLM training approaches discussed herein including having each A.sub.i be correct on the labeled data L, training can yield a family of functions/algorithms that are the same on the labeled data L, but use different features on the unlabeled data U.
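The overall two-term objective can likewise be sketched. The following is illustrative only: all names are expository, and the inner product of each head pair's probability distributions is an assumed choice of pointwise agreement measure (low agreement corresponding to strong disagreement). The first term is a cross-entropy over labeled instances for every head; the second sums each pair's agreement over the k unlabeled instances on which that pair already disagrees most, so that minimizing the total amplifies disagreement on the most ambiguous datapoints:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def two_term_loss(heads_labeled, labels, heads_unlabeled, k=1):
    """Two-term loss for a multihead classifier (illustrative sketch).

    heads_labeled:   (n_heads, n_labeled, n_classes) logits on labeled data
    labels:          (n_labeled,) integer class labels
    heads_unlabeled: (n_heads, n_unlabeled, n_classes) logits on unlabeled data
    """
    probs_l = softmax(heads_labeled)
    n_heads, n_labeled, _ = probs_l.shape
    # Term 1: mean cross-entropy of every head on the labeled instances,
    # rewarding each head for properly predicting the labels.
    ce = -np.log(probs_l[:, np.arange(n_labeled), labels] + 1e-12)
    term1 = ce.mean()

    probs_u = softmax(heads_unlabeled)
    term2 = 0.0
    for i in range(n_heads):
        for j in range(i + 1, n_heads):
            # Agreement of heads i and j on each unlabeled instance.
            agree = (probs_u[i] * probs_u[j]).sum(axis=-1)
            # Keep only the k instances with minimal agreement (maximal
            # disagreement); minimizing their summed agreement drives the
            # heads to use different features.
            term2 += np.sort(agree)[:k].sum()
    return term1 + term2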
[0026] Thus, as just an illustration by way of the MLM training approaches discussed herein, considering an example of an MLM that classifies pet images as being either cats or dogs, one function/algorithm can come to use features including spurious background-related features for distinguishing cat images from dog images, while another function/algorithm can come to use non-spurious/core animal-related features.
[0027] Turning to
[0028] Further still, the functionality of
[0029] Additionally taken as input by the functionality of
[0030] In operation, the functionality of
[0031] Then, for 215-219, the functionality of
[0032] For 217-219, the functionality of
[0033] In this way, the functionality of
[0034] Discussed above in connection with 217 has been selecting the data point s.sub.U with the minimal value of l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)). However, other possibilities exist. For example, top-k sampling can be used in selecting the data point whose l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) value is utilized for the second total loss term L2.sub.ij (e.g., randomly selecting a data point from those k datapoints achieving the lowest l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) values). As another example, such top-k sampling can utilize weightings. Further, in various embodiments, multiple data points s.sub.U exhibiting minimal l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) values can be selected, with the second total loss term L2.sub.ij being the sum of l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) over those multiple data points s.sub.U.
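These selection variants can be sketched as follows. This is illustrative only: a low per-instance value is taken (consistent with the discussion above) to indicate strong disagreement between heads i and j, and inverse-value weighting is one assumed choice for the weighted variant:

```python
import numpy as np

def select_ambiguous(l2_values, k=3, weighted=False, rng=None):
    """Select one unlabeled data point index for the second loss term.

    l2_values: per-instance l2(A_i(s_U), A_j(s_U)) values, where a LOW
    value indicates strong disagreement between heads i and j.

    With weighted=False, sample uniformly among the k instances with the
    lowest values (top-k sampling); with weighted=True, weight the draw
    toward lower values."""
    rng = np.random.default_rng(rng)
    l2_values = np.asarray(l2_values, dtype=float)
    top_k = np.argsort(l2_values)[:k]  # the k most ambiguous instances
    if not weighted:
        return int(rng.choice(top_k))
    # Weight inversely to the value: lower l2 -> higher probability.
    w = 1.0 / (l2_values[top_k] + 1e-12)
    return int(rng.choice(top_k, p=w / w.sum()))
```

Summing the l2 values of several selected indices would correspond to the multiple-data-point variant noted above.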
[0035] It is observed that the various discussed approaches for s.sub.U data point selection in connection with determining the value for the second total loss term L2.sub.ij can, from one point of view, be seen as exhibiting the phenomenon of those data points with already low l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) values playing a major role in driving A.sub.i and A.sub.j (e.g., a given pair of heads of the multihead classifier MLM of the classifier/critic module 103) apart, and those exhibiting a high l2(A.sub.i(s.sub.U), A.sub.j(s.sub.U)) value (and therefore having A.sub.i(s.sub.U) and A.sub.j(s.sub.U) close together) being essentially ignored.
[0036] Turning to
[0037] With reference to the approaches for MLM training discussed, for instance, in connection with
[0038] Via the training approaches discussed herein, the first term of the noted loss function can be used to reward each of head 301 and head 303 for the extent to which it properly predicts labels of the labeled training data instances (e.g., of labeled instances 305 and 307). Then, the second term of the loss function can be used to reward each of head 301 and head 303 for the extent to which it disagrees with the other head in terms of predicting labels for a selected subset of the unlabeled training data instances. With reference to that which is discussed above, this selected subset can include unlabeled training data instances for which there is maximal disagreement between heads 301 and 303 (e.g., the subset can include unlabeled instances 309 and 311).
[0039] Through application of the training approaches discussed herein, at least one of the heads 301 and 303 can properly use object-related features for distinguishing images, therefore allowing the multihead classifier MLM to correctly classify the military images. It is observed that the MLM training approaches discussed herein succeed with respect to this example MLM task despite the luminosity-related features being simpler than the object-related features.
[0040] Turning to
[0041] A further training set can, as discussed herein, be generated, the further training set including labeled instances and unlabeled instances. The labeled instances can include happy faces bearing the text “happy” and sad faces bearing the text “sad” selected from the available conventional training set, with the labels correctly specifying whether corresponding images are happy or sad faces. On the other hand, the unlabeled instances can include sad faces bearing the text “happy” and happy faces bearing the text “sad” as well as happy faces bearing the text “happy” and sad faces bearing the text “sad.”
[0042] Along the lines of the example of
[0043] At least one of the heads 401 and 403 can, by virtue of the training approaches discussed herein, properly use facial expression-related features for distinguishing images, therefore allowing the multihead classifier MLM to correctly classify the face images, including images where superimposed text and facial expression do not match (e.g., sad faces having the text “happy”). Similar to the example of
[0044] As an example, the two-head classifier MLM of
[0045] During training, the MLM of the generative/RL module 105 can both take action to select a news story and take action to instruct the software module as to which of “happy” and “sad” to superimpose on the captured image of a user to whom that news story was presented. The generative/RL module 105 can then pass the captured image with superimposed text to the multihead classifier/critic module 103 in order to receive a reward corresponding to its having selected that news story.
[0046] Were the MLM of the multihead classifier/critic module 103 not a multihead classifier according to the approaches discussed herein but rather a conventional single-head classifier, and were it trained according to conventional approaches on an underspecified dataset where all or many of the images depicting happy facial expressions have the text “happy” superimposed thereon, and where all or many of the images depicting sad facial expressions have the text “sad” superimposed thereon, undesirable operation could ensue. In particular, under these circumstances both facial expression-related features and text-related features could seem a valid set of features for distinguishing a happy face from a sad face. Moreover, as the text-related features are simpler than the facial expression-related features, it can be expected that the MLM of the multihead classifier/critic module 103 would come to use the text-related features, according to conventional approaches. As such, according to this conventional functionality the MLM of the multihead classifier/critic module 103 would misclassify when presented with OOD images, therefore generating a happy label when presented with a sad face having the text “happy,” and generating a sad label when presented with a happy face having the text “sad.”
[0047] Further according to this example, according to conventional approaches a wireheading situation could arise. In particular, during training it could become apparent to the MLM of the generative/RL module 105 that taking action to have the software module superimpose “happy” on a captured image was sufficient to receive a high reward for a selected news story. Moreover, as such superimposing action is simpler than taking action to select a news story that leads to actual user happiness, it can be expected that the RL-based MLM of the generative/RL module 105 will come to adopt that superimposing action.
[0048] In contrast, where the MLM of the multihead classifier/critic module 103 is a multihead classifier trained according to the approaches discussed herein, at least one of the heads thereof would properly use facial expression-related features for distinguishing images, therefore allowing the multihead classifier/critic module 103 to correctly classify captured images of users even when inappropriate text was superimposed thereon (e.g., the word “happy” superimposed on an image of a sad user). As such, according to the approaches discussed herein the RL-based MLM of the generative/RL module 105 would, during training, not receive a high reward for taking action to superimpose the word “happy” on the captured image of a sad user. As such, a wireheading situation would not arise according to the approaches discussed herein.
[0049] Now turning to
[0050] According to the example of
[0051] Using the approaches discussed herein a further training set can be generated, the further training set including labeled instances and unlabeled instances. The labeled instances can include coherent/convincing sentences that do not exhibit the noted word repetition, with the corresponding labels indicating this coherency/convincingness. The labeled instances can also include incoherent/non-convincing sentences that exhibit the noted word repetition, with the labels indicating this lack of coherency/convincingness. Further, the unlabeled instances can include sentences that exhibit the noted word repetition yet nevertheless are convincing and coherent, and sentences that fail to exhibit the noted word repetition yet still are neither convincing nor coherent. Further still, the unlabeled instances can include coherent/convincing sentences that do not exhibit the noted word repetition and incoherent/non-convincing sentences that exhibit the noted word repetition.
[0052] Along the lines of the examples of
[0053] At least one of the heads 501 and 503 can, by virtue of the training approaches discussed herein, properly use features broadly and accurately capturing the concept of sentence coherence/convincingness for distinguishing sentences, therefore allowing the multihead classifier MLM to correctly classify the sentences, including: a) sentences that exhibit word repetition yet nevertheless are convincing and coherent; and b) sentences that fail to exhibit word repetition yet still are neither convincing nor coherent. Similar to the example of
[0054] As an example, the two-head classifier MLM of
[0055] Were the MLM of the multihead classifier/critic module 103 not a multihead classifier according to the approaches discussed herein but rather a conventional single-head classifier, and were it trained according to conventional approaches on an underspecified dataset where all or many of the coherent and convincing sentences do not exhibit word repetition, and where all or many of the sentences that are neither convincing nor coherent do exhibit word repetition, undesirable operation could ensue. In particular, under these circumstances both features relating to word repetition and features more broadly and accurately capturing the concept of sentence coherence/convincingness could seem a valid set of features for distinguishing a sentence that exhibited coherence/convincingness from one which did not. Moreover, as the features relating to word repetition are simpler than the features broadly and accurately capturing the concept of sentence coherence/convincingness, it can be expected that the MLM of the multihead classifier/critic module 103 would come to use the word repetition features, according to conventional approaches. As such, according to this conventional functionality the MLM of the multihead classifier/critic module 103 would misclassify when presented with OOD sentences, therefore generating a label indicating coherence/convincingness when presented with an incoherent/non-convincing sentence not exhibiting word repetition, and generating a label indicating lack of coherence/convincingness when presented with a coherent/convincing sentence exhibiting word repetition.
[0056] Further according to this example, according to conventional approaches a wireheading situation could arise. In particular, during training it could become apparent to the MLM of the generative/RL module 105 that taking action to generate sentences that do not exhibit word repetition was sufficient to receive high rewards for sentence generation. Moreover, as such a course of action is simpler than taking action to generate sentences that are actually convincing and coherent (e.g., sentences exhibiting the noted features broadly and accurately capturing the concept of sentence coherence/convincingness), it can be expected that the RL-based MLM of the generative/RL module 105 would adopt this simpler course of action.
[0057] In contrast, where the MLM of the multihead classifier/critic module 103 is a multihead classifier trained according to the approaches discussed herein, at least one of the heads thereof would properly use features broadly and accurately capturing the concept of sentence coherence/convincingness for distinguishing sentences, therefore allowing the multihead classifier/critic module 103 to correctly classify even OOD sentences. As such, according to the approaches discussed herein the RL-based MLM of the generative/RL module 105 would, during training, not receive a high reward for taking action to merely generate sentences that do not exhibit word repetition (as opposed to sentences that are actually convincing and coherent). As such, a wireheading situation would not arise according to the approaches discussed herein.
[0058] Now turning to
[0059] Utilizing the approaches discussed herein a further training set can be generated, the further training set including labeled instances and unlabeled instances. The labeled instances can include hate speech sentences that include both religion-referencing language and hatred-expressing language, with the corresponding labels indicating these sentences to be hate speech sentences. The labeled instances can also include non-hate speech sentences that include neither religion-referencing language nor hatred-expressing language, with the corresponding labels indicating these sentences to be non-hate speech sentences. Further, the unlabeled instances can include sentences that include the noted religion-referencing language yet nevertheless are non-hate speech sentences, and sentences that fail to exhibit the noted religion-referencing language yet still are hate speech sentences. Further still, the unlabeled instances can include sentences that include the noted religion-referencing language and are hate speech sentences, and sentences that do not include the noted religion-referencing language and are non-hate speech sentences.
[0060] Along the lines of the examples of
[0061] By virtue of the training approaches discussed herein, at least one of the heads 601 and 603 can come to properly use features concerning hatred-expressing language for distinguishing sentences. As such, the multihead classifier MLM can come to correctly classify sentences, including: a) sentences that include religion-referencing language yet nevertheless are non-hate speech sentences; and b) sentences that fail to include religion-referencing language yet still are hate speech sentences. The MLM training approaches discussed herein can succeed even where the features concerning religion-referencing language are simpler than the features concerning hatred-expressing language.
[0062] Discussed herein have been training approaches that utilize both labeled instances and unlabeled instances of a set of training data. However, other possibilities exist. For example, the training approaches discussed herein can be implemented in a modified manner that utilizes only labeled instances, without there being call to also use unlabeled instances.
[0063] As an example of such a modified approach, multiple labeled instances (e.g., a batch of labeled instances) can be provided as input to a multihead classifier MLM that has n heads. When a given instance (e.g., an image or a sentence) i of the multiple instances is passed through the classifier, each of the n heads can map the instance i to a classification. In this way, the MLM can generate n classifications for the instance i. The training of the multihead classifier MLM can include consideration of: a) the accuracy of the n classifications; and b) the ambiguity of the n classifications.
[0064] In particular, for a given instance i the accuracy can correspond to the average distance of those n classifications from a label for that instance i according to the training set. The ambiguity for a given instance i can correspond to how much the n classifications differ for that instance i. In this way, the noted operations of passage to the MLM, consideration of accuracy, and consideration of ambiguity can be performed for each of the multiple instances i.
[0065] The training of the multihead classifier MLM can further include selecting: a) the C instances i that yielded the highest accuracy results; and b) the B instances i that yielded the highest ambiguity results. The quantities C and B can be tunable hyperparameters. Further still, the training of the multihead classifier MLM can include: a) utilizing SGD to encourage the n heads to more accurately generate labels for the C selected accurate instances; and b) utilizing SGD to encourage the n heads to more distinctly generate labels for the B selected ambiguous instances.
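The accuracy/ambiguity selection of the labeled-only variant can be sketched as follows. This is illustrative only: names are expository, distance from a one-hot label is an assumed accuracy measure, and average pairwise L1 distance between head distributions is an assumed ambiguity measure:

```python
import numpy as np

def select_for_sgd(head_probs, labels, C=2, B=2):
    """Rank each labeled instance by accuracy and ambiguity across the
    n heads' predicted distributions (illustrative sketch).

    head_probs: (n_heads, n_instances, n_classes) probabilities
    labels:     (n_instances,) integer labels

    Accuracy of instance i: negative average distance of the n
    classifications from the label (higher = more accurate).
    Ambiguity of instance i: summed pairwise L1 distance between the
    heads' distributions (higher = heads disagree more).
    Returns the indices of the C most accurate and B most ambiguous
    instances, to be emphasized by the two SGD objectives."""
    n_heads, n_inst, n_classes = head_probs.shape
    onehot = np.eye(n_classes)[labels]                 # (n_inst, n_classes)
    # Distance of each head's distribution from the one-hot label.
    dist = np.abs(head_probs - onehot).sum(axis=-1)    # (n_heads, n_inst)
    accuracy = -dist.mean(axis=0)
    ambiguity = np.zeros(n_inst)
    for i in range(n_heads):
        for j in range(i + 1, n_heads):
            ambiguity += np.abs(head_probs[i] - head_probs[j]).sum(axis=-1)
    top_C = np.argsort(-accuracy)[:C]
    top_B = np.argsort(-ambiguity)[:B]
    return top_C, top_B
```

SGD would then be applied to sharpen label accuracy on the top_C instances and to amplify head disagreement on the top_B instances.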
[0066] As such, according to the foregoing the training approaches discussed herein can be modified so as to be able to use only labeled instances, without call to also use unlabeled instances.
[0067] As referenced above, approaches discussed herein can be applied in RL contexts. According to various embodiments, implementation thereof can include utilizing loss functions that cause RL reward functions to be distinct with respect to unlabeled data. As just some examples, such loss functions can incorporate criteria such as: a) requiring that reward functions have different optimal actions; b) requiring that reward functions have different value functions; and c) requiring that reward functions be as numerically distinct from one another as possible.
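Criteria a) and c) above can be sketched for a single unlabeled state with a discrete action set; criterion b) is omitted here since comparing value functions would additionally require environment dynamics. Names are expository and the reward-vector representation is an assumption:

```python
import numpy as np

def rewards_distinct(r_i, r_j):
    """Assess two candidate reward functions, given as reward vectors
    over a discrete action set for one unlabeled state.

    Returns whether they have different optimal actions (criterion a)
    and their numerical gap as an L1 distance (criterion c)."""
    r_i = np.asarray(r_i, dtype=float)
    r_j = np.asarray(r_j, dtype=float)
    different_optimal = int(np.argmax(r_i)) != int(np.argmax(r_j))
    numerical_gap = float(np.abs(r_i - r_j).sum())
    return different_optimal, numerical_gap
```

A loss term could then reward head pairs for which `different_optimal` holds, or for which `numerical_gap` is large, on the unlabeled data.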
[0068] Further still, in various embodiments functionality discussed herein can be implemented in a fashion that utilizes reinforcement learning from human feedback (RLHF). According to these embodiments, labeled data as discussed herein can be provided using annotated human-sourced data. Further according to these embodiments, unlabeled data as discussed herein can be provided via new data. As just some examples, this new data can be generated by humans, be generated by MLMs, or be generated using both humans and MLMs. Where there is implementation of the just-discussed approaches of utilizing loss functions that cause RL reward functions to be distinct with respect to unlabeled data, the RL reward functions can be distinct on this new unlabeled data.
[0069] It is noted that the approaches discussed herein have wide applicability, and are not limited to the various examples discussed herein. As just some examples, the approaches discussed herein can be utilized in connection with applications including (but not limited to) chatbots, content moderation, prompt-specification, and action-driven AI.
Hardware and Software
[0070] According to various embodiments, various functionality discussed herein can be performed by and/or with the help of one or more computers. Such a computer can be and/or incorporate, as just some examples, a personal computer, a server, a smartphone, a system-on-a-chip, and/or a microcontroller. Such a computer can, in various embodiments, run Linux, MacOS, Windows, or another operating system.
[0071] Such a computer can also be and/or incorporate one or more processors operatively connected to one or more memory or storage units, wherein the memory or storage may contain data, algorithms, and/or program code, and the processor or processors may execute the program code and/or manipulate the program code, data, and/or algorithms. Shown in
[0072] In accordance with various embodiments of the present invention, a computer may run one or more software modules designed to perform one or more of the above-described operations. Such modules can, for example, be programmed using Python, Java, JavaScript, Swift, C, C++, C#, and/or another language. Corresponding program code can be placed on media such as, for example, DVD, CD-ROM, memory card, and/or floppy disk. It is noted that any indicated division of operations among particular software modules is for purposes of illustration, and that alternate divisions of operation may be employed. Accordingly, any operations indicated as being performed by one software module can instead be performed by a plurality of software modules. Similarly, any operations indicated as being performed by a plurality of modules can instead be performed by a single module. It is noted that operations indicated as being performed by a particular computer can instead be performed by a plurality of computers. It is further noted that, in various embodiments, peer-to-peer and/or grid computing techniques may be employed. It is additionally noted that, in various embodiments, remote communication among software modules may occur. Such remote communication can, for example, involve JavaScript Object Notation-Remote Procedure Call (JSON-RPC), Simple Object Access Protocol (SOAP), Java Messaging Service (JMS), Remote Method Invocation (RMI), Remote Procedure Call (RPC), sockets, and/or pipes.
[0073] Moreover, in various embodiments the functionality discussed herein can be implemented using special-purpose circuitry, such as via one or more integrated circuits, Application Specific Integrated Circuits (ASICs), or Field Programmable Gate Arrays (FPGAs). A Hardware Description Language (HDL) can, in various embodiments, be employed in instantiating the functionality discussed herein. Such an HDL can, as just some examples, be Verilog or Very High Speed Integrated Circuit Hardware Description Language (VHDL). More generally, various embodiments can be implemented using hardwired circuitry with or without software instructions. As such, the functionality discussed herein is limited neither to any specific combination of hardware circuitry and software, nor to any particular source for the instructions executed by the data processing system.