Method of animating messages
09824479 · 2017-11-21
Abstract
The present invention relates to rendering texts in a natural language, namely to manipulating a text in a natural language to generate an image or animation corresponding to this text. The invention is unique mainly in that a sequence of animations is selected, semantically corresponding to a given text. Given a set of animations and a text, the invention makes it possible to compare the sequence of these animations to this text. It is unique in that text templates are used and an optimum sequence of these templates is determined. The idea of the template-based text rendering consists in that the text is manipulated to generate an image or animation with the aid of searching correspondences to a limited number of predefined templates. An animation according to certain style is selected in compliance with each template. Animations are sequentially combined into a single sequence of video images.
Claims
1. A method for animating messages in a natural language, said method comprising the steps of: providing a natural language text message to be animated; dividing said text into sentences; dividing said sentences into words; reducing each word to a normalized form; selecting a sequence of templates for a normalized text; combining the sequences of templates for each sentence in a certain order into a single common sequence of templates; selecting an animation for each template; combining the selected animation files into a resulting clip, wherein said combining the sequences of templates for each sentence in a certain order into a single common sequence of templates comprises the steps of: (a) selecting from a list all templates having all template words included in a sentence; (b) deriving from the list of templates obtained in the previous step (a) information regarding hierarchy of the templates, the template hierarchy level determining a template rank; (c) excluding the templates with a minimum rank from the list of templates obtained in the step (a) if templates belonging to the same hierarchy but having different levels are included in a single set of templates obtained in the step (a); (d) selecting from the list of templates obtained in the step (c), an optimum set of templates for which a target function has the maximum value, wherein the target function is given by:
2. The method of claim 1, wherein words, phrases and symbols are formally replaced with predefined words, phrases or symbols equivalent thereto.
3. The method of claim 1, wherein misprints are corrected in the words.
4. The method of claim 1, wherein neutral background animations for each template are selected.
5. The method of claim 1, further determining whether any sequences of templates corresponding to the normalized text are available in the cache, and selecting same from the cache if available.
6. The method of claim 1, further determining whether an animation style is preset for the text provided and performing either of the following: selecting animations corresponding to the preset style if an animation of each template from the sequence is available in this style; or selecting animations from a randomly selected style for which all animations are available for the selected templates if no style is preset; or, if there are a number of such styles for which all animations are available for the selected templates, selecting animations from any randomly selected one of them; or selecting animations from various styles if no styles are available for which all animations are available for the selected templates.
7. The method of claim 1, further comprising saving in a database an original text of request and the selected sequence of templates; and updating statistics for templates, words, animations, the list of unknown words and statistics for unknown words.
8. The method of claim 1, wherein the maximum value of the target function is searched either by means of an exhaustive search of the sequence of templates (successively going through all combinations of unique templates) or by means of multicriteria optimization.
9. A method for animating messages in a natural language, said method comprising the steps of: providing a natural language text message to be animated; dividing said text into sentences; dividing said sentences into words; reducing each word to a normalized form; selecting a sequence of templates for a normalized text; combining the sequences of templates for each sentence in a certain order into a single common sequence of templates; selecting an animation for each template; combining the selected animation files into a resulting clip, wherein said combining the sequences of templates for each sentence in a certain order into a single common sequence of templates comprises the steps of: finding all templates all words of which are found in the phrase being sought by using the Levenshtein edit-distance algorithm; sorting the found templates by the number of words in ascending order and sequentially determining for each template from this formed set of templates, whether the template is a part of (included in) any other template from the set, and deleting the included templates; in the resulting set of templates, defining the templates non-crossing each other and crossing each other; adding the non-crossing templates to a resulting set; calculating for each template an average value of the template word positions in the text; and sorting the templates by that average value to obtain an optimum sequence of templates, and wherein if no non-crossing templates exist, an optimum set of templates is formed from the crossing templates by carrying out the steps of: (aa) selecting a first template with a maximum weight, said template weight being the number of words in the template; (bb) selecting templates minimally crossing the first template and non-crossing each other; (cc) selecting from the remaining templates those templates that contain >=50% of new words; wherein (d) if in the step (aa) a few templates with a maximum weight are available, selecting as the resulting set of templates the set having a difference between the aggregate rank of templates and the number of word crosses which is larger than same for other sets.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The above features and advantages of the present invention will be more clearly understood from the ensuing description accompanied by the drawings, in which
(2)
(3)
DETAILED DESCRIPTION OF THE INVENTION
(4) The animation selection algorithm is implemented by an animation system 10 (
(5) The web service comprises a means 14 implementing the animation selection and splicing algorithm, which works with an animation database 16. A style in which the animation is to be selected may optionally be set as a parameter of the text message to be animated. The algorithm generates a sequence of bytes, i.e., an assembled binary SWF file. The procedure in the animation system 10 is controlled by an application 18 providing access to the service. In particular, the application performs service access authentication 20, reception 22 of a text animation request, sending 24 of the resulting animation, accumulation 26 of statistics, and content support 28. Statistical recording for analysis and billing is also conducted (in 30).
(6) The algorithm (method) is unique mainly in that a sequence of animations is selected, semantically corresponding to a specified text. Given a set of animations and a text, the algorithm makes it possible to compare the sequence of these animations to this text. It is unique in that text templates are used, an optimum sequence of these templates is determined, and a style concept is used.
Mode of Operation of the Web Service Performing Text Animation
(7) The idea of the template-based text visualization consists in that the text is manipulated to generate an image or animation with the aid of searching correspondences to a limited number of predefined templates. An animation according to certain style is selected in compliance with each template. Animations are sequentially combined into a single sequence of video images. The text animation algorithms are comprised in the web service developed for solving these tasks. All metadata used in the process of animation (templates, animations, styles, etc.) are stored in the database with which the web service operates.
(8) The animation being generated is template-based. Therefore, the main object of the service is to split the text to be animated into a sequence of semantically close templates, to select the respective animation for each template, and to splice the animations into a single clip. The animation may be styled in some way (smiley, Disney, Winnie-the-Pooh, Bidstrup, anti-animation, etc.). For testing and deploying the animation service, it is “wrapped” into a web interface whose main purposes are to generate animations, to administer templates and animations, to accumulate statistics, etc.
(9) In order to determine the structure of the web service, a number of definitions will be introduced.
(10) Style. This is the visual style that the output animation has to satisfy, for example, smiley, cartoons, serious. There is at least one default style.
(11) Templates. Templates are a set of phrases in the form of “I #to love you”, “Hi”, “To kiss you”. Each template is stored as a selection of normalized words. It means that “To kiss you” is stored instead of “I kiss you”. The template is normalized in the step of adding via the web interface. Templates are case-insensitive. The number of interword spaces and word separation by punctuation are ignored.
(12) Group. Groups are preceded by the tag #. A group is a generalization of a number of symbols. For example, #love: love, adore, in love, wild about. Thus, the group #love aggregates the words “adore”, “love” and others. Groups make it possible to define similar templates faster. A template not comprising a group has a higher priority in terms of selecting a template.
(13) Animation. A number of animations correspond to each template. An animation has to be specified with reference to a style. A number of animations may exist in the same style, in which case one of them is selected. For each template established in the system, at least one animation in the “default” style exists unless another style is explicitly established. A template animation is a content file in the SWF format (the format may be either vector or bitmapped). An animation has a “background animation” feature defining whether the animation is a background one. Non-background animations have a transparent background. Some background animations suitable for demonstrating arbitrary foreground animations are neutral backgrounds.
(14) Misprints. Misprints may occur in the text to be transmitted. Misprints are corrected by means of stored “misprinted word => correct word” relationships.
(15) User. A user is the one who makes an animation request by sending a text. It may be a service, web application, etc. The user is entitled to use the service within certain time limits. The user has the authentication data stored in the database. The database also stores the information regarding the time for access availability to the service. A set of allowed animation styles may be associated with the user.
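The definitions above can be illustrated by a minimal data-model sketch. It is not taken from the described system: the names `template_key`, `Animation`, `Template` and `GROUPS`, as well as the file path used in the usage note, are hypothetical. Only the case-insensitivity of templates, the disregard of spacing and punctuation, the “#” group tag and the style/background attributes come from the text; stemming of the words is omitted here.

```python
import re
from dataclasses import dataclass, field


def template_key(phrase: str) -> tuple[str, ...]:
    """Case-insensitive key that ignores spacing and punctuation between words."""
    # '#' is kept so that group tags such as '#love' survive as single tokens.
    return tuple(re.findall(r"#?\w+", phrase.lower()))


@dataclass(frozen=True)
class Animation:
    path: str                 # hypothetical location of the SWF content file
    style: str = "default"    # every template has at least one default-style animation
    background: bool = False  # the "background animation" feature


@dataclass
class Template:
    words: tuple[str, ...]
    animations: list[Animation] = field(default_factory=list)

    @property
    def has_group(self) -> bool:
        # Templates without a group take priority during template selection.
        return any(word.startswith("#") for word in self.words)


# A group aggregates words and phrases under one tag (example from the text).
GROUPS = {"#love": {"love", "adore", "in love", "wild about"}}
```

For instance, `template_key("To  kiss, YOU!")` and `template_key("to kiss you")` yield the same key, so both spellings would refer to one stored template.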
(16) In addition to processing the text animation requests, the web service accumulates statistics (
Text-Compliant Animation Selection Algorithm
(17) The input of the algorithm consists of the text and, optionally, the style in which the respective animations have to be selected. This data is sent to the web service.
(18) 1. The input text is split into sentences. Splitting is performed with the aid of standard punctuation separators (“full stop”, “question mark”, “exclamation mark”, “ellipsis”). Splitting is performed by means of an ordinary search for separators.
(19) 2. A sentence is split into parts based on word separators (splitting is performed by means of a normal search). Separators include “full stop”, “comma”, “space”, “colon”, “dash”, “semicolon”. Space, carriage return, tabulation characters are cut off from the resulting words at their beginning and end.
(20) 3. If all words in the text are English words, transliteration is performed. This is done by means of a symbol-by-symbol comparison of the sequences of English letters to phonetically close sequences of Russian letters (simple substitution table); if the algorithm is used for an English-language or any other content, transliteration may be omitted.
(21) 4. Before subsequent normalization, textual substitutions are made according to the substitution dictionary. It allows the words, word combinations and sentences as well as special characters to be substituted with other characters corresponding to more impressive visualization.
(22) 5. Each word is reduced to a normalized form (stemmer). The selection of a particular stemmer is of no special importance. ALP-supported (automated language processing, http://www.aot.ru/) Lucene.NET (http://www.dotlucene.net/) stemmer may be used as an example for normalizing a Russian-language text;
(23) 6. The words are checked for standard misprints to replace the misprinted words with the correct ones. A general dictionary of the words (normal forms) included in all sets of templates is maintained and stored in the DB. There are two methods of search and replace:
(24) a simple replacement method according to a dictionary of misprints, containing correct word meanings from a general dictionary of standard misprints;
(25) a method of searching a correct word with a minimum Levenshtein distance to the incorrect word being sought for subsequent replacement. Correction of misprints is an optional algorithm step. Correction of misprints may be also performed before the words are normalized.
(26) 7. The words that are part of the groups (of the generalization of synonyms) are replaced with the group names. As a group name, a synonym is usually selected. However, the group name may be arbitrary in general.
(27) 8. It is checked whether the cache contains the sequence of templates corresponding to the normalized misprint-free text obtained in step 6. If so, a sequence of templates is selected from the cache and the process goes to step 11. For the purpose of the present invention, cache is understood as a set of pairs (corrected normalized text; sequence of templates) stored in the random-access memory (with very short access time), rather than in the database;
(28) 9. If the text to be animated is not available in the cache, a sequence of templates is selected for the normalized text using the algorithm for the selection of templates corresponding to the animation.
(29) 10. The resulting sequence of templates is added to the cache by associating it with the text being sought.
(30) 11. Steps 2-10 are repeated for each sentence.
(31) 12. The sequences of templates for each sentence are combined in a certain order into a single common sequence of templates.
(32) 13. If a style is set, animations are selected corresponding to the style provided that for the selected style, an animation of each template from the sequence is available in this style. If no style is set, a style is randomly selected for which all animations are available for the selected templates. If there are a number of such styles, any style is randomly selected. If no styles are available, a selection of animations in various styles is used.
(33) 14. An animation is selected for each template in the selected style. If a single template has more than one corresponding animation in one style, a random animation is selected. Animations may optionally be divided into foreground and background animations. There are a number of options for “splicing” the selected animated clips into a single one based on the sequence of templates:
(34) sequential “splicing” of animated files (clips) into a single one, or
(35) “splicing” of foreground animations at the first or last background animation in the sequence order of “splicing”, or
(36) selecting a random neutral background for displaying foreground animations thereat (background animations are displayed as they are).
(37) All animated clips are scaled according to a maximum duration of clip from the sequence by centering the same (scaling and positioning algorithms may differ). In addition to the resulting clip, sound may be superimposed in the form of an arbitrary audio composition.
(38) 15. The “spliced” video clip is converted into an arbitrary, vector or bit mapped, format. For the initial swf animations, it may be, for example, a swf or MPEG file. The type of the output format of the resulting animated file is of no importance.
(39) 16. The original text of the request and the selected sequence of templates are saved in the database; statistics for templates, words, animations, the list of unknown words and statistics for unknown words are updated.
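The text-preparation steps of the algorithm above (splitting into sentences and words, normalization, correction of misprints and replacement of group members) may be sketched as follows. This is only an illustration under stated assumptions: all function names are hypothetical, `normalize` defaults to plain lowercasing as a stand-in for a real stemmer, and the `max_distance` threshold for the Levenshtein-based correction is an assumption that the text does not specify. Transliteration, the substitution dictionary and the cache are omitted.

```python
import re

SENTENCE_SEPARATORS = r"\.{3}|…|[.?!]"   # ellipsis first, then single marks
WORD_SEPARATORS = r"[.,:;\s-]+"          # full stop, comma, colon, semicolon, space, dash


def split_sentences(text):
    return [p.strip() for p in re.split(SENTENCE_SEPARATORS, text) if p.strip()]


def split_words(sentence):
    return [w.strip() for w in re.split(WORD_SEPARATORS, sentence) if w.strip()]


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, kept to two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def correct_misprint(word, misprints, vocabulary, max_distance=1):
    """Dictionary replacement first, then the nearest vocabulary word."""
    if word in vocabulary:
        return word
    if word in misprints:
        return misprints[word]
    # sorted() makes the choice deterministic when distances tie
    best = min(sorted(vocabulary), key=lambda v: levenshtein(word, v), default=word)
    return best if levenshtein(word, best) <= max_distance else word


def replace_groups(words, groups):
    member_to_group = {m: g for g, members in groups.items() for m in members}
    return [member_to_group.get(w, w) for w in words]


def preprocess(text, misprints, vocabulary, groups, normalize=str.lower):
    """Return one list of prepared words per sentence."""
    result = []
    for sentence in split_sentences(text):
        words = [normalize(w) for w in split_words(sentence)]
        words = [correct_misprint(w, misprints, vocabulary) for w in words]
        result.append(replace_groups(words, groups))
    return result
```

With the misprint dictionary `{"kis": "kiss"}` and the group `{"#love": {"adore", "love"}}`, the text “I adore you! Kis you...” is turned into the two word lists `["i", "#love", "you"]` and `["kiss", "you"]`.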
Algorithm for the Selection of Templates Corresponding to the Animation
(40) The algorithm is used to select animations corresponding to a text.
(41) Algorithm input data: the list of templates (cached upon starting the web service) and the normalized text of the sentence without misprints. The list of templates is stored in the database in the normalized form.
(42) 1. In order to determine the sequence of templates, all templates having all template words included in a sentence are first selected from the list. If a particular animation style is set, the templates are selected from the list of templates having animations in the selected style rather than from the list of all templates.
(43) 2. Information regarding the hierarchy of templates is derived from the list of templates. Template A is said to be higher than template B in the hierarchy, if all template A words are included in template B. The hierarchy level determines the template rank.
(44) 3. If the templates belonging to the same hierarchy but having different levels are included in a single set of templates obtained in step 1, the templates with a minimum rank (a higher level) are deleted from this set.
(45) 4. From the list of templates obtained in step 3, an optimum set of templates is selected, for which a target function has the maximum value. The target function is given by:
(46)
(47) where:
(48) $N_{\text{crosses}}$ is a number of crossings of the template words in the set of templates (for a set of templates “John walked. John ran. Rabbit ran. Ran.” the number is 3),
(49) $N_{\text{coverage}}$ is an aggregate coverage of all words by the templates (the number of all words from a sentence, encountered in the templates),
(50) $N_{\text{rank}}$ is an aggregate rank of all templates from the set,
(51) $N_{\text{pairs in correct sequence}}$ is a number of word pairs of the composite (i.e., consisting of a few words) templates corresponding to the sequence of words in a phrase,
(52) $N_{\text{pairs in incorrect sequence}}$ is a number of word pairs of the composite templates not corresponding to the sequence of words in a phrase, and
(53) $k_1$, $k_2$, $k_3$, $k_4$ are empirically calculated coefficients. They have a value of 0.4, 0.33, 0.4, 0.2, respectively.
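Two of the quantities defined above can be computed with a short sketch. The counting rule for crossings used here (each repeated occurrence of a word across templates counts as one crossing) is an interpretation, chosen because it reproduces the value 3 given in the example; the function names are hypothetical, and templates are assumed to be tuples of normalized words.

```python
from collections import Counter


def count_crosses(templates):
    """Number of word crossings: every occurrence of a word beyond its first
    appearance in the set of templates counts once."""
    counts = Counter(word for template in templates for word in set(template))
    return sum(count - 1 for count in counts.values())


def count_coverage(sentence_words, templates):
    """Number of sentence words that are encountered in at least one template."""
    covered = set().union(*templates)
    return sum(1 for word in sentence_words if word in covered)


example = [("john", "walked"), ("john", "ran"), ("rabbit", "ran"), ("ran",)]
print(count_crosses(example))  # 3, as in the example from the text
```

In the example, “john” occurs in two templates (one crossing) and “ran” in three (two crossings), which gives three crossings in total.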
(54) The value of the function depends on a number of mathematical criteria. The function optimum (its maximum value) may be found in a number of ways:
(55) 1. by numerical techniques of multicriteria optimization;
(56) 2. by means of an exhaustive search of all sets of templates, calculating the target function value for each set.
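The selection steps and the exhaustive-search option may be sketched as follows. This is a minimal illustration, not the described implementation: the function names are hypothetical, and the target function is passed in by the caller because its formula is not reproduced here. The `toy_target` below is only a stand-in used to make the sketch runnable and is not the target function of the method. The hierarchy step is read as deleting every template whose words are all contained in a larger template of the same list.

```python
from itertools import combinations


def candidate_templates(sentence_words, templates):
    """Step 1: keep templates whose words all occur in the sentence."""
    available = set(sentence_words)
    return [t for t in templates if set(t) <= available]


def prune_hierarchy(templates):
    """Steps 2-3: drop every template that is properly contained in another one."""
    return [t for t in templates
            if not any(set(t) < set(other) for other in templates)]


def best_template_set(templates, target):
    """Step 4, exhaustive variant: evaluate `target` on every non-empty subset."""
    best, best_value = (), float("-inf")
    for size in range(1, len(templates) + 1):
        for subset in combinations(templates, size):
            value = target(subset)
            if value > best_value:
                best, best_value = subset, value
    return list(best)


def toy_target(subset):
    """Illustrative stand-in only: distinct words covered minus repeated words."""
    words = [w for t in subset for w in t]
    distinct = len(set(words))
    return distinct - (len(words) - distinct)


sentence = ["i", "kiss", "you"]
stored = [("kiss",), ("kiss", "you"), ("hi",), ("i",)]
pruned = prune_hierarchy(candidate_templates(sentence, stored))
print(best_template_set(pruned, toy_target))
```

The exhaustive search visits every non-empty subset, so its cost grows exponentially with the number of candidate templates; this is one reason why the pruning steps are applied first.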
Simplified Algorithm for Determining the Sequence of Templates Corresponding to the Animation
(57) A simplified and faster version of the algorithm for determining the sequence of templates corresponding to the animation is as follows:
(58) 1. Using the BlockDistance/Levenshtein edit-distance/Jaro-Winkler distance/Damerau-Levenshtein distance algorithm, all templates are identified wherein all words are in the phrase being sought.
(59) 2. The identified templates are sorted in ascending order by the number of words. Sequentially running through this formed set of templates, it is checked whether a template is a part of another template from the set. If so, it is discarded. Thus, a list of templates without enclosures is obtained.
(60) 3. In the resulting set of templates, the templates non-crossing each other and crossing each other are defined.
(61) 4. (Optional) For the crossing templates, the rate of crossing (number of word crosses) with the words from the set of non-crossing templates is determined, together with the number of new words covering the phrase which are not included in this set.
(62) 5. Non-crossing templates are added to the resulting set (these are to be definitely included).
(63) 6. (Optional) If no non-crossing templates exist, the process goes to step 7. From the remaining crossing templates, a set is attempted to be formed, being optimum by the following criteria: maximum aggregate rank, maximum coverage, minimum crosses. This is done as follows:
(64) 6.1 a first template is used with a maximum weight (weight=number of words);
(65) 6.2 templates are then used minimally crossing the first template and not crossing each other;
(66) 6.3 from the remaining templates, those templates are further selected which contain >=50% of new words: templates are sorted by the number of new words and are added one at a time. In doing so, a test for word novelty is carried out considering the templates being added in the process (i.e., if the first template was added with a number of new words being over 50%, the second template may not be added since considering the words of the first added template the number of new words therein will be less).
(67) 6.4 If there are several templates with a maximum rank in step 6.1, then for the set obtained in step 6.3, a value is calculated as follows: aggregate rank of templates minus number of word crosses. Then, the next template with a maximum rank is used and steps 6.2-6.4 are repeated.
(68) 6.5. A set of templates is selected with a maximum value calculated in step 6.4.
(69) 7. The resulting optimum set of templates has to be transformed into a sequence. To this end, an average value of the template word positions in the text is calculated for each template, and templates are sorted by this average value. The obtained sequence is the desired result.
(70) The simplified version of the algorithm may be extended by adding further criteria, from the complete version of the algorithm, for the search of an optimum sequence of templates.
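Steps 1 to 3 and 5 of the simplified algorithm may be sketched as follows. The sketch rests on assumptions: the function names are hypothetical, templates are tuples of normalized words, and word matching defaults to exact equality, with the `same` parameter marking the place where an edit-distance comparison could be substituted. The optional steps 4 and 6 are not covered.

```python
import operator


def find_templates(phrase_words, templates, same=operator.eq):
    """Step 1: templates all of whose words are found in the phrase."""
    return [t for t in templates
            if all(any(same(w, p) for p in phrase_words) for w in t)]


def drop_enclosed(templates):
    """Step 2: sort by word count and discard templates enclosed in a later one."""
    ordered = sorted(templates, key=len)
    return [t for i, t in enumerate(ordered)
            if not any(set(t) <= set(other) for other in ordered[i + 1:])]


def partition(templates):
    """Steps 3 and 5: separate non-crossing templates from crossing ones."""
    non_crossing, crossing = [], []
    for i, t in enumerate(templates):
        shares_word = any(set(t) & set(other)
                          for j, other in enumerate(templates) if j != i)
        (crossing if shares_word else non_crossing).append(t)
    return non_crossing, crossing


phrase = ["i", "kiss", "you", "my", "cutie"]
stored = [("kiss",), ("kiss", "you"), ("my", "cutie"), ("you", "i"), ("hello",)]
kept = drop_enclosed(find_templates(phrase, stored))
print(partition(kept))
```

Here “kiss” is discarded as a part of “kiss you”, the template “my cutie” shares no words with the others and goes directly to the resulting set, while “kiss you” and “you i” cross each other on the word “you”.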
Algorithm for Determining the Sequence of Templates Corresponding to the Animation
(71) In order to determine the sequence of templates, a simple algorithm is used whose input data is the text of a sentence and a set of templates able to form the fullest coverage of the sentence.
(72) 1. For each template, its mean position in the sentence is calculated: an arithmetic mean of the sequential word numbers in the text. If a template covers several identical words, the first one is used (an embodiment is possible wherein an arithmetic mean of all positions is used).
(73) 2. Templates are arranged in ascending order by the mean position in the sentence.
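The two steps above amount to a sort by mean word position, which may be sketched as follows. The function names are hypothetical; positions are counted from zero, which does not affect the resulting order, and for repeated words the first occurrence is used, as in the main embodiment.

```python
def mean_position(template, sentence_words):
    """Arithmetic mean of the positions of the template words in the sentence."""
    # list.index returns the first occurrence, matching the rule for repeated words
    positions = [sentence_words.index(word) for word in template]
    return sum(positions) / len(positions)


def order_templates(templates, sentence_words):
    """Arrange templates in ascending order of their mean position."""
    return sorted(templates, key=lambda t: mean_position(t, sentence_words))


sentence = ["i", "kiss", "you", "my", "cutie"]
chosen = [("my", "cutie"), ("kiss", "you"), ("i",)]
print(order_templates(chosen, sentence))
```

For the sentence shown, the mean positions are 3.5, 1.5 and 0, so the templates are arranged as “i”, “kiss you”, “my cutie”.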
Method of Animation of SMS Messages
(74) Two options are available for the animation of SMS messages.
(75) 1. Draw a message to your friends! A person sends an SMS message to a short number, specifying a subscriber to which the message has to be forwarded. The most appropriate animation closest to the message subject is identified and sent to said subscriber. Both a static image and animation may be sent.
(76) 2. Visualize your wishes! A person sends an SMS message containing the desired object to a short number and receives a funny picture of this object.
(77) Below is the operating sequence of an embodiment illustrated by
(78) 1. A sending subscriber 32 sends an SMS message 34 to a short number such as 9999. In the SMS message 34, he includes a receiving subscriber number (for example, 8916000001) and the message text (such as ‘Kiss you, my cutie’).
(79) 2. The message is routed through a mobile network operator's (MNO) equipment 36 represented by a tower 38 to an SMS Center (SMSC) 39 from where the data is provided to a billing system 40 for mutual settlements with service providers and the message itself is forwarded to a service provision system 42 (the use of an intermediate entity (not shown) is possible).
(80) 3. The service provision system 42 receives (44) the SMS 34 and analyzes (46) the received SMS message by identifying (48) the sender and defining the message text and defining (50) the receiving subscriber number.
(81) 4. The service provision system 42 then makes (52) a request containing the text to the animation system 10.
(82) 5. The animation system generates (selects) animation by selecting the respective data from a particular database in the animation database 16.
(83) 6. Following the animation searching and assembling algorithm 14, the selected animation templates are spliced and sent to the service provision system 42.
(84) 7. The service provision system 42 receives (54) the animation, forms (56) an MMS by packing (56) the received animation into the MMS and sends (58) it via an MMS Center (MMSC) 60 to the receiving subscriber 62.
(85) 8. The service provision system 42 optionally sends (64) an SMS message to the sending subscriber 32 via the SMSC 39, reporting that the animation has been sent, or that sending has failed if no MMS has been sent.
(86) 9. The receiving subscriber 62 receives the MMS containing the animated SMS message on behalf (from the number) of the sending subscriber 32.
(87) 10. The sending subscriber 32 receives an SMS message 66 notifying of sending (optionally).
(88) 11. The service provision system 42 also performs (68) statistical recording for analysis and billing. An owner 70 of the service provision system (mobile network operator, content aggregator, etc.) is supposed to interact with an owner 72 of the animation system on monetization aspects (mutual settlements with regard to animation generation services based on the number of animation service requests or a monthly fee).
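The analysis of the received SMS message in step 3 of the sequence above may be sketched as follows. The layout of the message body is not specified in the description; the sketch assumes, purely for illustration, that the receiving subscriber number comes first and is separated from the message text by a space or a punctuation mark. The function name `parse_sms` and the accepted number length are hypothetical.

```python
import re

# Assumed layout: "<receiving number><separator><message text>"
SMS_PATTERN = re.compile(r"\s*(\+?\d{5,15})[\s,:;]+(.+)", re.S)


def parse_sms(body):
    """Return (receiving_number, message_text), or None if the body does not
    follow the assumed layout."""
    match = SMS_PATTERN.match(body)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()


print(parse_sms("8916000001 Kiss you, my cutie"))
```

A body without a leading number yields `None`, in which case the service provision system could, for example, report a failure to the sending subscriber.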
(89) The operating sequence of an embodiment wherein wishes are visualized is somewhat different although the flow chart remains very similar for the participants in the process with the only exception that the receiving subscriber coincides with the sending subscriber.
(90) 1. The sending subscriber 32 sends an SMS message to a short number. He specifies the desired object in the SMS message.
(91) 2. The message is routed through the mobile network operator's (MNO) equipment 38 to the SMS Center (SMSC) 39 from where the data is provided to the billing system 40 and the message itself is forwarded to the service provision system 42.
(92) 3. The service provision system 42 analyzes the received SMS message by identifying the sender, defining the object to be visualized and forwarding its textual description to the animation system 10.
(93) 4. The animation system 10 selects a picture/animation by selecting the respective data from the animation database 16 and sends the same to the service provision system 42.
(94) 5. The selected animation is formed into an MMS and sent via the MMS Center (MMSC) 60 to the subscriber 32.
(95) 6. The subscriber 32 receives the MMS containing the animated SMS message.
(96) The accompanying drawings illustrate the architecture, functionality and operation of the possible implementations of the systems and methods according to various embodiments of the present invention. Accordingly, each block may comprise a module, a segment or a portion of a code which contains one or more executable commands for implementing a certain logical function(s). It should be also noted that in some alternative embodiments, the functions designated in the block may be performed in an order other than that shown in the drawings. For example, two blocks shown in series may actually be executed in a reverse order depending on the enabled functionality. It should be also noted that each illustrated block and combinations of blocks in the drawings may be implemented either as specialized hardware-based systems and modules which perform specialized functions or actions, or as combinations of the specialized hardware and computer commands.
(97) The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the invention to the particularly provided example. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, “includes” and/or “including” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
(98) While particular embodiments of the invention have been illustrated and described above, it will be apparent to those of ordinary skill in the art that any configuration designed for achieving the same object may be presented instead of the particularly shown embodiments and that the invention has other applications in other environments. The present disclosure is intended to encompass any adaptations or variations of the present invention. The following claims are by no means intended for limiting the scope of invention to particular embodiments described herein.