Music production using recorded hums and taps
10431192 · 2019-10-01
Assignee
Inventors
Cpc classification
G10H1/0025
PHYSICS
G10H2230/015
PHYSICS
G10H2210/086
PHYSICS
International classification
Abstract
Embodiments of the present invention provide for the composition of new music based on analysis of unprocessed audio, which may be in the form of melodic hums and rhythmic taps. As a result of this analysis (music information retrieval, or MIR), musical features such as pitch and tempo are output. These musical features are then used by a composition engine to generate a new and socially co-created piece of content represented as an abstraction. This abstraction is then used by a production engine to produce audio files that may be played back, shared, or further manipulated.
Claims
1. A method for producing music based on unprocessed audio, the method comprising: receiving a musical blueprint input file reflective of melodic hums and rhythmic taps recorded in an audible analog domain at a microphone of a user device and converted to a digital domain; identifying a melody in a symbolic layer associated with the musical blueprint input file, wherein the identified melody is relative to one or more identified points within the musical blueprint input file; rendering music via instrumentation for one or more instruments based on the identified melody; and mixing the instrumentation for the one or more instruments, wherein a final mix track file is generated.
2. The method of claim 1, wherein the symbolic layer comprises one or more encoded tuples each representing extracted musical elements.
3. The method of claim 1, wherein the musical blueprint input file further comprises an abstraction layer.
4. The method of claim 1, further comprising correlating the symbolic layer to an arrangement model comprising a dictionary of musical style functions.
5. The method of claim 4, wherein correlating the symbolic layer to an arrangement model comprises applying at least one feature of the arrangement model, wherein the at least one feature is selected from chord progression, instrumentation, eastern tonality, and western tonality.
6. The method of claim 1, further comprising aligning the melodic hums and rhythmic taps relative to the identified points within the musical blueprint input file.
7. The method of claim 1, further comprising generating a map of the one or more identified points within the musical blueprint input file.
8. The method of claim 1, further comprising applying at least one correction or normalization of the musical blueprint input file prior to rendering.
9. The method of claim 1, further comprising transferring the final mix track file to a data storage device.
10. A system for producing music based on unprocessed audio, the system comprising: a user device comprising a microphone that records melodic hums and rhythmic taps in an audible analog domain; and a server that converts the recorded melodic hums and rhythmic taps to a musical blueprint input file in a digital domain; identifies a melody in a symbolic layer associated with the musical blueprint input file, wherein the identified melody is relative to one or more identified points within the musical blueprint input file; renders music via instrumentation for one or more instruments based on the identified melody; and mixes the instrumentation for the one or more instruments, wherein a final mix track file is generated.
11. The system of claim 10, wherein the symbolic layer comprises one or more encoded tuples each representing extracted musical elements.
12. The system of claim 10, wherein the musical blueprint input file further comprises an abstraction layer.
13. The system of claim 10, wherein the server further correlates the symbolic layer to an arrangement model comprising a dictionary of musical style functions.
14. The system of claim 13, wherein the server correlates the symbolic layer to an arrangement model by applying at least one feature of the arrangement model, wherein the at least one feature is selected from chord progression, instrumentation, eastern tonality, and western tonality.
15. The system of claim 10, wherein the server further aligns the melodic hums and rhythmic taps relative to the identified points within the musical blueprint input file.
16. The system of claim 10, wherein the server further generates a map of the one or more identified points within the musical blueprint input file.
17. The system of claim 10, wherein the server further applies at least one correction or normalization of the musical blueprint input file prior to rendering.
18. The system of claim 10, wherein the server further transfers the final mix track file to a data storage device.
19. A non-transitory computer-readable storage medium, having embodied thereon a program executable by a processor to perform a method for producing music based on unprocessed audio, the method comprising: receiving a musical blueprint input file reflective of melodic hums and rhythmic taps recorded in an audible analog domain at a microphone of a user device and converted to a digital domain; identifying a melody in a symbolic layer associated with the musical blueprint input file, wherein the identified melody is relative to one or more identified points within the musical blueprint input file; rendering music via instrumentation for one or more instruments based on the identified melody; and mixing the instrumentation for the one or more instruments, wherein a final mix track file is generated.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
DETAILED DESCRIPTION
(4) Embodiments of the present invention provide for the composition of new music based on analysis of unprocessed audio, which may be in the form of melodic hums and rhythmic taps. As a result of this analysis (music information retrieval, or MIR), musical features such as pitch and tempo are output. These musical features are then used by a composition engine to generate a new and socially co-created piece of content represented as an abstraction. This abstraction is then used by a production engine to produce audio files that may be played back, shared, or further manipulated.
(5)
(6) For example, hardware device 100 may be utilized to implement musical information retrieval. Hardware device 100 might also be used for composition and production. Composition, production, and rendering may occur on a separate hardware device 100 or could be implemented as a part of a single device 100.
(7) Hardware device 100 as illustrated in
(8) The aforementioned components of
(9) Mass storage 130 may be implemented as tape libraries, RAID systems, hard disk drives, solid-state drives, magnetic tape drives, optical disk drives, and magneto-optical disc drives. Mass storage 130 is non-volatile in nature such that it does not lose its contents should power be discontinued. As noted above, mass storage 130 is non-transitory in nature although the data and information maintained in mass storage 130 may be received or transmitted utilizing various transitory methodologies. Information and data maintained in mass storage 130 may be utilized by processor 110 or generated as a result of a processing operation by processor 110. Mass storage 130 may store various software components necessary for implementing one or more embodiments of the present invention by loading various modules, instructions, or other data components into memory 120.
(10) Portable storage 140 is inclusive of any non-volatile storage device that may be introduced to and removed from hardware device 100. Such introduction may occur through one or more communications ports, including but not limited to serial, USB, FireWire, Thunderbolt, or Lightning. While portable storage 140 serves a similar purpose as mass storage 130, mass storage device 130 is envisioned as being a permanent or near-permanent component of the device 100 and not intended for regular removal. Like mass storage device 130, portable storage device 140 may allow for the introduction of various modules, instructions, or other data components into memory 120.
(11) Input devices 150 provide one or more portions of a user interface and are inclusive of keyboards and pointing devices such as a mouse, trackball, stylus, or other directional control mechanism. Various virtual reality or augmented reality devices may likewise serve as input devices 150. Input devices may be communicatively coupled to the hardware device 100 utilizing one or more of the exemplary communications ports described above in the context of portable storage 140.
(12)
(13) Display system 170 is any output device for presentation of information in visual or occasionally tactile form (e.g., for those with visual impairments). Display devices include but are not limited to plasma display panels (PDPs), liquid crystal displays (LCDs), and organic light-emitting diode displays (OLEDs). Other display systems 170 may include surface conduction electron emitters (SEDs), laser TV, carbon nanotubes, quantum dot displays, and interferometric modulator displays (IMODs). Display system 170 may likewise encompass virtual or augmented reality devices.
(14) Peripherals 180 are inclusive of the universe of computer support devices that add functionality to hardware device 100 and are not otherwise specifically addressed above. For example, peripheral device 180 may include a modem, wireless router, or other network interface controller. Other types of peripherals 180 might include webcams, image scanners, or microphones, although the foregoing might in some instances be considered an input device.
(15) Prior to undertaking the steps discussed in
(16) The aforementioned music retrieval operation involves receiving a melodic or rhythmic contribution at a microphone or other audio receiving device and transmitting that information to a computing device like hardware device 100 of
(17) Upon receipt of the melodic musical contribution, hardware device 100 executes software to extract various elements of musical information from the melodic utterance. This information might include, but is not limited to, pitch, duration, velocity, volume, onsets and offsets, beat, and timbre. The extracted information is encoded into a symbolic layer.
(18) Music information retrieval may operate in a similar fashion with respect to receipt of a tap or other rhythmic contribution at a microphone or audio receiving device operating in conjunction with a client application that provides for the transmission of information to a computing device like hardware device 100 of
(19) Extracted musical information is reflected as a tuple in the symbolic layer. Tuples are ordered lists of elements, with an n-tuple representing a sequence of n elements and n being a non-negative integer, as used in relation to the semantic web. Tuples are usually written by listing elements within parentheses and separated by commas (e.g., (2, 7, 4, 1, 7)).
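By way of illustration only, one extracted note might be encoded as a tuple along the following lines. The field names and their order (pitch, onset, duration, velocity) are assumptions made for this sketch; the description does not specify the tuple layout.

```python
from typing import NamedTuple


class NoteEvent(NamedTuple):
    """Hypothetical 4-tuple for one extracted note (the layout is an assumption)."""
    pitch: int        # MIDI-style note number, e.g. 60 = middle C
    onset: float      # start time in seconds
    duration: float   # length in seconds
    velocity: int     # loudness on a 0-127 scale


# A short hummed phrase as an ordered list of tuples (a toy symbolic layer).
symbolic_layer = [
    NoteEvent(60, 0.00, 0.45, 80),
    NoteEvent(62, 0.50, 0.40, 76),
    NoteEvent(64, 1.00, 0.90, 90),
]

# A NamedTuple is still a plain tuple, so it can be written in the
# parenthesized, comma-separated form described above.
first_as_tuple = tuple(symbolic_layer[0])
```

Named fields keep each element readable while preserving the ordered, fixed-length character of an n-tuple.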
(20) By encoding extracted musical information into the symbolic layer, audio information may be flexibly manipulated as it transitions from the audible analog domain to the digital data domain and back as a newly composed, produced, and rendered piece of musical content. The symbolic layer is MIDI-like in nature in that MIDI (Musical Instrument Digital Interface) allows for electronic musical instruments and computing devices to communicate with one another by using event messages to specify notation, pitch, and velocity; control parameters corresponding to volume and vibrato; and clock signals that synchronize tempo.
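As a hedged example of how a detected pitch could be carried into a MIDI-like representation, the standard equal-temperament relation between frequency and MIDI note number (A4 = 440 Hz = note 69) may be used. The function name below is hypothetical, and the description does not state that this particular mapping is the one employed.

```python
import math


def frequency_to_midi(freq_hz: float) -> int:
    """Map a frequency in Hz to the nearest MIDI note number.

    Uses the conventional relation n = 69 + 12 * log2(f / 440).
    """
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / 440.0))


a4 = frequency_to_midi(440.0)        # concert A
middle_c = frequency_to_midi(261.63)  # approximately middle C
```

Rounding to the nearest integer discards the slight detuning typical of a hummed note; an implementation that needs that detail could keep the fractional part instead.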
(21) The symbolic layer operates as sheet music. Through use of this symbolic layer, other software modules and processing routines, including those operating as a part of a composition engine, are able to utilize retrieved musical information for the purpose of applying compositional grammar rules. These rules operate to filter and adjust the musical contributions and corresponding features to deduce intent in a manner similar to natural language processing. An end result of the execution of the composition engine against the extracted feature data is a musical blueprint.
(22)
(23) Prior to validation, at step 215, an arrangement model may be referenced to correlate the symbolic layer to a dictionary of functions for various musical styles. This may include various aspects of chord progression, instrumentation, eastern versus western tonality, and other information that will drive, constrain, or otherwise influence the building of the musical blueprint, especially during the derivation of intent operation at step 230. Various fundamentals of music theory are introduced during this operation.
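One possible reading of a dictionary of musical style functions is a mapping from a style name to a function that supplies style-specific material. The sketch below is an assumption for illustration: the style names, the function names, and the choice of chord progressions as the returned material are all hypothetical.

```python
from typing import Callable, Dict, List


def pop_style() -> List[str]:
    """Hypothetical style function: a common four-chord progression."""
    return ["I", "V", "vi", "IV"]


def blues_style() -> List[str]:
    """Hypothetical style function: one variant of a twelve-bar blues."""
    return ["I", "I", "I", "I", "IV", "IV", "I", "I", "V", "IV", "I", "V"]


# The arrangement model modeled as a dictionary of style functions.
ARRANGEMENT_MODEL: Dict[str, Callable[[], List[str]]] = {
    "pop": pop_style,
    "blues": blues_style,
}


def progression_for(style: str) -> List[str]:
    """Look up a style and call its function; unknown styles are rejected."""
    if style not in ARRANGEMENT_MODEL:
        raise ValueError(f"unknown style: {style!r}")
    return ARRANGEMENT_MODEL[style]()
```

Keeping functions rather than fixed lists in the dictionary leaves room for a style to compute its output from the symbolic layer, although this sketch does not do so.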
(24) Abstraction layer information is validated at step 220 to determine whether the content falls within a reasonable range or otherwise meets basic musical assertions. For example, melodic data or rhythmic data could be presented as pure white noise and might generate some extractable features. That small subset of features would not, however, likely meet a basic definition of a musical contribution. If validation evidences that the symbolic layer is not indicative of musical content, then the composition engine will not attempt to further process and develop a musical blueprint for the same. If the symbolic layer meets some basic assertions associated with musical content, then the composition operation continues.
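A minimal sketch of such a validation check follows. The specific assertions (a minimum note count, a plausible pitch range, and positive durations), the thresholds, and the (pitch, onset, duration) tuple layout are all assumptions chosen for illustration; the description does not enumerate the assertions actually applied.

```python
from typing import List, Tuple

# Assumed note layout: (MIDI pitch, onset in seconds, duration in seconds).
Note = Tuple[int, float, float]


def looks_musical(notes: List[Note],
                  min_notes: int = 3,
                  low_pitch: int = 36,
                  high_pitch: int = 96) -> bool:
    """Return True only if the notes pass some basic, assumed sanity checks."""
    if len(notes) < min_notes:
        return False  # too few events to call a musical contribution
    for pitch, onset, duration in notes:
        if not low_pitch <= pitch <= high_pitch:
            return False  # outside a plausible hummed or sung range
        if onset < 0 or duration <= 0:
            return False  # malformed timing
    return True


phrase = [(60, 0.0, 0.5), (62, 0.5, 0.5), (64, 1.0, 0.5)]
phrase_ok = looks_musical(phrase)
```

A contribution that fails the check would simply not be passed on for further composition.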
(25) At step 230, an effort is made to derive the intent of the musical contribution and, more specifically, its extracted musical features as represented in the symbolic layer. Deriving the intent of the music generally means deriving the intended melodies and rhythms from extracted features in the MIR data and, potentially, data in a user profile (e.g., previously indicated preferences or affirmatively derived preferences). To identify the intent and prepare the symbolic layer for further production, a quantization process takes raw data and intelligently maps the same into a hierarchical structure of music. The preparation step further involves identification of empirical points in the extracted features, for example, those having the most metrical weight.
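The simplest form of quantization snaps each raw onset time to the nearest position on a fixed rhythmic grid. The sketch below shows only that basic form, assuming a known tempo and a sixteenth-note grid; the function name and parameters are hypothetical, and the quantization described above may be considerably more involved.

```python
from typing import List


def quantize_onsets(onsets_sec: List[float],
                    tempo_bpm: float = 120.0,
                    steps_per_beat: int = 4) -> List[float]:
    """Snap onset times to the nearest grid position.

    With four steps per beat, the grid is a sixteenth-note grid.
    """
    step = 60.0 / tempo_bpm / steps_per_beat  # grid spacing in seconds
    return [round(t / step) * step for t in onsets_sec]


# Slightly early and late onsets at 120 BPM (grid spacing 0.125 s).
quantized = quantize_onsets([0.02, 0.49, 1.03])
```

In this example the three loosely timed onsets land on 0.0, 0.5, and 1.0 seconds, that is, on successive beats.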
(26) At step 240, a seamless loop point is identified in the input file representing the symbolic layer. This loop point is used as a reference point for identifying the likes of chord progressions at step 250. At step 260, the melody is also reduced to a fundamental skeletal melody based on the likes of harmonic tendencies and calculation of chord progressions. Skeletal melodies are representative of certain activity at, above, or below an emphasized point. The skeletal melody identification process is dynamic and based on runtime input.
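For illustration, a melody can be thinned to its emphasized points by keeping only notes that fall on metrically strong grid positions. This is a simplified stand-in and not the harmonic analysis described above: the weighting scheme, the 4/4 bar of sixteen steps, and the (pitch, step) note layout are all assumptions.

```python
from typing import List, Tuple

STEPS_PER_BAR = 16  # assumed: one 4/4 bar on a sixteenth-note grid


def metrical_weight(step: int) -> int:
    """Weight of a grid position: 4 on the downbeat, down to 0 off the grid's eighths."""
    pos = step % STEPS_PER_BAR
    if pos == 0:
        return 4
    weight = 0
    while pos % 2 == 0:  # each factor of two is one level of metrical strength
        pos //= 2
        weight += 1
    return weight


def skeletal_melody(notes: List[Tuple[int, int]],
                    min_weight: int = 2) -> List[Tuple[int, int]]:
    """Keep only (pitch, step) notes at or above the given metrical weight."""
    return [(p, s) for p, s in notes if metrical_weight(s) >= min_weight]


melody = [(60, 0), (62, 2), (64, 4), (65, 7), (67, 8), (69, 13), (72, 16)]
skeleton = skeletal_melody(melody)
```

With the default threshold, only the notes on quarter-note beats survive, which gives a rough outline of the phrase.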
(27) Rhythmic patterns are introduced at step 270 on the basis of extracted feature data for taps or rhythmic musical contributions. Adjustments are made at step 280 to align hums and taps (melody and rhythm), which may involve various timing information including but not limited to the aforementioned loop point. Step 290 involves the application of supporting chords and bass as might be appropriate in light of a particular musical style or genre.
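One simple way to picture the alignment of hums and taps is to pull each hummed onset onto the nearest tap when the two are close enough in time. The sketch below assumes that approach; the function name and the 0.1 second tolerance are hypothetical, and the description does not state how the adjustment at step 280 is actually computed.

```python
from bisect import bisect_left
from typing import List


def snap_to_taps(hum_onsets: List[float],
                 tap_times: List[float],
                 tolerance: float = 0.1) -> List[float]:
    """Move each hum onset to the nearest tap if within the tolerance (seconds)."""
    taps = sorted(tap_times)
    aligned = []
    for t in hum_onsets:
        i = bisect_left(taps, t)
        # Only the taps just before and just after t can be the nearest.
        candidates = taps[max(i - 1, 0): i + 1]
        nearest = min(candidates, key=lambda tap: abs(tap - t), default=None)
        if nearest is not None and abs(nearest - t) <= tolerance:
            aligned.append(nearest)
        else:
            aligned.append(t)  # no tap close enough, leave the onset alone
    return aligned


taps = [0.0, 0.5, 1.0, 1.5]
aligned_hums = snap_to_taps([0.04, 0.47, 0.8, 1.52], taps)
```

Onsets that fall well between taps, such as the one at 0.8 seconds here, are left untouched rather than forced onto a beat.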
(28) Corrections and normalization occur at step 295 before the completed blueprint is delivered for production and rendering as discussed in the context of
(29)
(30) The production process of
(31) The foregoing detailed description has been presented for purposes of illustration and description. The foregoing description is not intended to be exhaustive or to limit the present invention to the precise form disclosed. Many modifications and variations of the present invention are possible in light of the above description. The embodiments described were chosen in order to best explain the principles of the invention and its practical application to allow others of ordinary skill in the art to best make and use the same. The specific scope of the invention shall be limited by the claims appended hereto.