Sound control device, sound control method, and sound control program
10504502 · 2019-12-10
Cpc classification
G10H2220/005
PHYSICS
G10H2220/285
PHYSICS
G10H7/008
PHYSICS
G10L13/08
PHYSICS
G10L13/04
PHYSICS
G10L13/027
PHYSICS
G10L13/06
PHYSICS
International classification
G10L13/06
PHYSICS
G10L13/08
PHYSICS
G10L13/027
PHYSICS
Abstract
A sound control device includes: a detection unit that detects a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; and a control unit that causes output of a second sound to be started, in response to the second operation being detected. The control unit causes output of a first sound to be started before causing the output of the second sound to be started, in response to the first operation being detected.
Claims
1. A sound control device comprising: a storage unit that stores syllable information about a syllable that: in a case where the syllable is composed of only a vowel sound, starts with the vowel sound; and in a case where the syllable is composed of a consonant sound and the vowel sound, starts with the consonant sound and continues with the vowel sound after the consonant sound; a detection unit that detects a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; and a control unit that: causes output of the consonant sound of the syllable to be started in response to the first operation being detected; causes output of the vowel sound of the syllable to be started, after causing the output of the consonant sound of the syllable to be started, in response to the second operation being detected; reads the syllable information from the storage unit and determines whether the syllable starts with the consonant sound or the vowel sound; determines that the consonant sound is to be output in a case where the control unit determines that the syllable starts with the consonant sound; and determines that the consonant sound is to be not output in a case where the control unit determines that the syllable starts with the vowel sound.
2. The sound control device according to claim 1, wherein: the operator accepts push-in by a user, the detection unit detects: as the first operation, that the operator has been pushed in by a first distance from a reference position; and as the second operation, that the operator has been pushed in by a second distance from the reference position, the second distance being longer than the first distance.
3. The sound control device according to claim 1, wherein: the detection unit comprises first and second sensors provided in the operator, the first sensor detects the first operation, and the second sensor detects the second operation.
4. The sound control device according to claim 1, wherein the operator comprises a keyboard that accepts the first and second operations.
5. The sound control device according to claim 1, wherein the operator comprises a touch panel that accepts the first and second operations.
6. The sound control device according to claim 1, wherein: the operator is associated with a pitch, and the control unit causes the consonant and vowel sounds of the syllable to be output at the pitch.
7. The sound control device according to claim 1, wherein: the operator comprises a plurality of operators associated with a plurality of mutually different pitches, respectively, the detection unit detects the first and second operations on an arbitrary one operator among the plurality of operators, and the control unit causes the consonant and vowel sounds of the syllable to be output at a pitch associated with the one operator.
8. The sound control device according to claim 1, wherein the control unit controls a timing at which output of the consonant sound of the syllable is started according to a type of the consonant sound.
9. The sound control device according to claim 1, wherein the control unit, in a case where the syllable is composed of the consonant sound and the vowel sound: causes the consonant sound of the syllable to be output; and causes the vowel sound of the syllable to be output.
10. The sound control device according to claim 1, wherein, in a case where the syllable is composed of the consonant sound and the vowel sound: the vowel sound of the syllable follows the consonant sound of the syllable, and the vowel sound of the syllable comprises a speech element corresponding to a change from the consonant sound to the vowel sound.
11. The sound control device according to claim 10, wherein, in a case where the syllable is composed of the consonant sound and the vowel sound, the vowel sound of the syllable further comprises a speech element corresponding to continuation of the vowel sound.
12. The sound control device according to claim 1, wherein the syllable is a single character or a single Japanese kana.
13. A sound control method comprising: storing, in a storage unit, syllable information about a syllable that: in a case where the syllable is composed only of a vowel sound, starts with the vowel sound; and in a case where the syllable is composed of a consonant sound and the vowel sound, starts with the consonant sound and continues with the vowel sound after the consonant sound; detecting a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; causing output of the consonant sound of the syllable to be started in response to the first operation being detected; causing output of the vowel sound of the syllable to be started, after causing the output of the consonant sound of the syllable to be started, in response to the second operation being detected; reading the syllable information from the storage unit and determining whether the syllable starts with the consonant sound or the vowel sound; determining that the consonant sound is to be output in a case where the syllable is determined to start with the consonant sound; and determining that the consonant sound is to be not output in a case where the syllable is determined to start with the vowel sound.
14. A non-transitory computer-readable recording medium storing a program executable by a computer to execute a method comprising: storing, in a storage unit, syllable information about a syllable that: in a case where the syllable is composed only of a vowel sound, starts with the vowel sound; and in a case where the syllable is composed of a consonant sound and the vowel sound, starts with the consonant sound and continues with the vowel sound after the consonant sound; detecting a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; causing output of the consonant sound of the syllable to be started in response to the first operation being detected; causing output of the vowel sound of the syllable to be started, after causing the output of the consonant sound of the syllable to be started, in response to the second operation being detected; reading the syllable information from the storage unit and determining whether the syllable starts with the consonant sound or the vowel sound; determining that the consonant sound is to be output in a case where the syllable is determined to start with the consonant sound; and determining that the consonant sound is to be not output in a case where the syllable is determined to start with the vowel sound.
15. A sound control device comprising: a storage unit that stores a syllable information table in which a type of a consonant sound and a timing at which output of the consonant sound is started are associated, wherein the consonant sound and a vowel sound constitute a single syllable; a detection unit that detects a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; a control unit that: causes output of the consonant sound of the single syllable to be started in response to the first operation being detected; causes output of the vowel sound of the single syllable to be started, after causing the output of the consonant sound of the single syllable to be started, in response to the second operation being detected; reads the syllable information table from the storage unit; acquires the timing associated with the type of the consonant sound of the single syllable by referring to the read syllable information table; and causes output of the consonant sound of the single syllable to be started at the acquired timing.
16. The sound control device according to claim 15, wherein: the storage unit further stores syllable information about the single syllable, the single syllable starts with the consonant sound and continues with the vowel sound after the consonant sound, the control unit: reads the syllable information from the storage unit; causes the consonant sound of the syllable to be output; and causes the vowel sound of the syllable to be output.
17. The sound control device according to claim 15, wherein: the vowel sound follows the consonant sound in the single syllable, and the vowel sound of the single syllable comprises a speech element corresponding to a change from the consonant sound to the vowel sound.
18. The sound control device according to claim 17, wherein the vowel sound of the single syllable further comprises a speech element corresponding to continuation of the vowel sound.
19. The sound control device according to claim 15, wherein the single syllable is a single character or a single Japanese kana.
20. A sound control method comprising: storing, in a storage unit, a syllable information table in which a type of a consonant sound and a timing at which output of the consonant sound is started are associated, wherein the consonant sound and a vowel sound constitute a single syllable; detecting a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; causing output of the consonant sound of the single syllable to be started in response to the first operation being detected; causing output of the vowel sound of the single syllable to be started, after causing the output of the consonant sound of the single syllable to be started, in response to the second operation being detected; reading the syllable information table stored in the storage unit; and acquiring the timing associated with the type of the consonant sound of the single syllable by referring to the read syllable information table, wherein the causing of the output of the consonant sound comprises causing output of the consonant sound of the single syllable to be started at the acquired timing.
21. A non-transitory computer-readable recording medium storing a program executable by a computer to execute a method comprising: storing, in a storage unit, a syllable information table in which a type of a consonant sound and a timing at which output of the consonant sound is started are associated, wherein the consonant sound and a vowel sound constitute a single syllable; detecting a first operation on an operator and a second operation on the operator, the second operation being performed after the first operation; causing output of the consonant sound of the single syllable to be started in response to the first operation being detected; causing output of the vowel sound of the single syllable to be started, after causing the output of the consonant sound of the single syllable to be started, in response to the second operation being detected; reading the syllable information table stored in the storage unit; and acquiring the timing associated with the type of the consonant sound of the single syllable by referring to the read syllable information table, wherein the causing of the output of the consonant sound comprises causing output of the consonant sound of the single syllable to be started at the acquired timing.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1)
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
(11)
(12)
EMBODIMENTS FOR CARRYING OUT THE INVENTION
(13)
(14) A singing sound generating apparatus 1 according to the embodiment of the present invention shown in
(15) A sound control device may correspond to the singing sound generating apparatus 1. A detection unit, a control unit, an operator, and a storage unit of this sound control device may each correspond to at least one of the components of the singing sound generating apparatus 1. For example, the detection unit may correspond to at least one of the CPU 10 and the performance operator 16. The control unit may correspond to at least one of the CPU 10, the sound source 13, and the sound system 14. The storage unit may correspond to the data memory 18.
(16) The CPU 10 is a central processing unit that controls the whole singing sound generating apparatus 1 according to the embodiment of the present invention. The ROM 11 is a nonvolatile memory in which a control program and various data are stored. The RAM 12 is a volatile memory used as a work area of the CPU 10 and for various buffers. The data memory 18 stores a syllable information table including text data of lyrics, a phoneme database storing speech element data of a singing sound, and the like. The display unit 15 includes a liquid crystal display or the like, on which the operating state, various setting screens, and messages to the user are displayed. The performance operator 16 is an operator for a performance, such as a keyboard, and includes a plurality of sensors that detect operation of the operator in a plurality of stages. The performance operator 16 generates performance information such as key-on and key-off, pitch, and velocity based on the on/off state of the plurality of sensors. This performance information may be carried in a MIDI (musical instrument digital interface) message. The setting operator 17 comprises various setting operation elements, such as operation knobs and operation buttons, for configuring the singing sound generating apparatus 1.
(17) The sound source 13 has a plurality of sound generation channels. Under the control of the CPU 10, one sound generation channel of the sound source 13 is allocated according to the real-time performance of a user using the performance operator 16. The sound source 13 reads out the speech element data corresponding to the performance from the data memory 18, in the allocated sound generation channel, and generates singing sound data. The sound system 14 converts the singing sound data generated by the sound source 13 into an analog signal with a digital/analog converter, amplifies the resulting analog singing sound signal, and outputs it to a speaker or the like. Further, the bus 19 transfers data between the units of the singing sound generating apparatus 1.
(18) The singing sound generating apparatus 1 according to the embodiment of the present invention will be described below. Here, the singing sound generating apparatus 1 will be described by taking as an example a case where a keyboard 40 is provided as the performance operator 16. In the keyboard 40 which is the performance operator 16, there is provided an operation detection unit 41 including a first sensor 41a, a second sensor 41b, and a third sensor 41c, which detects a push-in operation of the keyboard in multiple stages (refer to part (a) of
(19) In the singing sound generating apparatus 1 shown in these figures, when the user performs in real-time, the performance is performed by a push-in operation of the keyboard which is the performance operator 16. As shown in part (a) of
(20) The performance processing shown in
(21) The designated lyrics are delimited for each syllable. In step S10 of the performance processing, syllable information acquisition processing that acquires syllable information representing the first syllable of the lyrics is performed. The syllable information acquisition processing is executed by the CPU 10, and a flowchart showing the details thereof is shown in
(22) The speech element data selection processing of step S11 is processing performed by the sound source 13 under the control of the CPU 10. The sound source 13 selects, from a phoneme database 32 shown in
(23) In step S13, the sound source 13 performs sound generation processing based on the speech element data selected in step S11 under the control of the CPU 10. A flowchart showing the details of sound generation processing is shown in
(24) The operation of this performance processing is shown in
(25) By returning to step S10 in the performance processing, the CPU 10 reads ru which is the second syllable c2 on which the cursor of the designated lyrics is placed, from the data memory 18 in the syllable information acquisition processing of step S10. The CPU 10 determines that the syllable ru starts with the consonant r and determines that the consonant r is to be output. Also, the CPU 10 refers to the syllable information table 31 shown in
(26) When the keyboard 40 is operated as the real-time performance progresses, and as the second depression it is detected that the first sensor 41a of the key is turned on, a sound generation instruction of a second key-on n2 based on the key whose first sensor 41a is turned on is accepted in step S12. This sound generation instruction acceptance processing of step S12 accepts a sound generation instruction based on the key-on n2 of the operated performance operator 16, and the CPU 10 sets the sound source 13 with the timing of the key-on n2, and pitch information indicating the pitch of E5. In the sound generation processing of step S13, the sound source 13 starts counting a sound generation timing corresponding to the set consonant sound type. In this case, since approximately 0.02 sec is set, the sound source 13 counts up after approximately 0.02 sec has elapsed, and starts sound generation of the consonant component of #-r at a sound generation timing corresponding to the consonant sound type. At the time of this sound generation, sound generation is performed at the set pitch of E5 and the predetermined volume. When it is detected that the second sensor 41b is turned on in the key corresponding to the key-on n2, sound generation of the speech element data of the vowel component of r-u→u is started in the sound source 13, and ru of the syllable c2 is generated. At the time of sound generation, the vowel component of r-u→u is generated at the pitch of E5 received at the time of acceptance of the sound generation instruction of the key-on n2, and at a volume according to the velocity corresponding to the time difference from the first sensor 41a being turned on to the second sensor 41b being turned on. As a result, sound generation of a singing sound of ru of the acquired syllable c2 is started. Further, in step S14, the CPU 10 determines whether or not all the syllables have been acquired.
Here, since there is a next syllable at the position of the cursor, the CPU 10 determines that not all the syllables have been acquired, and the process once again returns to step S10.
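The two-stage key-on handling described for the key-on n2 can be sketched in code as follows. This is only an illustrative sketch, not the apparatus's implementation: the class `TwoStageKey`, its method names, the event list, and the table `CONSONANT_DELAY` are hypothetical. Only the approximately 0.02 sec delay for the consonant r is taken from the text; the default delay for other consonant types is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical delay table: seconds from first-sensor-on to consonant start.
# Only the entry for "r" (approximately 0.02 sec) comes from the description.
CONSONANT_DELAY = {"r": 0.02}
DEFAULT_DELAY = 0.0  # assumption for consonant types not listed above


@dataclass
class TwoStageKey:
    """One key with a first and a second sensor (illustrative model)."""
    consonant: str  # "" for a syllable composed only of a vowel
    vowel: str
    pitch: str
    # Scheduled sound events: (start time, part, phoneme, pitch)
    events: List[Tuple[float, str, str, str]] = field(default_factory=list)
    first_on_time: Optional[float] = field(default=None, init=False)

    def first_sensor_on(self, t: float) -> None:
        """First operation: schedule the consonant, if the syllable has one."""
        self.first_on_time = t
        if self.consonant:
            delay = CONSONANT_DELAY.get(self.consonant, DEFAULT_DELAY)
            self.events.append((t + delay, "consonant", self.consonant, self.pitch))

    def second_sensor_on(self, t: float) -> float:
        """Second operation: start the vowel and return the time difference."""
        if self.first_on_time is None:
            raise RuntimeError("second sensor turned on before first sensor")
        self.events.append((t, "vowel", self.vowel, self.pitch))
        return t - self.first_on_time


key = TwoStageKey(consonant="r", vowel="u", pitch="E5")
key.first_sensor_on(1.00)           # consonant r scheduled about 0.02 sec later
dt = key.second_sensor_on(1.05)     # vowel u starts when the second sensor turns on
```

The returned time difference is what the description uses to derive the velocity, and therefore the volume, of the vowel component.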
(27) The operation of this performance processing is shown in
(28) By returning to step S10 in the performance processing, the CPU 10 reads yo which is the third syllable c3 on which the cursor of the designated lyrics is placed, from the data memory 18 in the syllable information acquisition processing of step S10. The CPU 10 determines that the syllable yo starts with the consonant y and determines that the consonant y is to be output. Also, the CPU 10 refers to the syllable information table 31 shown in
(29) When the performance operator 16 is operated as the real-time performance progresses, a sound generation instruction of a third key-on n3 based on the key whose first sensor 41a is turned on is accepted in step S12. This sound generation instruction acceptance processing of step S12 accepts a sound generation instruction based on the key-on n3 of the operated performance operator 16, and the CPU 10 sets the sound source 13 with the timing of the key-on n3, and pitch information indicating the pitch of D5. In the sound generation processing of step S13, the sound source 13 starts counting a sound generation timing corresponding to the set consonant sound type. In this case, the consonant sound type is y. Consequently, a sound generation timing corresponding to the consonant sound type y is set. Also, sound generation of the consonant component of #-y is started at the sound generation timing corresponding to the consonant sound type y. At the time of this sound generation, sound generation is performed at the set pitch of D5 and the predetermined volume. When it is detected that the second sensor 41b is turned on in the key that detected that the first sensor 41a is turned on, sound generation of the speech element data of the vowel component of y-o→o is started in the sound source 13, and yo of the syllable c3 is generated. At the time of sound generation, the vowel component of y-o→o is generated at the pitch of D5 received at the time of acceptance of the sound generation instruction of the key-on n3, and at a volume according to the velocity corresponding to the time difference from the first sensor 41a being turned on to the second sensor 41b being turned on. As a result, sound generation of a singing sound of yo of the acquired syllable c3 is started. Further, in step S14, the CPU 10 determines whether or not all the syllables have been acquired.
Here, since there is a next syllable at the position of the cursor, the CPU 10 determines that not all the syllables have been acquired, and the process once again returns to step S10.
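The way a syllable is split into a consonant component and a vowel component, and the speech element names that appear in this description (for example #-y, y-o, and o for yo), can be illustrated with a short sketch. The function names and the returned dictionary layout are hypothetical. The sketch assumes romanized syllables that end in one of the vowels a, i, u, e, o, and the element name used for a vowel-only syllable starting from silence (such as `#-i`) is an assumption, not something stated in the text.

```python
VOWELS = set("aiueo")


def split_syllable(syllable: str):
    """Return (consonant, vowel); consonant is "" for a vowel-only syllable.

    Assumes a romanized syllable whose last letter is its vowel.
    """
    if not syllable or syllable[-1] not in VOWELS:
        raise ValueError(f"unsupported syllable: {syllable!r}")
    return syllable[:-1], syllable[-1]


def select_speech_elements(syllable: str) -> dict:
    """Pick speech element names for the consonant and vowel components."""
    consonant, vowel = split_syllable(syllable)
    if consonant:
        return {
            # silence -> consonant, output on the first operation
            "consonant": [f"#-{consonant}"],
            # consonant -> vowel transition, then the sustained vowel
            "vowel": [f"{consonant}-{vowel}", vowel],
        }
    # Vowel-only syllable: no consonant is output (element name is assumed).
    return {"consonant": [], "vowel": [f"#-{vowel}", vowel]}


elements = select_speech_elements("yo")
# elements["consonant"] holds "#-y"; elements["vowel"] holds "y-o" and "o"
```

An empty consonant list corresponds to the determination that the consonant sound is not to be output when the syllable starts with a vowel.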
(30) By returning to step S10 in the performance processing, the CPU 10 reads ko which is the fourth syllable c41 on which the cursor of the designated lyrics is placed, from the data memory 18 in the syllable information acquisition processing of step S10. The CPU 10 determines that the syllable ko starts with the consonant k and determines that the consonant k is to be output. Also, the CPU 10 refers to the syllable information table 31 shown in
(31) When the performance operator 16 is operated as the real-time performance progresses, a sound generation instruction of a fourth key-on n4 based on the key whose first sensor 41a is turned on is accepted in step S12. This sound generation instruction acceptance processing of step S12 accepts a sound generation instruction based on the key-on n4 of the operated performance operator 16, and the CPU 10 sets the sound source 13 with the timing of the key-on n4, and the pitch information of E5. In the sound generation processing of step S13, counting of a sound generation timing corresponding to the set consonant sound type is started. In this case, since the consonant sound type is k, a sound generation timing corresponding to k is set, and sound generation of the consonant component of #-k is started at the sound generation timing corresponding to the consonant sound type k. At the time of this sound generation, sound generation is performed at the set pitch of E5 and the predetermined volume. When it is detected that the second sensor 41b is turned on in the key that detected that the first sensor 41a is turned on, sound generation of the speech element data of the vowel component of k-o→o is started in the sound source 13, and ko of the syllable c41 is generated. At the time of sound generation, the vowel component of k-o→o is generated at the pitch of E5 received at the time of acceptance of the sound generation instruction of the key-on n4, and at a volume according to the velocity corresponding to the time difference from the first sensor 41a being turned on to the second sensor 41b being turned on. As a result, sound generation of a singing sound of ko of the acquired syllable c41 is started.
Further, in step S14, the CPU 10 determines whether or not all the syllables have been acquired, and here, since there is a next syllable at the position of the cursor, it determines that not all the syllables have been acquired, and the process once again returns to step S10.
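The description states that the vowel is generated at a volume according to the velocity corresponding to the time difference between the first sensor 41a and the second sensor 41b turning on, but it does not give the mapping. The sketch below shows one possible mapping, purely as an assumption: a linear scale onto the MIDI note-on velocity range 1 to 127, clamped to the approximately 20 to 100 ms range that the description gives for key depression in a normal performance. The function name and parameters are hypothetical.

```python
def velocity_from_time_difference(dt_sec: float,
                                  fastest: float = 0.020,
                                  slowest: float = 0.100) -> int:
    """Map a sensor time difference to a velocity in 1..127.

    Assumed linear mapping: a shorter time difference (faster key
    depression) gives a higher velocity.
    """
    # Clamp to the assumed range of time differences.
    dt = min(max(dt_sec, fastest), slowest)
    # 1.0 for the fastest depression, 0.0 for the slowest.
    fraction = (slowest - dt) / (slowest - fastest)
    return 1 + round(fraction * 126)


fast = velocity_from_time_difference(0.020)   # fastest depression in range
slow = velocity_from_time_difference(0.100)   # slowest depression in range
```

A real instrument would more likely use a tuned velocity curve than a straight line; the linear form is chosen here only to keep the relationship easy to read.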
(32) As a result of the performance processing returning to step S10, the CPU 10 reads i which is the fifth syllable c42 on which the cursor of the designated lyrics is placed, from the data memory 18 in the syllable information acquisition processing of step S10. Also, it refers to the syllable information table 31 shown in
(33) Next, the case where the syllables include a flag indicating that ko and i, which are syllables c41 and c42, are to be generated with a single key-on will be described. In this case, ko, which is syllable c41, is generated by the key-on n4, and i, which is syllable c42, is generated when the key-on n4 is turned off. That is, in the case where the flag described above is included in the syllables c41 and c42, the same process as the speech element data selection processing of step S11 is performed when it is detected that the key-on n4 is turned off, and the sound source 13 selects from the phonemic chain data 32a the speech element data o-i corresponding to vowel o→vowel i, and also selects from the stationary part data 32b the speech element data i corresponding to vowel i. Next, the sound source 13 starts sound generation of the speech element data of the vowel component of o-i→i, and generates i of the syllable c42. Consequently, a singing sound of i of c42 is generated with the same pitch E5 as ko of c41 at the volume of the release curve of the envelope ENV of the singing sound of ko. In response to the key-off, a muting process of the singing sound of ko is performed, and sound generation is stopped. As a result, the sound generation becomes ko→i.
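The key-off behavior for two flagged syllables can be sketched as follows. The function name and the boolean `linked` parameter are hypothetical stand-ins for the flag described above, and the sketch covers only the case given in the text, where the second syllable is a single vowel (as with ko followed by i).

```python
VOWELS = set("aiueo")


def elements_on_key_off(current: str, following: str, linked: bool) -> list:
    """Return the speech elements to start when the key is released.

    With the flag set, the vowel of the sounding syllable is chained to the
    following vowel-only syllable (for example "o-i" and then "i").
    Without the flag, nothing new is started on key-off.
    """
    if not linked:
        return []
    if len(following) != 1 or following not in VOWELS:
        raise ValueError("sketch only handles a vowel-only following syllable")
    previous_vowel = current[-1]
    if previous_vowel not in VOWELS:
        raise ValueError(f"unsupported syllable: {current!r}")
    # Phonemic chain element (vowel -> vowel), then the stationary part.
    return [f"{previous_vowel}-{following}", following]


released = elements_on_key_off("ko", "i", linked=True)
# released lists the chain element "o-i" followed by the stationary part "i"
```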
(34) As described above, the singing sound generating apparatus 1 according to the embodiment of the present invention starts sound generation of a consonant sound when a consonant sound generation timing is reached, referenced to the timing at which the first sensor 41a is turned on, and then starts sound generation of a vowel sound at the timing at which the second sensor 41b is turned on. Consequently, the singing sound generating apparatus 1 according to the embodiment of the present invention operates according to a key depression speed corresponding to the time difference from when the first sensor 41a is turned on to when the second sensor 41b is turned on. Therefore, the operation of three cases having different key depression speeds will be described below with reference to
(35)
(36)
(37)
(38) The sound generation length in which the sa line of the Japanese syllabary diagram sounds natural is 50 to 100 ms. In a normal performance, the key depression speed (the time taken from when the first sensor 41a is turned on to when the second sensor 41b is turned on) is approximately 20 to 100 ms. Consequently, in reality the case shown in
(39) The case where the keyboard, which is a performance operator, is a three-make keyboard provided with a first sensor to a third sensor has been described. However, the embodiment is not limited to such an example. The keyboard may be a two-make keyboard provided with a first sensor and a second sensor without a third sensor.
(40) The keyboard may be a keyboard provided with a touch sensor on the surface that detects contact, and may be provided with a single switch that detects downward pressing to the interior. In this case, for example, as shown in
(41) In the example shown in
(42) For detection of an operation on the performance operator, a camera may be used in place of a touch sensor to detect contact (near-contact) of a finger of an operator on a keyboard.
(43) Processing may be carried out by recording a program for realizing the functions of the singing sound generating apparatus 1 according to the above-described embodiments, in a computer-readable recording medium, and reading the program recorded on this recording medium into a computer system, and executing the program.
(44) The computer system referred to here may include an operating system (OS) and hardware such as peripheral devices.
(45) The computer-readable recording medium may be a writable nonvolatile memory such as a flexible disk, a magneto-optical disk, a ROM (Read Only Memory), or a flash memory, a portable medium such as a DVD (Digital Versatile Disk), or a storage device such as a hard disk built into the computer system.
(46) The computer-readable recording medium also includes a medium that holds a program for a certain period of time, such as a volatile memory (for example, a DRAM (Dynamic Random Access Memory)) in a computer system serving as a server or a client when a program is transmitted via a network such as the Internet or a communication line such as a telephone line.
(47) The above program may be transmitted from a computer system in which the program is stored in a storage device or the like, to another computer system via a transmission medium or by a transmission wave in a transmission medium. A transmission medium for transmitting a program means a medium having a function of transmitting information such as a network (communication network) such as the Internet and a telecommunication line (communication line) such as a telephone line.
(48) The above program may be for realizing a part of the above-described functions. The above program may be a so-called difference file (difference program) that can realize the above-described functions by a combination with a program already recorded in the computer system.